Multipart/Mixed and Compound Documents

Nathaniel,

It occurs to me that there is a fairly subtle assumption being
made in the multipart/mixed spec that would cause at least some
minor incompatibilities between Slate and Andrew, or any other
compound document system using MIME.

I believe the basic difference is this (forgive me if I'm not
completely up on Andrew):

1. Andrew's basic model is a text structure with embedded objects
   (I realize it is in fact more general than this, but I believe
   that's how effectively all documents are created).  The embedded
   objects are equivalent to character glyphs in a paragraph.

2. Slate's basic model is of a hierarchy of objects (although the
   hierarchy is only used for list structure, not for general embedding
   of objects).  A typical document without numbered or bulleted lists
   is just a sequence of objects, e.g. a paragraph, a spreadsheet, an
   image, etc.  Objects can be embedded inline in a paragraph, but that's
   not the normal way documents are created.  There is of course lots of
   other document structure (style sheets, page descriptions, etc.), but we
   can ignore that for the purposes here.

So, consider a Slate document being converted to multipart/mixed.
I have a paragraph, an image and an audio object.  I encode this as

multipart/mixed
        text/plain
        image/gif
        audio/basic
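
For concreteness, here is a minimal sketch of building that structure
with Python's standard email package.  The function names and the
placeholder payloads are made up for illustration; this is not how
Slate itself generates the message.

```python
from email.mime.audio import MIMEAudio
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_slate_message(paragraph, gif_bytes, audio_bytes):
    """Encode paragraph + image + audio as one flat multipart/mixed.

    Hypothetical helper: each Slate object becomes one body part,
    in document order, with no nesting.
    """
    msg = MIMEMultipart("mixed")
    msg.attach(MIMEText(paragraph, "plain"))
    # Subtypes are given explicitly so nothing is guessed from the bytes.
    msg.attach(MIMEImage(gif_bytes, _subtype="gif"))
    msg.attach(MIMEAudio(audio_bytes, _subtype="basic"))
    return msg


def outline(msg):
    """Return the content type of each top-level body part, in order."""
    return [part.get_content_type() for part in msg.get_payload()]
```

Calling outline() on the result lists the three parts in order, which
is all the receiving system gets to work with: nothing in the structure
says whether a part is inline or stands alone.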

In converting this to Andrew, you can use the way that the text/plain
object ends (extra line or not) to determine whether the image appears
inline with the text, but you're always going to assume that the audio
object appears inline with the image.  However, that's not the way it
appeared in Slate.  To get that effect in Andrew, I'd have to generate

multipart/mixed
        text/plain      (ending with an extra CRLF)
        image/gif
        text/plain      (empty)
        audio/basic

Effectively every distinct object would need to be preceded by a
text object in order to indicate that it appears separately.  The only
part of the MIME spec that even partially addresses this is the Note
that says that "Body parts that must be considered to end with line
breaks, therefore, should have two CRLF's preceding the encapsulation
line, ..."  However, clearly an image and an audio object don't deal with
the concept of "line breaks" - that only applies to text.
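
The padding rule described above can be sketched as follows.  This is
only an illustration under my own assumptions: the function name is
hypothetical, objects are modelled as (content type, body) pairs, and
"extra CRLF" is taken to mean that a text body itself ends in CRLF.

```python
def pad_for_inline_reader(objects):
    """Insert empty text parts so each non-text object stands alone.

    `objects` is a list of (content_type, body) pairs in document
    order.  Text bodies are str; other bodies are opaque.
    """
    parts = []
    for ctype, body in objects:
        if ctype.startswith("text/"):
            # Make the text end with a line break of its own.
            if not body.endswith("\r\n"):
                body += "\r\n"
            parts.append((ctype, body))
        else:
            # A non-text object must be preceded by a text part;
            # add an empty one if the previous part is not text.
            if not parts or not parts[-1][0].startswith("text/"):
                parts.append(("text/plain", ""))
            parts.append((ctype, body))
    return parts
```

Applied to the paragraph, image and audio example, this yields the
four-part structure shown earlier, with the empty text part landing
between the image and the audio.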

Also, it's clear that you really haven't accomplished in-line objects
without discussing how two text passages, separated by some other type
of object, are merged.  This clearly applies most critically to something
like richtext, where some context (such as paragraph justification)
might be in effect.  Furthermore, nowhere does the spec treat the
text type as special - yet we're effectively trying to provide at least one
level of object nesting for text but not discussing it for any other object
type.

I realize this opens up the can of worms called compound document models.
The simplest approach would be to punt on the inline objects (since we
really haven't effectively dealt with them).  Then the extra newline at
the end of the text just determines whether there is white space after
the paragraph and before the next object.
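
Under that simpler reading, a receiver could treat the parts roughly as
in the sketch below.  Again the function name and the dictionary layout
are hypothetical, and parts are assumed to be (content type, body)
pairs.

```python
def layout_blocks(parts):
    """Treat every body part as its own block; nothing is inline.

    The only thing a text part's trailing line break controls is
    whether white space follows it before the next object.
    """
    blocks = []
    for ctype, body in parts:
        is_text = ctype.startswith("text/")
        space_after = is_text and body.endswith("\r\n")
        blocks.append({"type": ctype, "space_after": space_after})
    return blocks
```

With this reading the three-part message needs no empty text parts at
all, since the image and the audio are separate blocks by definition.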

Any thoughts?

Terry