ietf-822

RFC 2046 message/partial (sect. 5.2.2) clarification

2003-06-08 13:22:22

RFC 2046 section 5.2.2.1 contains the following text:

   The semantics of a reassembled partial message must be those of the
   "inner" message, rather than of a message containing the inner
   message.  This makes it possible, for example, to send a large audio
   message as several partial messages, and still have it appear to the
   recipient as a simple audio message rather than as an encapsulated
   message containing an audio message.  That is, the encapsulation of
   the message is considered to be "transparent".

   When generating and reassembling the pieces of a "message/partial"
   message, the headers of the encapsulated message must be merged with
   the headers of the enclosing entities.  In this process the following
   rules must be observed:

    (1)   Fragmentation agents must split messages at line
          boundaries only. This restriction is imposed because
          splits at points other than the ends of lines in turn
          depends on message transports being able to preserve
          the semantics of messages that don't end with a CRLF
          sequence. Many transports are incapable of preserving
          such semantics.

    (2)   All of the header fields from the initial enclosing
          message, except those that start with "Content-" and
          the specific header fields "Subject", "Message-ID",
          "Encrypted", and "MIME-Version", must be copied, in
          order, to the new message.

    (3)   The header fields in the enclosed message which start
          with "Content-", plus the "Subject", "Message-ID",
          "Encrypted", and "MIME-Version" fields, must be
          appended, in order, to the header fields of the new
          message.  Any header fields in the enclosed message
          which do not start with "Content-" (except for the
          "Subject", "Message-ID", "Encrypted", and "MIME-
          Version" fields) will be ignored and dropped.

    (4)   All of the header fields from the second and any
          subsequent enclosing messages are discarded by the
          reassembly process.

and section 5.2.2.2 gives an example:

   If an audio message is broken into two pieces, the first piece might
   look something like this:

     X-Weird-Header-1: Foo
     From: Bill@host.com
     To: joe@otherhost.com
     Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
     Subject: Audio mail (part 1 of 2)
     Message-ID: <id1@host.com>
     MIME-Version: 1.0
     Content-type: message/partial; id="ABC@host.com";
                   number=1; total=2

     X-Weird-Header-1: Bar
     X-Weird-Header-2: Hello
     Message-ID: <anotherid@foo.com>
     Subject: Audio mail
     MIME-Version: 1.0
     Content-type: audio/basic
     Content-transfer-encoding: base64

       ... first half of encoded audio data goes here ...

   and the second half might look something like this:

     From: Bill@host.com
     To: joe@otherhost.com
     Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
     Subject: Audio mail (part 2 of 2)
     MIME-Version: 1.0
     Message-ID: <id2@host.com>
     Content-type: message/partial;
                   id="ABC@host.com"; number=2; total=2

       ... second half of encoded audio data goes here ...

   Then, when the fragmented message is reassembled, the resulting
   message to be displayed to the user should look something like this:

     X-Weird-Header-1: Foo
     From: Bill@host.com
     To: joe@otherhost.com
     Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
     Subject: Audio mail
     Message-ID: <anotherid@foo.com>
     MIME-Version: 1.0
     Content-type: audio/basic
     Content-transfer-encoding: base64

       ... first half of encoded audio data goes here ...
       ... second half of encoded audio data goes here ...

RFC 2298 amends RFC 2046 by adding Original-Recipient,
Disposition-Notification-To, and Disposition-Notification-Options to the list
of fields that are treated like Message-ID. For convenience, I'll refer to the
set of fields named above (as amended), plus the fields that start with
"Content-", as F.
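
As an illustration only, the set F and reassembly rules (2) and (3) quoted
above can be sketched in Python. The names SPECIAL_FIELDS, in_f, and
merge_headers are hypothetical, headers are modelled as plain lists of
(name, value) pairs, and the data is the first piece of the RFC 2046 example:

```python
# Illustrative sketch only; all names here are hypothetical.
# Fields named in RFC 2046 rules (2) and (3), plus the RFC 2298 additions.
SPECIAL_FIELDS = {
    "subject", "message-id", "encrypted", "mime-version",
    "original-recipient", "disposition-notification-to",
    "disposition-notification-options",
}


def in_f(name):
    """Return True if the field name belongs to the set F."""
    lowered = name.strip().lower()
    return lowered.startswith("content-") or lowered in SPECIAL_FIELDS


def merge_headers(outer, inner):
    """Merge headers on reassembly.

    outer: fields of the first piece's own header.
    inner: fields found at the start of the first piece's body.
    """
    kept_outer = [(n, v) for n, v in outer if not in_f(n)]  # rule (2)
    kept_inner = [(n, v) for n, v in inner if in_f(n)]      # rule (3)
    return kept_outer + kept_inner


outer = [
    ("X-Weird-Header-1", "Foo"),
    ("From", "Bill@host.com"),
    ("To", "joe@otherhost.com"),
    ("Date", "Fri, 26 Mar 1993 12:59:38 -0500 (EST)"),
    ("Subject", "Audio mail (part 1 of 2)"),
    ("Message-ID", "<id1@host.com>"),
    ("MIME-Version", "1.0"),
    ("Content-type",
     'message/partial; id="ABC@host.com"; number=1; total=2'),
]
inner = [
    ("X-Weird-Header-1", "Bar"),
    ("X-Weird-Header-2", "Hello"),
    ("Message-ID", "<anotherid@foo.com>"),
    ("Subject", "Audio mail"),
    ("MIME-Version", "1.0"),
    ("Content-type", "audio/basic"),
    ("Content-transfer-encoding", "base64"),
]

merged = merge_headers(outer, inner)
for name, value in merged:
    print(f"{name}: {value}")
```

Applied literally, "in order" places Message-ID before Subject in the result,
whereas the reassembled message in the RFC example shows them the other way
round. The two X-Weird-Header fields from the piece's body do not appear in
the output at all.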

The RFC 2046 text isn't exactly clear about fragmentation and original
message header fields, and the example doesn't show the original message
fields.

Presumably the original message was RFC 822 compliant and therefore had a
Date field and at least one originator field and at least one recipient field.
Those fields are not in F, nor do they appear in the "fields" in the first
piece's message body. However, the first piece's body does contain
X-Weird-Header-1 and X-Weird-Header-2, which are not in F but have neither
been copied to the enclosing message header nor elided. So the first question
is: which fields from the original message (if any) are to be elided from the
"fields" in the first piece's body when fragmenting a large message?

Rule 2 refers to "the initial enclosing message" and to "the new message",
which in the case of generating the first piece appear to refer to the
same message, so that rule has a null effect in that case. Rule 3 provides
for copying the set of fields F (and no others) from the original message to
the new message. That raises a second question: is there a specific rule for
the source of the mandatory fields which must appear in the header of each
piece? Obviously the Content-Type, MIME-Version, and Message-ID fields will
be generated by the entity performing the fragmentation. Given the rules for
reassembly, the originator (From, Sender, Reply-To, etc.), recipient, and
Date fields had better be copied from the original message, or that
information will be lost (i.e. the fragmentation/reassembly process won't be
transparent). But there doesn't appear to be a rule that says that this
should be done, and one interpretation of rule 3 is that copying To, From,
Date, etc. (i.e. fields not in F) to the first piece's header is forbidden.

Transparency raises a third issue: the process as demonstrated by the
example given *isn't* transparent; any semantics conveyed by X-Weird-Header-1
and/or X-Weird-Header-2 have been lost in the fragmentation/reassembly
process. Any such semantics (in the specific case of X- fields) would have to
be agreed among the sender and recipients, but might be unknown to
intermediate entities, including those that might fragment the original
message or one of its pieces while in transit. Something seems to be missing
from the rules if transparency is to be maintained as stated in the first
paragraph of 5.2.2.1. Rule 3 implies that, if transparency is to be
maintained, all fields in the original message that are not in the set F
should be copied (or moved) to the header of the initial enclosing message
when fragmenting. Clearly that hasn't been done in the example w.r.t. the
X-Weird-Header-* fields, and there doesn't seem to be an explicit rule
covering this situation. Apart from the X- fields in the example, this issue
also affects standard and extension fields not in the set F, such as
Comments, Keywords, References, In-Reply-To, Resent- fields, etc.
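
To make the loss concrete, here is a small Python sketch (hypothetical names,
headers modelled as lists of (name, value) pairs) that lists the fields in the
first piece's body which reassembly drops and which the enclosing header does
not carry either. The data is abbreviated from the RFC 2046 example:

```python
# Illustrative sketch only; all names here are hypothetical.
SPECIAL_FIELDS = {
    "subject", "message-id", "encrypted", "mime-version",
    "original-recipient", "disposition-notification-to",
    "disposition-notification-options",
}


def in_f(name):
    """Return True if the field name belongs to the set F."""
    lowered = name.strip().lower()
    return lowered.startswith("content-") or lowered in SPECIAL_FIELDS


def lost_fields(outer, inner):
    """Inner fields outside F that the outer header does not also carry."""
    carried = {(n.lower(), v) for n, v in outer}
    return [(n, v) for n, v in inner
            if not in_f(n) and (n.lower(), v) not in carried]


outer = [
    ("X-Weird-Header-1", "Foo"),
    ("From", "Bill@host.com"),
    ("To", "joe@otherhost.com"),
    ("Date", "Fri, 26 Mar 1993 12:59:38 -0500 (EST)"),
]
inner = [
    ("X-Weird-Header-1", "Bar"),
    ("X-Weird-Header-2", "Hello"),
    ("Message-ID", "<anotherid@foo.com>"),
    ("Subject", "Audio mail"),
    ("Content-type", "audio/basic"),
]

lost = lost_fields(outer, inner)
for name, value in lost:
    print(f"lost: {name}: {value}")
```

For the RFC example this reports both "X-Weird-Header-1: Bar" and
"X-Weird-Header-2: Hello".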

I believe that an additional rule resolves most of these issues rationally:

    (5)   All of the header fields from the enclosed message,
          except those that start with "Content-" and the
          specific header fields "Subject", "Message-ID",
          "Encrypted", and "MIME-Version", MUST be copied, in
          order, to the header of the initial enclosing message.
          Fields so copied MAY be elided from the enclosed
          message fields which appear in the body of the initial
          piece, and those which are mandatory in a message
          SHOULD be copied to the header of subsequent pieces.
          All pieces MUST meet the message format requirements;
          in particular, if the message format limits the number
          of instances of a field type in a message and that
          number of instances appeared in the enclosed message,
          an entity generating pieces MUST NOT exceed the limit
          by producing another instance of that field type in
          the initial enclosing message header.
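
A rough Python sketch of how a fragmenting entity might build the header of
the first piece under the proposed rule (5). Everything here is illustrative:
the function and variable names are hypothetical, SINGLE_INSTANCE is an
assumed list of once-only fields (RFC 2822 section 3.6 gives such limits),
the From, To, and Date fields of the original message are presumed rather
than shown in the RFC example, and the second Date value is invented to
exercise the limit check:

```python
# Illustrative sketch only; all names here are hypothetical.
SPECIAL_FIELDS = {
    "subject", "message-id", "encrypted", "mime-version",
    "original-recipient", "disposition-notification-to",
    "disposition-notification-options",
}
# Assumption: fields the message format allows at most once per message.
SINGLE_INSTANCE = {"date", "from", "sender", "reply-to", "to", "cc", "bcc"}


def in_f(name):
    """Return True if the field name belongs to the set F."""
    lowered = name.strip().lower()
    return lowered.startswith("content-") or lowered in SPECIAL_FIELDS


def first_piece_header(original, added, generated):
    """Build the first piece's header under proposed rule (5).

    original:  fields of the message being fragmented.
    added:     non-F fields the fragmenting entity or transport would add.
    generated: F fields the fragmenting entity creates for this piece.
    """
    # Copy every field outside F from the original message, in order.
    copied = [(n, v) for n, v in original if not in_f(n)]
    present = {n.lower() for n, _ in copied}
    # Do not add another instance of a once-only field already copied.
    allowed = [(n, v) for n, v in added
               if not (n.lower() in SINGLE_INSTANCE and n.lower() in present)]
    return allowed + copied + generated


original = [
    ("X-Weird-Header-1", "Bar"),
    ("X-Weird-Header-2", "Hello"),
    ("From", "Bill@host.com"),
    ("To", "joe@otherhost.com"),
    ("Date", "Fri, 26 Mar 1993 12:59:38 -0500 (EST)"),
    ("Message-ID", "<anotherid@foo.com>"),
    ("Subject", "Audio mail"),
    ("MIME-Version", "1.0"),
    ("Content-type", "audio/basic"),
    ("Content-transfer-encoding", "base64"),
]
added = [
    ("X-Weird-Header-1", "Foo"),
    ("Date", "Fri, 26 Mar 1993 13:05:00 -0500 (EST)"),  # invented value
]
generated = [
    ("Subject", "Audio mail (part 1 of 2)"),
    ("Message-ID", "<id1@host.com>"),
    ("MIME-Version", "1.0"),
    ("Content-type",
     'message/partial; id="ABC@host.com"; number=1; total=2'),
]

header = first_piece_header(original, added, generated)
for name, value in header:
    print(f"{name}: {value}")
```

The fragmenter's own Date is suppressed because the original message already
supplies one, while "X-Weird-Header-1: Foo" is kept because nothing limits
that field to a single instance.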

In addition, rule 3 could be clarified by appending "by the reassembly
process" to the end of the last sentence.

That changes the example to:

   If an audio message is broken into two pieces, the first piece might
   look something like this:

     X-Weird-Header-1: Foo
     X-Weird-Header-1: Bar
     X-Weird-Header-2: Hello
     From: Bill@host.com
     To: joe@otherhost.com
     Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
     Subject: Audio mail (part 1 of 2)
     Message-ID: <id1@host.com>
     MIME-Version: 1.0
     Content-type: message/partial; id="ABC@host.com";
                   number=1; total=2

     X-Weird-Header-1: Bar
     X-Weird-Header-2: Hello
     Message-ID: <anotherid@foo.com>
     Subject: Audio mail
     MIME-Version: 1.0
     Content-type: audio/basic
     Content-transfer-encoding: base64

       ... first half of encoded audio data goes here ...

   and the second half might look something like this:

     From: Bill@host.com
     To: joe@otherhost.com
     Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
     Subject: Audio mail (part 2 of 2)
     MIME-Version: 1.0
     Message-ID: <id2@host.com>
     Content-type: message/partial;
                   id="ABC@host.com"; number=2; total=2

       ... second half of encoded audio data goes here ...

   Then, when the fragmented message is reassembled, the resulting
   message to be displayed to the user should look something like this:

     X-Weird-Header-1: Foo
     X-Weird-Header-1: Bar
     X-Weird-Header-2: Hello
     From: Bill@host.com
     To: joe@otherhost.com
     Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
     Subject: Audio mail
     Message-ID: <anotherid@foo.com>
     MIME-Version: 1.0
     Content-type: audio/basic
     Content-transfer-encoding: base64

       ... first half of encoded audio data goes here ...
       ... second half of encoded audio data goes here ...

   This is the original message plus the fields provided by the fragmenting
   entity and transport (in this case the "X-Weird-Header-1: Foo" field).