RFC 2046 section 5.2.2.1 contains the following text:
The semantics of a reassembled partial message must be those of the
"inner" message, rather than of a message containing the inner
message. This makes it possible, for example, to send a large audio
message as several partial messages, and still have it appear to the
recipient as a simple audio message rather than as an encapsulated
message containing an audio message. That is, the encapsulation of
the message is considered to be "transparent".
When generating and reassembling the pieces of a "message/partial"
message, the headers of the encapsulated message must be merged with
the headers of the enclosing entities. In this process the following
rules must be observed:
(1) Fragmentation agents must split messages at line
boundaries only. This restriction is imposed because
splits at points other than the ends of lines in turn
depends on message transports being able to preserve
the semantics of messages that don't end with a CRLF
sequence. Many transports are incapable of preserving
such semantics.
(2) All of the header fields from the initial enclosing
message, except those that start with "Content-" and
the specific header fields "Subject", "Message-ID",
"Encrypted", and "MIME-Version", must be copied, in
order, to the new message.
(3) The header fields in the enclosed message which start
with "Content-", plus the "Subject", "Message-ID",
"Encrypted", and "MIME-Version" fields, must be
appended, in order, to the header fields of the new
message. Any header fields in the enclosed message
which do not start with "Content-" (except for the
"Subject", "Message-ID", "Encrypted", and "MIME-
Version" fields) will be ignored and dropped.
(4) All of the header fields from the second and any
subsequent enclosing messages are discarded by the
reassembly process.
and section 5.2.2.2 gives an example:
If an audio message is broken into two pieces, the first piece might
look something like this:
X-Weird-Header-1: Foo
From: Bill@host.com
To: joe@otherhost.com
Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
Subject: Audio mail (part 1 of 2)
Message-ID: <id1@host.com>
MIME-Version: 1.0
Content-type: message/partial; id="ABC@host.com";
              number=1; total=2

X-Weird-Header-1: Bar
X-Weird-Header-2: Hello
Message-ID: <anotherid@foo.com>
Subject: Audio mail
MIME-Version: 1.0
Content-type: audio/basic
Content-transfer-encoding: base64

... first half of encoded audio data goes here ...
and the second half might look something like this:
From: Bill@host.com
To: joe@otherhost.com
Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
Subject: Audio mail (part 2 of 2)
MIME-Version: 1.0
Message-ID: <id2@host.com>
Content-type: message/partial;
              id="ABC@host.com"; number=2; total=2

... second half of encoded audio data goes here ...
Then, when the fragmented message is reassembled, the resulting
message to be displayed to the user should look something like this:
X-Weird-Header-1: Foo
From: Bill@host.com
To: joe@otherhost.com
Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
Subject: Audio mail
Message-ID: <anotherid@foo.com>
MIME-Version: 1.0
Content-type: audio/basic
Content-transfer-encoding: base64

... first half of encoded audio data goes here ...
... second half of encoded audio data goes here ...
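As a rough sketch (not taken from the RFC), the reassembly side of rules (2)
through (4) quoted above can be written out as follows. Header fields are
modelled as (name, value) pairs in their original order; the names
is_special and merge_headers are made up for this illustration.

```python
# Sketch of RFC 2046 reassembly rules (2)-(4) as quoted above.
# Fields are (name, value) pairs, kept in their original order.
# Helper names are hypothetical, chosen only for this illustration.

# Fields named explicitly by rules (2) and (3), besides "Content-*".
SPECIAL = {"subject", "message-id", "encrypted", "mime-version"}


def is_special(name):
    """True for "Content-*" fields and the four specifically named fields."""
    lowered = name.lower()
    return lowered.startswith("content-") or lowered in SPECIAL


def merge_headers(first_enclosing, enclosed):
    """Build the header of the reassembled message.

    Rule (4) is satisfied implicitly: headers of the second and later
    pieces are never consulted.
    """
    kept = [f for f in first_enclosing if not is_special(f[0])]  # rule (2)
    taken = [f for f in enclosed if is_special(f[0])]            # rule (3)
    return kept + taken


# Header of the first piece in the RFC's example.
first_enclosing = [
    ("X-Weird-Header-1", "Foo"),
    ("From", "Bill@host.com"),
    ("To", "joe@otherhost.com"),
    ("Date", "Fri, 26 Mar 1993 12:59:38 -0500 (EST)"),
    ("Subject", "Audio mail (part 1 of 2)"),
    ("Message-ID", "<id1@host.com>"),
    ("MIME-Version", "1.0"),
    ("Content-type", 'message/partial; id="ABC@host.com"; number=1; total=2'),
]

# Fields of the enclosed message, found in the body of the first piece.
enclosed = [
    ("X-Weird-Header-1", "Bar"),
    ("X-Weird-Header-2", "Hello"),
    ("Message-ID", "<anotherid@foo.com>"),
    ("Subject", "Audio mail"),
    ("MIME-Version", "1.0"),
    ("Content-type", "audio/basic"),
    ("Content-transfer-encoding", "base64"),
]

merged = merge_headers(first_enclosing, enclosed)
for name, value in merged:
    print(f"{name}: {value}")
```

Applied to the example, this keeps "X-Weird-Header-1: Foo" and drops
"X-Weird-Header-1: Bar" and "X-Weird-Header-2: Hello". Appending strictly in
order puts Message-ID before Subject, which differs slightly from the order
shown in the RFC's illustration of the reassembled message.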
RFC 2298 amends RFC 2046 by adding Original-Recipient, Disposition-Notification-To,
and Disposition-Notification-Options to the list of specifically named fields
(Subject, Message-ID, Encrypted, and MIME-Version).
For convenience, I'll refer to that set of named fields (as amended) plus
the MIME content fields (those that start with "Content-") as F.
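For concreteness, membership in F can be expressed as a small predicate. This
is only a sketch of the definition just given; the names F_NAMES and in_f are
invented for the illustration, and field names are compared without regard to
case.

```python
# Sketch of the set F: the specifically named fields (RFC 2046 as amended
# by RFC 2298, per the text above) plus every field starting with "Content-".
# F_NAMES and in_f are hypothetical names used only for this illustration.

F_NAMES = frozenset(
    name.lower()
    for name in (
        # Named in RFC 2046 section 5.2.2.1
        "Subject",
        "Message-ID",
        "Encrypted",
        "MIME-Version",
        # Added by RFC 2298
        "Original-Recipient",
        "Disposition-Notification-To",
        "Disposition-Notification-Options",
    )
)


def in_f(field_name):
    """Return True if the header field name belongs to F."""
    name = field_name.strip().lower()
    return name.startswith("content-") or name in F_NAMES


for candidate in ("Content-Type", "message-id", "From", "X-Weird-Header-1"):
    print(candidate, in_f(candidate))
```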
The RFC 2046 text isn't exactly clear about fragmentation and original
message header fields, and the example doesn't show the original message
fields.
Presumably the original message was RFC 822 compliant and therefore had a
Date field and at least one originator field and at least one recipient field.
Those fields are not in F, nor do they appear among the "fields" in the first
piece's message body. However, the first piece's body does contain X-Weird-Header-1
and X-Weird-Header-2, which are not in F but have neither been copied to the
enclosing message header nor elided. So the first question is: which fields
from the original message (if any) are to be elided from the "fields" in the
first piece's body when fragmenting a large message?
Rule 2 refers to "the initial enclosing message" and to "the new message",
which in the case of generating the first piece appear to refer to the
same message and therefore that rule has a null effect in that case. Rule
3 provides for copying the set of fields F (and no others) from the
original message to the new message. That raises a second question: is
there a specific rule for the source of mandatory fields which must appear
in the header of each piece? [Obviously the Content-Type, MIME-Version, and
Message-ID fields will be generated by the entity performing the fragmentation.
Given the rules for reassembly, the originator (From, Sender, Reply-To, etc.),
recipient, and Date fields had better be copied from the original message, or
that information will be lost (i.e. the fragmentation/reassembly process won't
be transparent). But there doesn't appear to be a rule that says that that should
be done, and one interpretation of rule 3 is that copying To, From, Date, etc.
(i.e. fields not in F) to the first piece header is forbidden.]
Transparency raises a third issue: the process as demonstrated by the
example given *isn't* transparent; any semantics conveyed by X-Weird-Header-1
and/or X-Weird-Header-2 have been lost in the fragmentation/reassembly process.
Any such semantics (in the specific case of X- fields) would have to be agreed
among the sender and recipients, but might be unknown to intermediate entities,
including those that might fragment the original message or one of its pieces
while in transit. Something seems to be missing from the rules if transparency
is to be maintained as stated in the first paragraph of 5.2.2.1 (Rule 3
implies that if transparency is to be maintained, all fields not in the set F
that are in the original message should be copied (or moved) to the header of
the initial enclosing message when fragmenting -- but clearly that hasn't been
done in the example w.r.t. the X-Weird-Header-* fields, and there doesn't seem
to be an explicit rule covering this situation). Apart from the X- fields in the
example, this issue also affects standard and extension fields not in the set F,
such as Comments, Keywords, References, In-Reply-To, Resent- fields, etc.
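The loss can be made concrete with a short sketch that lists which fields of
an enclosed message rule 3 would drop on reassembly. The function name
dropped_by_rule_3 is invented for this illustration, and the In-Reply-To field
is a hypothetical addition to the RFC's example, included only to show that
standard fields are affected in the same way as X- fields.

```python
# Sketch: which enclosed-message fields does rule (3) drop on reassembly?
# Names are hypothetical; F follows the definition given earlier
# (RFC 2046 named fields, RFC 2298 additions, and all "Content-*" fields).

F_NAMES = {
    "subject", "message-id", "encrypted", "mime-version",
    "original-recipient", "disposition-notification-to",
    "disposition-notification-options",
}


def in_f(name):
    lowered = name.lower()
    return lowered.startswith("content-") or lowered in F_NAMES


def dropped_by_rule_3(enclosed_fields):
    """Return the enclosed fields that are ignored and dropped."""
    return [(n, v) for n, v in enclosed_fields if not in_f(n)]


enclosed = [
    ("X-Weird-Header-1", "Bar"),
    ("X-Weird-Header-2", "Hello"),
    # Hypothetical field, not part of the RFC's example.
    ("In-Reply-To", "<earlier@foo.com>"),
    ("Message-ID", "<anotherid@foo.com>"),
    ("Subject", "Audio mail"),
    ("MIME-Version", "1.0"),
    ("Content-type", "audio/basic"),
    ("Content-transfer-encoding", "base64"),
]

for name, value in dropped_by_rule_3(enclosed):
    print(f"lost: {name}: {value}")
```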
I believe that an additional rule resolves most of these issues rationally:
(5) All of the header fields from the enclosed message,
except those that start with "Content-" and the
specific header fields "Subject", "Message-ID",
"Encrypted", and "MIME-Version", MUST be copied, in
order, to the header of the initial enclosing message.
Fields so copied MAY be elided from the enclosed
message fields which appear in the body of the initial
piece, and those which are mandatory in a message
SHOULD be copied to the header of subsequent pieces.
All pieces MUST meet the message format requirements;
in particular, if the message format limits the number
of instances of a field type in a message and that
number of instances appeared in the enclosed message,
an entity generating pieces MUST NOT exceed the limit
by producing another instance of that field type in
the initial enclosing message header.
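A minimal sketch of how a fragmenting agent might apply the proposed rule (5)
when building the first piece follows. It is an illustration only: the
function names are invented, the contents of SINGLE_INSTANCE are an assumption
standing in for whatever limits the message format in use actually imposes,
and the original message is assumed to have carried From, To, and Date fields
(which the RFC's example does not show).

```python
# Sketch of proposed rule (5) on the fragmenting side.
# All names here are hypothetical. SINGLE_INSTANCE is an assumed stand-in
# for the per-message instance limits of the message format in use.

F_NAMES = {
    "subject", "message-id", "encrypted", "mime-version",
    "original-recipient", "disposition-notification-to",
    "disposition-notification-options",
}
SINGLE_INSTANCE = {"date", "from", "sender", "reply-to", "to", "cc", "bcc"}


def in_f(name):
    lowered = name.lower()
    return lowered.startswith("content-") or lowered in F_NAMES


def first_piece_header(original_fields, generated_fields):
    """Header of the initial enclosing message under proposed rule (5)."""
    # Copy, in order, every field of the enclosed message that is not in F.
    header = [f for f in original_fields if not in_f(f[0])]
    present = {name.lower() for name, _ in header}
    for name, value in generated_fields:
        # Do not add another instance of a limited field already copied.
        if name.lower() in SINGLE_INSTANCE and name.lower() in present:
            continue
        header.append((name, value))
    return header


def first_piece_body_fields(original_fields, elide=True):
    """Enclosed-message fields placed in the body of the first piece.

    Copied fields MAY be elided; with elide=False they are all retained.
    """
    if not elide:
        return list(original_fields)
    return [f for f in original_fields if in_f(f[0])]


original = [
    ("X-Weird-Header-1", "Bar"),
    ("X-Weird-Header-2", "Hello"),
    ("From", "Bill@host.com"),                            # assumed present
    ("To", "joe@otherhost.com"),                          # assumed present
    ("Date", "Fri, 26 Mar 1993 12:59:38 -0500 (EST)"),    # assumed present
    ("Message-ID", "<anotherid@foo.com>"),
    ("Subject", "Audio mail"),
    ("MIME-Version", "1.0"),
    ("Content-type", "audio/basic"),
    ("Content-transfer-encoding", "base64"),
]

generated = [
    ("Date", "Fri, 26 Mar 1993 13:05:00 -0500 (EST)"),    # skipped: Date already copied
    ("Subject", "Audio mail (part 1 of 2)"),
    ("Message-ID", "<id1@host.com>"),
    ("MIME-Version", "1.0"),
    ("Content-type", 'message/partial; id="ABC@host.com"; number=1; total=2'),
]

for name, value in first_piece_header(original, generated):
    print(f"{name}: {value}")
```

Fields added later by the transport (such as "X-Weird-Header-1: Foo" in the
example) are outside the scope of this sketch.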
In addition, rule 3 could be clarified by appending "by the reassembly
process" to the end of the last sentence.
That changes the example to:
If an audio message is broken into two pieces, the first piece might
look something like this:
X-Weird-Header-1: Foo
X-Weird-Header-1: Bar
X-Weird-Header-2: Hello
From: Bill@host.com
To: joe@otherhost.com
Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
Subject: Audio mail (part 1 of 2)
Message-ID: <id1@host.com>
MIME-Version: 1.0
Content-type: message/partial; id="ABC@host.com";
              number=1; total=2

X-Weird-Header-1: Bar
X-Weird-Header-2: Hello
Message-ID: <anotherid@foo.com>
Subject: Audio mail
MIME-Version: 1.0
Content-type: audio/basic
Content-transfer-encoding: base64

... first half of encoded audio data goes here ...
and the second half might look something like this:
From: Bill@host.com
To: joe@otherhost.com
Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
Subject: Audio mail (part 2 of 2)
MIME-Version: 1.0
Message-ID: <id2@host.com>
Content-type: message/partial;
              id="ABC@host.com"; number=2; total=2

... second half of encoded audio data goes here ...
Then, when the fragmented message is reassembled, the resulting
message to be displayed to the user should look something like this:
X-Weird-Header-1: Foo
X-Weird-Header-1: Bar
X-Weird-Header-2: Hello
From: Bill@host.com
To: joe@otherhost.com
Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
Subject: Audio mail
Message-ID: <anotherid@foo.com>
MIME-Version: 1.0
Content-type: audio/basic
Content-transfer-encoding: base64

... first half of encoded audio data goes here ...
... second half of encoded audio data goes here ...
This is the original message plus the fields provided by the fragmenting
entity and transport (in this case the "X-Weird-Header-1: Foo" field).