Re: RFC 2046 message/partial (sect. 5.2.2) clarification


Adam M. Costello wrote:

   (5)   All of the header fields from the enclosed message,



You mean the message to be enclosed; it hasn't yet been enclosed.


Yes.

         except those that start with "Content-" and the specific
         header fields "Subject", "Message-ID", "Encrypted", and
         "MIME-Version", MUST be copied, in order, to the header of
         the initial enclosing message.  Fields so copied MAY be
         elided from the enclosed message fields which appear in the
         body of the initial piece,



That looks like a correct deduction.  Well, I'm not so sure that the
order MUST be preserved.  They MUST be copied, and based on RFC 2822
ordering requirements, the order SHOULD be preserved, and the order of
trace and resent fields MUST be preserved.


OK (the text was copied and modified from rule 2, so the same objections
apply to rules 2 and 3).  I also assumed it's clear that the RFC 2298
rules still apply, i.e. the 3 fields mentioned in 2298 are in the same
category as Message-ID above.  This also raises another transparency
issue: if the original message (all fields to be preserved) has fields

 Subject
 Date
 Message-ID
 From
 MIME-Version
 To
 Content-Type
 References
 Disposition-Notification-To

in that order, the reassembled message will have the F and non-F fields
segregated rather than in the original order; the reassembled message
field order might be

 Date
 From
 To
 References
 Subject
 Message-ID
 MIME-Version
 Content-Type
 Disposition-Notification-To

I.e. if the "MUST be copied, in order" requirement is meant to support
transparency, it is inadequate to fully preserve it.  Weakening the
MUST to a SHOULD certainly wouldn't help :-)  Not a catastrophe, though.
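
The reordering can be reproduced with a small Python sketch of the RFC 2046
sect. 5.2.2.1 reassembly rules. It assumes that "F" fields are the ones taken
from the enclosed header (Content-*, Subject, Message-ID, Encrypted,
MIME-Version) and that the three RFC 2298 fields are treated the same way.
The function and variable names are made up for illustration, and only field
names are modelled, not values.

```python
# Illustrative sketch only; names are hypothetical, not from RFC 2046.

# Fields taken from the enclosed header on reassembly ("F" fields, by
# assumption), including the three fields named in RFC 2298.
F_NAMES = {
    "subject", "message-id", "encrypted", "mime-version",
    "disposition-notification-to",
    "disposition-notification-options",
    "original-recipient",
}


def is_f_field(name):
    """True if the field comes from the enclosed header on reassembly."""
    lowered = name.lower()
    return lowered.startswith("content-") or lowered in F_NAMES


def fragment_header(original):
    """Return (enclosing, enclosed) field-name lists for the first piece."""
    copied = [name for name in original if not is_f_field(name)]
    # The fragmentation agent supplies its own instances of these fields.
    enclosing = copied + ["Subject", "Message-ID", "MIME-Version", "Content-Type"]
    enclosed = list(original)  # nothing is elided in this sketch
    return enclosing, enclosed


def reassemble_header(enclosing, enclosed):
    """Non-F fields from the enclosing header, then F fields from the enclosed one."""
    kept_outer = [name for name in enclosing if not is_f_field(name)]
    kept_inner = [name for name in enclosed if is_f_field(name)]
    return kept_outer + kept_inner


ORIGINAL = [
    "Subject", "Date", "Message-ID", "From", "MIME-Version",
    "To", "Content-Type", "References", "Disposition-Notification-To",
]

ENCLOSING, ENCLOSED = fragment_header(ORIGINAL)
REASSEMBLED = reassemble_header(ENCLOSING, ENCLOSED)
print(REASSEMBLED)
```

Each group keeps its own relative order, but the interleaving of the two
groups in the original header is lost.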

         and those which are mandatory in a message SHOULD be copied
         to the header of subsequent pieces.



That might be a good recommendation, but it would be new; it's not
implied by the existing reassembly spec.  For example, when a message is
fragmented, the Date field is copied into the first enclosing header.
The Date fields of the other enclosing headers could be copies of that
one, or they could indicate the time that the fragmentation took place.
Either behavior seems reasonable to me, and both result in the correct
reassembled message.


OK, but there's a slightly related issue that 2046 doesn't address, viz.
transport of the pieces after fragmentation.  Obviously they should be
sent with the same forward and reverse path and mechanism as the
unfragmented original would have used had fragmentation not taken place.
Likewise, it seems desirable that the (mandatory) originator and
recipient header fields which each piece must have be copied from
the original. Specifically, it seems unreasonable for a fragmentation
agent to use bogus originator or recipient information in order to
satisfy the requirement that originator and recipient header fields be
present in each piece's header.

Various applications have different mandatory field requirements. For
example, a Path header field is required for a Usenet article (RFC 1036).
The Path field is rather important, and it would be unreasonable for a
fragmentation agent in a Usenet environment to use any Path field for
fragments other than that of the original.

There is also the issue of reassembly, which might have to be handled by
a UA rather than by some transport process. Using the same header fields
in subsequent pieces as in the first piece will tend to group the pieces
in UA display, making reassembly somewhat simpler.

Technically, per RFC 2822 section 3.6.1 the Date field in the pieces
should be the date-time when the piece was prepared (by the fragmentation
agent) for transport, but that precludes preserving the original message
date through the fragmentation/reassembly process.  And the UA grouping
issue also applies to the date-time.  If Date were in the same list as
Message-ID, this wouldn't be an issue; the RFC 2822 sect. 3.6.1 semantics
would be fine.  RFC 822 (the current Standard and the basis for 2046)
doesn't have normative text corresponding to 2822 sect. 3.6.1.  That
looks like it will be an issue for 2822 as it advances on the Standards
Track and/or for a 2046 revision.

         in particular, if the message format limits the number of
         instances of a field type in a message and that number
         of instances appeared in the enclosed message, an entity
         generating pieces MUST NOT exceed the limit by producing
         another instance of that field type in the initial enclosing
         message header.



The "in particular" connector doesn't make sense, because this is
talking about ensuring the correct formatting of the reassembled
message, not the pieces.

More importantly, it's overstated.  For example, the message format
limits the number of Date fields to 1, but an entity generating pieces
MAY put instances of Date in both the enclosed header and the first
enclosing header, because the instance in the enclosed header will be
ignored.  The restriction you describe applies only to F fields.


What I had in mind was the pieces and non-F fields.  If those are
to be preserved through the fragmentation/reassembly process, they
have to appear in the initial piece message header, and that precludes
the fragmentation agent from adding its own instance of any such
field. Whether such a field also appears in the enclosed message
header (i.e. in the body of the initial piece) is irrelevant, because,
as you say, it will be ignored on reassembly, and (being in the body)
is ignored during transport of the first piece. Date actually is a good
example because of the issue mentioned above: to preserve the original it
has to be placed in the initial piece message header, which violates
RFC 2822 section 3.6.1, and the fragmenting agent can't comply with
that provision of 2822 by adding another Date field because there can be
only one Date field. The same applies to other fields with limits, e.g.
References.  I wanted to guard against agents that perform fragmentation
together with some other process (e.g. Usenet moderation, mailing list
expansion, gateways of various types) and that add their own instance of
a field while also copying (or moving) an instance of the same field type
from the original message, where that would result in violating a limit.
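
A rough Python sketch of the kind of check I mean, applied to the field names
of the initial enclosing header. The set of once-only fields follows the
limits table in RFC 2822 sect. 3.6; the function name and the sample header
are hypothetical.

```python
from collections import Counter

# Fields that RFC 2822 sect. 3.6 allows at most once per header.
MAX_ONE = {
    "date", "from", "sender", "reply-to", "to", "cc", "bcc",
    "message-id", "in-reply-to", "references", "subject",
}


def over_limit(field_names):
    """Return the once-only field names that occur more than once."""
    counts = Counter(name.lower() for name in field_names)
    return sorted(n for n, c in counts.items() if n in MAX_ONE and c > 1)


# Non-F fields copied from the original, plus the agent's own fields.
copied = ["Date", "From", "To", "References"]
agent_ok = ["Subject", "Message-ID", "MIME-Version", "Content-Type"]
agent_bad = agent_ok + ["Date"]  # agent adds its own Date as well

print(over_limit(copied + agent_ok))
print(over_limit(copied + agent_bad))
```

The first header passes; the second is flagged because the agent's own Date
sits alongside the copied one.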

By the way, the assembled message will have Received fields reflecting
the path taken by the first part, but the paths taken by the other parts
will be lost.  Hmmm...


Yes, I thought of that. It seems to be unavoidable and mostly harmless.
If there's a need to examine trace fields (Return-Path as well as
Received), that would have to be done on the individual pieces
before reassembly.