Re: Content-Canonicalization: crlf?

I have recently been looking at what's involved in implementing MIME with
the new security multiparts draft and have come upon what seems to be a
general MIME ambiguity.  For a received message, it's not clear when one
should convert a body part from canonical line ending to local line ending.


Of course it isn't clear. How this is done depends completely on the
receiving system. The rules for doing this on UNIX, VMS, Macs, DOS, and
various other platforms are all different, and depend on the type of material
involved.

I doubt very much that you would want to know when conversion to counted
record formats is appropriate on VMS systems and when it isn't.

Current practice seems to be to let it be done by the mechanisms already in
place for converting non-MIME messages (e.g., the UNIX MTA).  This works
for body parts that have either no transfer encoding or a qp transfer
encoding.  Then, in the case of base64, the canonicalization is undone
explicitly for types text/* and message/rfc822.


MIME specifies the on-wire format. Once a message leaves this environment
any transformations done are completely outside the scope of what we
specify here. If the transformations are inconsistent with the types
involved (and on many systems they are) this has to be taken into account
when processing the message.
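
As a rough illustration of one such local policy (not anything MIME itself
specifies), a receiver on a UNIX-like system might undo the transfer
encoding and then map CRLF to LF only for line-oriented types. The function
name unpack_body, the choice of text/* and message/rfc822, and the default
LF terminator are all assumptions made for this sketch; a VMS or Mac
implementation would look quite different.

```python
import base64
import quopri

CRLF = b"\r\n"


def unpack_body(raw: bytes, content_type: str, encoding: str,
                local_eol: bytes = b"\n") -> bytes:
    """Decode a body part; convert CRLF to local_eol for line-oriented types only.

    This is a hypothetical local policy, not part of the on-wire format.
    """
    enc = encoding.strip().lower()
    if enc == "base64":
        data = base64.b64decode(raw)
    elif enc == "quoted-printable":
        data = quopri.decodestring(raw)
    else:
        # 7bit, 8bit, binary: nothing to undo at the transfer-encoding layer
        data = raw

    # Drop any parameters (e.g. "; charset=us-ascii") before comparing types
    ctype = content_type.split(";", 1)[0].strip().lower()
    if ctype.startswith("text/") or ctype == "message/rfc822":
        data = data.replace(CRLF, local_eol)
    return data
```

Anything that is not in the assumed line-oriented set, such as
application/octet-stream, comes back byte for byte as decoded.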

I've taken a look at RFC 1521 and the MIME conformance draft and it's very
clear how a message should be constructed, but it's not clear how a message
should be unpacked.  It seems this should be clarified at the very least.


I strongly disagree. We've been over this countless times, and found that
attempts to "clarify" end up having the exact opposite effect.

Another possible solution seems to me to be the introduction of a
content-canonicalization: header to indicate canonicalizations have been
applied and should be removed when unpacking.  I think this may be a good
solution since canonicalization is appropriate for several different
content-types (e.g., application/postscript is a candidate for line ending
canonicalization).  Also, canonicalization should not be associated with
transfer encoding.  At the moment line ending canonicalization is the only
obvious transformation, but there might be others.  Another one I can
think of is conversion to IEEE standards for binary representation of
floating point numbers.


Adding such a field to the on-wire format would complicate MIME immensely and
provide no benefit that I can see. Are you prepared to implement support for
big- and little-endian word counted material in your product? Not to mention
CR, LF, and CRLF terminators, fixed length records, and any of a variety
of other formats used on various platforms today. I doubt it. Yet
these are the formats I deal with in *my* product every day, and it would be
great if I could copy them verbatim out onto the network without any
transformation.

The entire point of defining a canonical form is so that we don't have issues
like these when we process material. Each type of object has a single
representation, and if that doesn't agree with what a given system uses
it has to be changed before it will work.
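
For stream-style text, the sending half of that conversion can be sketched
in a few lines of Python. The helper name to_canonical_text is hypothetical,
and the sketch assumes the local file uses CR, LF, or CRLF terminators;
counted or fixed-length record formats would need their own conversion.

```python
import re

# Match CRLF first so an existing CRLF is not rewritten as two line breaks
_LINE_BREAK = re.compile(rb"\r\n|\r|\n")


def to_canonical_text(local: bytes) -> bytes:
    """Rewrite bare CR, bare LF, and CRLF line breaks as canonical CRLF."""
    return _LINE_BREAK.sub(b"\r\n", local)
```

The result is what would then be handed to the transfer encoding step; the
receiver only ever sees the CRLF form, whatever the sender stored locally.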

Local implementation of such a field might make sense, but any such
implementation is a local matter by definition and has no business being
in a standards-track document.

                                Ned