ietf-822

Comment on encapsulation BNF in latest RFC-XXXX

1991-10-24 10:58:28
At the Atlanta IETF meeting, Ned agreed with me that there was a
need to have a CRLF before each boundary marker (except the first one).
Apparently, the suggestion was not incorporated into the latest RFC-XXXX.

The current RFC-XXXX multipart boundary marker is still based on the
concept of "line."  That is, the boundary marker must be at the beginning
of a line.  This mechanism works fine under the current SMTP restrictions.
But I would like to see this mechanism also work well for unencoded binary
data.

Unencoded binary data is not always terminated by CRLF; the same is
sometimes true of text data.  The current boundary marker syntax requires
a CRLF to be appended to the contents if they do not already end with
one.  This CRLF padding is irreversible: the recipient UA cannot tell
whether the final CRLF was added as padding or was part of the original
contents.  The padding does not distort text data very much, but it may
significantly distort binary data, especially when the binary data has a
built-in checksum.
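
The ambiguity can be sketched in a few lines of Python.  This is only an
illustration, not part of the draft: the helper name encapsulate_current
is hypothetical, and it models just the contents plus the following
boundary line, ignoring the body-part headers.

```python
CRLF = b"\r\n"

def encapsulate_current(content: bytes, boundary: bytes) -> bytes:
    """Hypothetical helper modelling the current rule: pad the contents
    with CRLF only when they do not already end with one, then emit the
    boundary marker at the beginning of the next line."""
    if not content.endswith(CRLF):
        content += CRLF  # the padding the recipient cannot detect
    return content + b"--" + boundary + CRLF

boundary = b"OILKJI9823ljsd98u3"
without_crlf = encapsulate_current(b"\x00\x01\x19", boundary)
with_crlf = encapsulate_current(b"\x00\x01\x19\r\n", boundary)

# Two different payloads produce the same bytes on the wire, so the
# recipient cannot recover which one was sent.
assert without_crlf == with_crlf
```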

Here is an example of binary data that is not terminated by CRLF, using
the current boundary marker syntax:


From: lau(_at_)Eng(_dot_)Sun(_dot_)COM
To: ietf-822
Content-Type: multipart; OILKJI9823ljsd98u3

--OILKJI9823ljsd98u3
Content-Type: BINARY/x-something
Content-Transfer-Encoding: binary

^@^A^Y8^&ii^%^E^@^A^Y8^&ii^%^E^@^L^YI^Pik^@^A^Y8^&ii^%^X
--OILKJI9823ljsd98u3
Content-Type: TEXT/us-ascii

As you see, it is unclear if the CRLF is padded or originally belongs to
the binary contents.  It is irreversible and may severely damage the binary
contents.
--OILKJI9823ljsd98u3--


By changing the BNF of "encapsulation" to

body := 1*encapsulation close-delimiter

encapsulation := delimiter CRLF message-encapsulation CRLF
                                                      ^^^^
delimiter := "--" <boundary-spec from Content-type field>


The new boundary markers for binary and text body parts will look like:


From: lau(_at_)Eng(_dot_)Sun(_dot_)COM
To: ietf-822
Content-Type: multipart; OILKJI9823ljsd98u3

--OILKJI9823ljsd98u3
Content-Type: BINARY/x-something
Content-Transfer-Encoding: binary
 
^@^A^Y8^&ii^%^E^@^A^Y8^&ii^%^E^@^L^YI^Pik^@^A^Y8^&ii^%^X
--OILKJI9823ljsd98u3
Content-Type: TEXT/us-ascii
  
The boundary marker is still at the beginning of the line.  The CRLF
after "^X" is part of the encapsulation syntax, not the contents.  So,
CRLF padding becomes reversible.

--OILKJI9823ljsd98u3--
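
A rough Python sketch of why the proposed rule is reversible.  The helper
names encapsulate_proposed and extract_proposed are hypothetical; the
sketch models only the contents and the following boundary line, and
assumes the boundary string never occurs inside the contents.

```python
CRLF = b"\r\n"

def encapsulate_proposed(content: bytes, boundary: bytes) -> bytes:
    """Hypothetical helper modelling the proposed rule: always append one
    CRLF that belongs to the encapsulation syntax, then the boundary."""
    return content + CRLF + b"--" + boundary + CRLF

def extract_proposed(stream: bytes, boundary: bytes) -> bytes:
    """Recover the contents by cutting at CRLF + "--" + boundary, which
    removes exactly the one CRLF added by the sender."""
    marker = CRLF + b"--" + boundary
    return stream[:stream.index(marker)]

boundary = b"OILKJI9823ljsd98u3"
for payload in (b"\x00\x01\x19", b"\x00\x01\x19\r\n", b"", b"text\r"):
    wire = encapsulate_proposed(payload, boundary)
    # Every payload survives the round trip byte for byte, whether or
    # not it originally ended with CRLF.
    assert extract_proposed(wire, boundary) == payload
```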


-Vincent