ietf-822

Newline problem: Another stab

1992-03-03 07:47:31
Sigh.  Why does this one little thing seem to be so hard?

Excerpts from internet.ietf-822: 3-Mar-92 Re: The newlines problem...
Erik M. van der Poel(_at_)sra (1777)

The above paragraph could be made clearer by indicating that "text"
must be converted to the Internet CRLF format before applying the
quoted-printable encoding, and that other, non-text objects should not
be converted to CRLF format before q-p'ing.

Nathaniel's objection that UNIX's sendmail expects text in the local
newline format is resolved by your comment (in the 2nd diagram) that
the output of the quoted-printable encoder should be in local newline
format. This is just a repeat of what others have said about "adding
another level" of encoding. Are we going around in circles again? 
Again, I'm sorry.

This doesn't make sense to me.  Now it sounds like you're proposing the
following -- using UNIX as an example, but far from the only one:

1.  UNIX UA lets user compose mail, with LF for newlines.

2.  Before QP encoding, LFs are changed to CRLFs.  (This is the step I
believe to be wrong.)

3.  QP encoding takes place.

4.  QP output has CRLFs changed BACK TO LF(!!!).

5.  Sendmail gets the mail with LFs and changes them to CRLFs.

Well, this works, I guess, but it strikes me as mind-bogglingly silly,
especially for UAs that have the QP encoding built in.  In that case, it
seems to me that it makes a lot more sense to just skip steps 2 and 4,
since step 4 simply undoes step 2.
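
The redundancy of steps 2 and 4 can be sketched in Python.  This is only
an illustration under stated assumptions: qp_encode is a hypothetical,
simplified encoder (no soft line breaks, no trailing-whitespace rule)
that passes line breaks through untouched, and the final replace() is a
stand-in for what sendmail does in step 5.

```python
def qp_encode(text: bytes, newline: bytes) -> bytes:
    """Simplified quoted-printable encoder (hypothetical helper).

    Escapes "=", control characters other than TAB, and bytes above 0x7E
    as "=XX".  Soft line breaks and trailing-whitespace handling are
    omitted.  Whatever `newline` is, it passes through unchanged.
    """
    lines = []
    for line in text.split(newline):
        encoded = bytearray()
        for byte in line:
            printable = (0x20 <= byte <= 0x7E and byte != 0x3D) or byte == 0x09
            encoded += bytes([byte]) if printable else b"=%02X" % byte
        lines.append(bytes(encoded))
    return newline.join(lines)


def five_step(body_lf: bytes) -> bytes:
    """Steps 2-5 as listed above, starting from an LF-delimited body."""
    as_crlf = body_lf.replace(b"\n", b"\r\n")      # step 2: LF -> CRLF
    encoded = qp_encode(as_crlf, b"\r\n")          # step 3: QP encode
    back_to_lf = encoded.replace(b"\r\n", b"\n")   # step 4: CRLF -> LF
    return back_to_lf.replace(b"\n", b"\r\n")      # step 5: stand-in for sendmail


def three_step(body_lf: bytes) -> bytes:
    """The same pipeline with steps 2 and 4 skipped."""
    encoded = qp_encode(body_lf, b"\n")            # step 3: QP encode
    return encoded.replace(b"\n", b"\r\n")         # step 5: stand-in for sendmail


body = b"caf\xe9 costs = 1\nsecond line\n"
assert five_step(body) == three_step(body)
```

Under these assumptions both pipelines hand identical bytes to the
transport, which is the sense in which steps 2 and 4 cancel out.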

Here's another proposal.  How about if we added a couple of paragraphs
something like this (inspired by Dave & Einar's comments, but my fault
if the details are wrong):

LINE BREAKS IN QUOTED-PRINTABLE TEXT.  The intent of the
quoted-printable encoding is to effect NO CHANGE on line breaks.  This
means that however line breaks are represented in plain text, that is
how they are represented in quoted-printable encoded data.  This permits
all of the mechanisms that have evolved for dealing with line breaks in
Internet mail, despite the divergence of local representations for line
breaks, to continue to work with quoted-printable data.  The only thing
in the quoted-printable encoding that affects the interpretation of line
breaks is the notion of soft line breaks, which causes certain line
breaks to become non-significant in decoding quoted-printable-encoded
data.  Even here, the notion of what represents a non-significant line
break, as with significant line breaks, is to be precisely the same as
the representation of a line break in textual data.
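
As a rough illustration of the paragraph above, here is a minimal Python
sketch in which the encoder and decoder take the local line-break
representation as a parameter and use it for both hard and soft line
breaks.  The names qp_encode and qp_decode and the 76-character limit
parameter are assumptions made for the example, and the trailing
whitespace rule is left out.

```python
import re


def qp_encode(text: bytes, newline: bytes, limit: int = 76) -> bytes:
    """Encode `text`, leaving hard line breaks exactly as `newline`."""
    out = []
    for line in text.split(newline):
        enc = b"".join(
            bytes([b]) if (0x20 <= b <= 0x7E and b != 0x3D) or b == 0x09
            else b"=%02X" % b
            for b in line
        )
        while len(enc) > limit:
            cut = limit - 1                  # leave room for the trailing "="
            # Do not split an "=XX" escape across two lines.
            if enc[cut - 1:cut] == b"=":
                cut -= 1
            elif enc[cut - 2:cut - 1] == b"=":
                cut -= 2
            out.append(enc[:cut] + b"=")     # soft break: "=" then `newline`
            enc = enc[cut:]
        out.append(enc)
    return newline.join(out)


def qp_decode(data: bytes, newline: bytes) -> bytes:
    """Drop soft line breaks, decode "=XX", keep hard line breaks as-is."""
    data = data.replace(b"=" + newline, b"")
    return re.sub(rb"=([0-9A-F]{2})",
                  lambda m: bytes([int(m.group(1), 16)]), data)


# The same code round-trips in an LF enclave and in a CRLF one.
for nl in (b"\n", b"\r\n"):
    text = b"x" * 200 + nl + b"caf\xe9 = ok" + nl
    assert qp_decode(qp_encode(text, nl), nl) == text
```

Nothing in the sketch converts between representations; the only
line-break logic is the "=" marker that makes a break non-significant.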

EXPLANATORY NOTE:  While RFC 821 and RFC 822 are quite clear that CRLF is
the representation for line breaks, existing practice has clearly
established large enclaves in which the representation format is
otherwise. It is not the intent of the quoted-printable encoding to
require ANY special treatment for line breaks in such enclaves.  For
that reason, it is specified that line breaks in quoted-printable data
be treated precisely as line breaks in plain ASCII text mail.  If this
is inadequate for the transport of certain data types -- and it will be
inadequate for non-line-oriented, binary data -- the base64 encoding
should be used for such data.
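
The point about binary data can be sketched in Python using the
standard base64 module.  The function lf_to_crlf_transport is a
hypothetical stand-in for any hop that rewrites line breaks; it is an
assumption for the example, not a description of a particular mailer.

```python
import base64


def lf_to_crlf_transport(message: bytes) -> bytes:
    """Hypothetical stand-in for a hop that rewrites line breaks as CRLF."""
    return message.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


binary = bytes(range(256))      # contains bare 0x0A and 0x0D bytes

# Treated as line-oriented text, the data is altered by the rewrite.
assert lf_to_crlf_transport(binary) != binary

# base64 output uses only letters, digits, "+", "/" and "=", so rewriting
# the line breaks between its lines does not touch the payload.
encoded = base64.encodebytes(binary)
received = lf_to_crlf_transport(encoded)
assert base64.b64decode(received) == binary
```

The decoder discards the line breaks entirely, so it does not matter
which representation they arrive in.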

Does that help?  Can people live with something like that?  -- Nathaniel
