
Re: Revisiting RFC 2822 grammar (quoted-pair)

2004-01-18 06:29:58

Bruce Lilly <blilly(_at_)verizon(_dot_)net> wrote:

Adam M. Costello wrote:

Consider the following example:

From: "foo\
bar" <blah(_at_)example>

You have misquoted me (or more likely, your MUA has).  The example was:

From: "foo\
 bar" <blah(_at_)example>

The space before "bar" is crucial, because the alternative (without the
space) is another equally interesting example:

From: "foo\
bar" <blah(_at_)example>

In fact, when you presented your own example, I thought you omitted the
space deliberately:

From: "Foo Bar" <"foo\

But now I suspect that your MUA ate the space, and what you originally
intended was:

From: "Foo Bar" <"foo\

But in any case both are worth considering.

I'd interpret that example as a field which is not folded (the
backslash-escaped CRLF is not line folding because it is escaped) in
which the quoted-string contains a CRLF.

We can certainly find evidence in RFC 822 to support that
interpretation, but we can also find evidence to doubt it.  As I
mentioned, 3.4.8 says "Each header field may be represented on exactly
one line consisting of the name of the field and its body, and
terminated by a CRLF; this is what the parser sees."  The proposed
interpretation has the parser seeing the field as two lines (because
the parser sees the line break inside the quoted-string).  Also, what
could 3.4.5 be talking about when it says "Quoted CRLFs (i.e., a
backslash followed by a CR followed by a LF) are also subject to rules
of folding"?
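
To make the two readings concrete, here is a rough Python sketch. The
function name unfold and its flag are my own inventions, not anything
from RFC 822. It applies the 3.1.1 unfolding rule (CRLF immediately
followed by a LWSP-char is equivalent to the LWSP-char) and either
exempts a backslash-escaped CRLF or does not. As a simplification it
ignores the case of an escaped backslash directly before the CRLF.

```python
import re


def unfold(field: str, escaped_crlf_is_fold: bool) -> str:
    """Unfold a header field: CRLF + LWSP-char becomes the LWSP-char.

    If escaped_crlf_is_fold is False, a CRLF directly preceded by a
    backslash is left untouched (assumption: that backslash is not
    itself escaped; this sketch does not check for that).
    """
    if escaped_crlf_is_fold:
        pattern = r"\r\n([ \t])"
    else:
        pattern = r"(?<!\\)\r\n([ \t])"
    return re.sub(pattern, r"\1", field)


# The example with the space, as it appears in this archive (the
# terminating CRLF is left off for clarity).
field = 'From: "foo\\\r\n bar" <blah(_at_)example>'

not_a_fold = unfold(field, escaped_crlf_is_fold=False)
is_a_fold = unfold(field, escaped_crlf_is_fold=True)
```

Under the first reading the result still contains a line break, which
is the conflict with 3.4.8.  Under the second the parser sees a single
line, but the backslash now quotes the space instead of the CRLF.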

Finally, is it valid not to have a space after \CRLF?  As in:

From: "Foo Bar" <"foo\

The grammar allows it, and if we go with the proposed interpretation
that \CRLF is not an instance of folding, then what would require a
space after the \CRLF?

There is a related question for the other example:

From: "Foo Bar" <"foo\

Is the space part of the local part?  3.4.5 says "Stripping off the
first following LWSP-char is also appropriate when parsing quoted CRLFs"
(the "also" means "similar to the case of unquoted CRLFs").  But why?
Why put the space in only to have it stripped out again?  The only
reasonable explanation is that the space is required (this gets back to
the previous question).  Only if it's required do you need a rule about
stripping it, so that you can have a quoted-string whose meaning is
fooCRLFbar with no space.
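
A small sketch of what that stripping rule would mean for a parser, in
Python.  The function decode_quoted and its strip_lwsp flag are
hypothetical names of mine; the code only illustrates the reading of
3.4.5 discussed above and is not a full RFC 822 parser.

```python
CRLF = "\r\n"
LWSP = " \t"


def decode_quoted(content: str, strip_lwsp: bool) -> str:
    """Decode the text between the double quotes of a quoted-string.

    A backslash followed by CRLF yields a literal CRLF.  If strip_lwsp
    is True, the first LWSP-char after that CRLF is dropped, as 3.4.5
    suggests.  Any other backslash quotes the next character.
    """
    out = []
    i = 0
    while i < len(content):
        ch = content[i]
        if ch == "\\" and content.startswith(CRLF, i + 1):
            out.append(CRLF)
            i += 3  # skip the backslash, CR, and LF
            if strip_lwsp and i < len(content) and content[i] in LWSP:
                i += 1  # drop the first following LWSP-char
        elif ch == "\\" and i + 1 < len(content):
            out.append(content[i + 1])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


with_space = "foo\\\r\n bar"
stripped = decode_quoted(with_space, strip_lwsp=True)
kept = decode_quoted(with_space, strip_lwsp=False)
```

With stripping, the value is fooCRLFbar with no space; without it, the
space survives as part of the value.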

We now have two arguments that space is required after \CRLF, but that
in turn argues that \ CR LF LWSP-char is indeed some sort of folding.
Maybe it's a fold that cannot be unfolded.  Except there's still that
pesky statement in 3.4.8 that "Each header field may be represented on
exactly one line".

I don't expect us to be able to settle this.  I think RFC 822 is not
self-consistent on this issue.

If backslash-escaped CR and LF (ignoring NUL for the moment) were
permitted, one could have:

From: "foo\CR\LFbar" <blah(_at_)example>

etc., which ought to present no problems;

Until it gets converted to the local line-ending conventions.  Your
example contains all three possibilities: CR not followed by LF, LF not
preceded by CR, and CRLF (terminating the field).  Imagine saving this
message to an mbox file on a Unix machine, where lines are terminated
by LF.  How will you do it?  Normally CRLF gets translated to LF, but
that's not reversible if the input already contains LF not preceded by
CR.
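
Here is a quick Python illustration of the round trip.  The helper
names wire_to_unix and unix_to_wire are made up for this sketch, and
the naive "every LF becomes CRLF" inverse is an assumption about how a
typical conversion would behave, not a description of any particular
MUA.

```python
def wire_to_unix(data: bytes) -> bytes:
    """Translate CRLF to bare LF, as when saving to a Unix mbox."""
    return data.replace(b"\r\n", b"\n")


def unix_to_wire(data: bytes) -> bytes:
    """Naive inverse: turn every LF back into CRLF."""
    return data.replace(b"\n", b"\r\n")


# The field from the example: backslash CR, backslash LF, and a
# terminating CRLF (address shown as obfuscated in this archive).
field = b'From: "foo\\\r\\\nbar" <blah(_at_)example>\r\n'

restored = unix_to_wire(wire_to_unix(field))
round_trip_ok = restored == field
```

The escaped bare LF comes back as CR LF, so the restored field is no
longer the one that was received.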

Other problems:  How would this field display?  Could it be cut and

I think any sort of control characters in header fields, other than
CRLF (as a unit) and maybe TAB, is asking for headaches.  Even TAB is
somewhat troublesome.

But let's clearly document the changes from 822!



P.S. The tendency of your MUA to drop spaces at the beginnings of lines
is probably related to its use (or misuse) of format=flowed.

I notice that each of your paragraphs consists of multiple "paragraphs"
in the
format=flowed sense, so that they don't actually flow, but instead end
up looking
like this paragraph.