Re: Revisiting RFC 2822 grammar (quoted-pair)

2004-01-18 03:52:43

Adam M. Costello wrote:

Bruce Lilly <blilly(_at_)verizon(_dot_)net> wrote:

Moreover, the semantics of \CRLF as given in RFC 822 are quite
different from \CR followed by a lone LF.

I'm having trouble making sense of \CRLF in RFC 822.

Consider the following example:

From: "foo\
 bar" <blah(_at_)example>

This could not have been created by starting with a one-line field
and then folding it, because 3.1.1 says that folding may happen
"wherever there may be linear-white-space (NOT simply LWSP-chars)".
Linear-white-space is allowed in qtext, but not in quoted-pair, so there
would be no way for the folding process to wedge a CRLF between the
backslash and the next CHAR.

The only way to parse this field, according to the grammar, is as:

"From" ":" <"> CHAR CHAR CHAR quoted-pair CHAR CHAR CHAR CHAR CHAR <"> ...
From   :   "   f    o    o       \CR      LF   SP   b    a    r    "  ...

There is no way to parse the field in a way that involves the
linear-white-space token, and no way to parse it in a way that involves
the CRLF token.
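
The parse above can be sketched in Python as a small tokenizer for the
inside of an 822 quoted-string.  The function name
tokenize_quoted_string and the token labels are hypothetical, chosen
only for illustration; the rules are one reading of the qtext,
quoted-pair, and linear-white-space productions, not code taken from any
existing library.

```python
def tokenize_quoted_string(body):
    """Split the text between the quotes into (kind, text) tokens.

    Assumed reading of RFC 822: quoted-pair is a backslash plus any one
    CHAR; CRLF followed by SP or HTAB is a fold; bare CR and an
    unescaped quote are not valid qtext.
    """
    tokens = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            # quoted-pair = "\" CHAR: the backslash claims exactly one CHAR.
            if i + 1 >= len(body):
                raise ValueError("dangling backslash")
            tokens.append(("quoted-pair", body[i:i + 2]))
            i += 2
        elif body.startswith("\r\n", i) and i + 2 < len(body) and body[i + 2] in " \t":
            # [CRLF] LWSP-char: the folding form of linear-white-space.
            tokens.append(("fold", body[i:i + 3]))
            i += 3
        elif ch in '"\r':
            raise ValueError("character not allowed in qtext: %r" % ch)
        else:
            # Everything else, including a lone LF or SP, is a qtext CHAR.
            tokens.append(("qtext", ch))
            i += 1
    return tokens


# The field body from the example: foo, backslash, CR, LF, SP, bar.
escaped = tokenize_quoted_string("foo\\\r\n bar")
# The same text without the backslash, i.e. an ordinary fold.
folded = tokenize_quoted_string("foo\r\n bar")
```

Under these assumptions the escaped form yields a quoted-pair holding
backslash plus CR, followed by LF and SP as ordinary CHARs, and never a
fold token; the unescaped form yields a single fold token instead.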

Question:  Can this field be unfolded?

I'd interpret that example as a field which is not folded (the
backslash-escaped CRLF is not line folding, because it is escaped) in
which the quoted-string contains a CRLF.  That CRLF is escaped so that
it is preserved for the application (an unescaped CRLF would be part of
line folding, which would not be visible to the application).  Consider
as an alternative example:

From: "Foo Bar" <"foo\

[ignoring for the moment whether or not a CRLF in a local-part is either sensible or

This all seems like a big mess that should be deprecated.  And indeed,
in RFC 2822, lone CR, lone LF, and \CR are all relegated to obsolete
syntax.  I'm thinking that was a good move.

Perhaps it should be deprecated; on the other hand, 2822 provides no
mechanism for passing CR, LF, or NUL at the application layer, as
quoting is not permitted for those octets (except via obs- constructs,
which may not be used for message generation).  If backslash-escaped CR
and LF (ignoring NUL for the moment) were permitted, one could have:

From: "foo\CR\LFbar" <blah(_at_)example>

etc., which ought to present no problems: there is no explicit CRLF pair
on the wire, so folding/unfolding isn't an issue; the application layer
still gets the CR and LF octets when parsing the quoted string; and it
is backwards-compatible (i.e. that was legal in 822 and the semantics
are unchanged).
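
A minimal Python sketch of that claim, assuming the usual unquoting
rule (drop each backslash, keep the CHAR after it).  The helper name
unquote is hypothetical and is not taken from any particular library.

```python
def unquote(body):
    """Remove quoted-pair backslashes from the inside of a quoted-string."""
    out = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            # quoted-pair: keep only the escaped CHAR.
            out.append(body[i + 1])
            i += 2
        else:
            # Ordinary character (a trailing lone backslash is kept as-is).
            out.append(body[i])
            i += 1
    return "".join(out)


# Octets between the quotes in  "foo\CR\LFbar"
wire = "foo\\\r\\\nbar"

# No adjacent CR LF pair on the wire, so nothing here looks like a fold.
has_crlf_on_wire = "\r\n" in wire

# After unquoting, the application sees foo, CR, LF, bar.
value = unquote(wire)
```

The backslash between CR and LF is what keeps the pair from being
adjacent on the wire; the application-level value still contains both
octets in order.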

But let's clearly document the changes from 822!

As for the obsolete grammar, parsing \CRLF as \CR followed by LF is
consistent with the 822 grammar, even if it doesn't seem to jibe with
the phrase "quoted CRLF" in the prose.

Maybe; handling of WSP after \CRLF would seem to be somewhat different
(I agree with you that 822 isn't quite clear about that).