Re: Revisiting RFC 2822 grammar (quoted-pair)

2004-01-16 09:32:50

On 1/16/04 at 2:43 AM -0500, Bruce Lilly wrote:

Pete Resnick wrote:

On 1/15/04 at 7:09 PM +0100, Arnt Gulbrandsen wrote:

1. Bruce eliminates obs-qp, even though it can match a few pairs quoted-pair cannot, such as "\" DEL.

Yup, I think that's Bruce's mistake.

Of course, as Bruce points out, DEL is in there, so this is no problem.

On 1/15/04 at 9:59 PM +0000, Charles Lindsey wrote:
In <3FF7A5FC(_dot_)9080804(_at_)verizon(_dot_)net> blilly(_at_)verizon(_dot_)net writes:

quoted-pair     =       ("\" text)
[N.B. had redundant obs-qp alternative]

I think not. The obs version allows \NUL, \CR and \LF, which the regular version does not.

RFC 2822 gives quoted-pair as:

quoted-pair     =       ("\" text) / obs-qp

and text as:

text            =       %d1-9 /         ; Characters excluding CR and LF
                       %d11 /
                       %d12 /
                       %d14-127 /
                       obs-text

and obs-text (N.B. included in text) as:

obs-text        =       *LF *CR *(obs-char *LF *CR)

with obs-char defined as:

obs-char        =       %d0-9 / %d11 /          ; %d0-127 except CR and
                       %d12 / %d14-127         ;  LF

DEL is %d127, which is explicitly included in text and obs-char. Indeed, text explicitly includes every US-ASCII character by value except NUL, CR, and LF, and includes those as well via obs-text (explicitly permitting a single CR or LF) and obs-char (which includes NUL). I don't see any single US-ASCII character which quoted-pair doesn't permit after the backslash, without having to resort to obs-qp.

Maybe that wasn't intended by RFC 2822, but there it is.
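
The character-by-character claim above can be checked mechanically. What follows is a minimal Python sketch, assuming a regex transcription of the text, obs-text and obs-char rules as quoted; the names OBS_CHAR, OBS_TEXT, TEXT and matches_quoted_pair are illustrative and do not come from the RFC. It tests "\" followed by each of the 128 US-ASCII values against quoted-pair with the obs-qp alternative removed.

```python
import re

# Regex transcriptions of the ABNF rules quoted above (an assumption made
# for illustration; the RFC itself defines only the ABNF).
OBS_CHAR = r"[\x00-\x09\x0b\x0c\x0e-\x7f]"           # %d0-127 except CR and LF
OBS_TEXT = rf"\n*\r*(?:{OBS_CHAR}\n*\r*)*"           # *LF *CR *(obs-char *LF *CR)
TEXT = rf"(?:[\x01-\x09\x0b\x0c\x0e-\x7f]|{OBS_TEXT})"

# quoted-pair = ("\" text), i.e. without the obs-qp alternative
QUOTED_PAIR_NO_OBS_QP = re.compile(rf"\\{TEXT}")


def matches_quoted_pair(s: str) -> bool:
    """Return True if the whole string s matches "\\" text."""
    return QUOTED_PAIR_NO_OBS_QP.fullmatch(s) is not None


# Collect every US-ASCII value that is NOT accepted after the backslash.
unmatched = [c for c in range(128) if not matches_quoted_pair("\\" + chr(c))]
print(unmatched)
```

Under this transcription the list comes out empty: NUL is accepted through obs-char, and CR and LF through the leading *LF *CR of obs-text.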

Yup, you're right, but that wasn't intended by 2822. Or more to the point, it's not obvious to me that it was intended that bare CR or bare LF should appear in the obs- version of body, which is the result.

As an aside, note that obs-text may consist of multiple octets, so "\foo" could be considered a quoted-pair ("foo" matches obs-text via *(obs-char *LF *CR)). I believe that's a problem with 2822.

I agree that it is a bug.
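
The multi-octet case can be illustrated in the same way. This is a sketch only, assuming a regex transcription of obs-text and obs-char as quoted earlier; QP_VIA_OBS_TEXT and is_quoted_pair are hypothetical names. Because text includes obs-text, anything matching "\" obs-text also matches quoted-pair.

```python
import re

# Regex transcriptions of the quoted ABNF (illustrative assumption).
OBS_CHAR = r"[\x00-\x09\x0b\x0c\x0e-\x7f]"           # %d0-127 except CR and LF
OBS_TEXT = rf"\n*\r*(?:{OBS_CHAR}\n*\r*)*"           # *LF *CR *(obs-char *LF *CR)

# Only the obs-text branch of text is exercised here.
QP_VIA_OBS_TEXT = re.compile(rf"\\{OBS_TEXT}")


def is_quoted_pair(s: str) -> bool:
    """Return True if the whole string s matches "\\" obs-text."""
    return QP_VIA_OBS_TEXT.fullmatch(s) is not None


single = is_quoted_pair("\\f")     # backslash plus one octet
multi = is_quoted_pair("\\foo")    # backslash plus three octets
print(single, multi)
```

Both calls return True under this transcription, so a parser that follows the grammar literally cannot tell from the syntax alone where a quoted-pair ends.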

On the other hand, 822 specifically mentioned the multi-character \CRLF and gave its semantics, but 2822 doesn't seem to permit that.

\CRLF is a quoted pair with CR followed by a bare LF. The current syntax of 2822 permits that, leaving aside the question of whether or not it should.
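
That reading of \CRLF can be sketched in Python as follows, assuming a regex transcription of the text, obs-text and obs-char rules as quoted earlier; QUOTED_PAIR and longest_quoted_pair_prefix are hypothetical names. The sketch checks only the quoted-pair rule, and says nothing about whether the surrounding syntax accepts the leftover LF.

```python
import re

# Regex transcriptions of the quoted ABNF (illustrative assumption).
OBS_CHAR = r"[\x00-\x09\x0b\x0c\x0e-\x7f]"           # %d0-127 except CR and LF
OBS_TEXT = rf"\n*\r*(?:{OBS_CHAR}\n*\r*)*"           # *LF *CR *(obs-char *LF *CR)
TEXT = rf"(?:[\x01-\x09\x0b\x0c\x0e-\x7f]|{OBS_TEXT})"
QUOTED_PAIR = re.compile(rf"\\{TEXT}")               # ("\" text), obs-qp omitted


def longest_quoted_pair_prefix(s: str) -> int:
    """Length of the longest prefix of s that matches quoted-pair, or 0."""
    lengths = [i for i in range(1, len(s) + 1) if QUOTED_PAIR.fullmatch(s[:i])]
    return max(lengths, default=0)


s = "\\\r\n"                                         # backslash, CR, LF
whole = QUOTED_PAIR.fullmatch(s) is not None         # is "\" CR LF one quoted-pair?
consumed = longest_quoted_pair_prefix(s)
rest = s[consumed:]
print(whole, consumed, repr(rest))
```

Here the three-octet sequence as a whole is not a quoted-pair, because obs-text places *LF before *CR. The longest matching prefix is "\" CR, which leaves a bare LF behind.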

