ietf-822

Re: Revisiting RFC 2822 grammar (obs-utext and unstructured)

2004-02-05 05:12:40

In <4021BB72(_dot_)6010106(_at_)verizon(_dot_)net> Bruce Lilly 
<blilly(_at_)verizon(_dot_)net> writes:

Charles Lindsey wrote:

Yes, but that is getting a long way from what seems to be the established
convention that *text things consist of just a single character (or
perhaps a single character with some naked CR or LF attached).

?!?  "text" should instantiate a single character, and it does both
in 2822 and in the revised grammar.  "*text" by definition (see RFC
2234) means any number (0 to infinity) of occurrences of "text" and
therefore could not possibly be restricted to a single character.

Oh dear! You misread what I wrote. When I said "*text", I didn't mean
the ABNF repetition "*text"; I meant the set of rules named 'utext',
'ctext', 'dtext', etc., which all have the property that they produce
just a single character. Your latest offering seemed to be breaking with
that convention.

And I don't think we want two adjacent FWS. Your revised grammar went to
much trouble to avoid adjacent CFWS (or FWS in some cases), and that was
seen as a great improvement. Now they seem to have come back in.

RFC 2822 section 4:

  Another key difference between the obsolete and the current syntax is
  that the rule in section 3.2.3 regarding lines composed entirely of
  white space in comments and folding white space does not apply.  See
  the discussion of folding white space in section 4.2 below.

Again, you miss my point. That rule (or lack of it in the obs case) is not
enforced by syntax but by verbiage. Allowing the grammar to produce
'FWS FWS' does not of itself cause 3.2.3 to cease to apply.

My point was that 'FWS FWS' means exactly the same as 'FWS' (well, to be
pedantic, it insists on a minimum of 2 WSP). All it does is provide
further opportunity to confuse a parser, which was why you went to the
trouble of not allowing it to occur in the rest of the syntax.
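
To make the 'FWS FWS' point concrete, here is a rough Python sketch. It
assumes the FWS and obs-FWS rules as published in RFC 2822 (transcribed
by hand into a regular expression), not the revised grammar under
discussion, and the helper name fws_fws_splits is made up for the
illustration. It shows that a single WSP cannot be read as 'FWS FWS',
and that a longer run of WSP can be divided between the two FWS in more
than one way, which is the ambiguity a parser has to cope with.

```python
import re

# FWS     = ([*WSP CRLF] 1*WSP) / obs-FWS      (RFC 2822, section 3.2.3)
# obs-FWS = 1*WSP *(CRLF 1*WSP)                (RFC 2822, section 4.2)
# Hand transcription into a regex; an assumption of this sketch.
FWS = re.compile(r"(?:(?:[ \t]*\r\n)?[ \t]+|[ \t]+(?:\r\n[ \t]+)*)")


def fws_fws_splits(s):
    """Return every index i such that s[:i] and s[i:] are each a valid FWS."""
    return [
        i
        for i in range(1, len(s))
        if FWS.fullmatch(s[:i]) and FWS.fullmatch(s[i:])
    ]


if __name__ == "__main__":
    for sample in (" ", "  ", "   "):
        print(repr(sample), "single FWS:", bool(FWS.fullmatch(sample)),
              "FWS FWS splits:", fws_fws_splits(sample))
```

An empty list of splits means the string cannot be parsed as 'FWS FWS'
at all; more than one entry means the same whitespace has several
equally valid parses.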


I still think it's best to simply leave Subject defined as unstructured,
with no comment about "Re:" (since, as you pointed out, that is already
allowed by "unstructured"), just as RFC 822 did.  Anything else inevitably
imposes structure on Subject.  For example if one starts with
  Subject:          foo
what should an implementor do to add "Re:":
  Subject:Re:           foo
  Subject:          Re: foo
  Subject: Re:         foo
etc.? 

What I would like him to do would be
    Subject: Re: foo
but the proper way to address that issue would be to establish some rule
or convention about whether subjects could be refolded or have WSP
collapsed in the course of generating a followup/reply.

But following a strict reading of 3.6.5 in RFC 2822, I would argue that
the only compliant way would be
    Subject:Re:           foo
from which it is evident that every MUA implementation known to me is
non-compliant with RFC 2822 :-( . Another bug for Pete to worry over...
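
The two outcomes above can be compared with a small Python sketch. It
assumes the strict reading argued here, namely that the field body is
everything after the colon and that "Re: " is prepended to that body
untouched. The function names (split_field, strict_reply,
collapsed_reply) are invented for the illustration, and collapsed_reply
models a WSP-collapsing convention that, as noted above, has not
actually been established.

```python
def split_field(line):
    """Split an unfolded header line into (field name, field body).

    The body is taken to be everything after the first colon, including
    any leading white space (an assumption of this sketch).
    """
    name, sep, body = line.partition(":")
    if not sep:
        raise ValueError("not a header field: %r" % line)
    return name, body


def strict_reply(line):
    """Prepend "Re: " to the original body without touching its WSP."""
    name, body = split_field(line)
    return name + ":" + "Re: " + body


def collapsed_reply(line):
    """Hypothetical convention: collapse runs of WSP, then add "Re: "."""
    name, body = split_field(line)
    return name + ": Re: " + " ".join(body.split())


if __name__ == "__main__":
    original = "Subject:" + " " * 10 + "foo"
    print(strict_reply(original))
    print(collapsed_reply(original))
```

The first function keeps every original space and so puts "Re:"
directly against the colon; the second gives the tidy form most MUAs
produce in practice.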

Actually, your revised grammar arranges that any FWS or CFWS that is
encountered is automatically associated with the non-WS-object to its
right. You could equally well have written it so that any FWS or CFWS was
automatically associated with the object to its left. Then the problem
would go away (or, more accurately, would be transferred to the end of an
unstructured where it would do no harm).
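
As a rough illustration of the two association choices, the Python
sketch below splits an unstructured body into words with their white
space attached either to the right-hand or to the left-hand object. It
is a deliberate simplification: it handles only SP and HTAB, ignores
folding (CRLF) and comments, and the function names attach_right and
attach_left are invented here rather than taken from either grammar.

```python
import re


def attach_right(body):
    """Attach each run of WSP to the word on its right.

    Returns (tokens, leftover), where leftover is any trailing WSP that
    has no word to its right.
    """
    tokens = re.findall(r"[ \t]*[^ \t]+", body)
    leftover = body[sum(len(t) for t in tokens):]
    return tokens, leftover


def attach_left(body):
    """Attach each run of WSP to the word on its left.

    Returns (leading, tokens), where leading is the WSP before the first
    word; under left association it belongs to whatever precedes the
    body (here, the colon of the field name).
    """
    tokens = re.findall(r"[^ \t]+[ \t]*", body)
    leading = body[:len(body) - sum(len(t) for t in tokens)]
    return leading, tokens


if __name__ == "__main__":
    body = "   foo bar"
    print(attach_right(body))
    print(attach_left(body))
```

With right association the first token carries the leading white space
with it, so anything placed in front of it ends up hard against the
colon. With left association the first token is the bare word, and the
only unattached white space is at the start (owned by the colon) or
folded into the last word's tail.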

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl(_at_)clerew(_dot_)man(_dot_)ac(_dot_)uk      Snail: 5 Clerewood Ave, 
CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5