ietf-822

Re: Revisiting RFC 2822 grammar (obs-utext and unstructured)

2004-02-04 10:12:34

In <402042BC(_dot_)2050204(_at_)verizon(_dot_)net> Bruce Lilly 
<blilly(_at_)verizon(_dot_)net> writes:

unstructured = *(text [FWS])
  assuming unstructured fields are defined as in my revised grammar, e.g.
  comments = "Comments" ":" [FWS] unstructured CRLF
  (see discussion below)
  optionally one could define
  utext = *(text [FWS])
  and then define unstructured as utext, but what would be the point...

obs-utext either as defined in 2822 or as above, i.e. empty, can start
  or end with obs-char, CR, or LF, but can't have CRLF pair

Yes, but that is getting a long way from what seems to be the established
convention that *text things consist of just a single character (or
perhaps a single character with some naked CR or LF attached).

obs-unstructured = *(obs-utext FWS) [obs-utext]
  i.e. cannot have two adjacent instances of obs-utext strings (must
  have FWS separator), may have multiple adjacent FWS instances (since
  obs-utext may be empty, and in order to comply with the section 4
  normative text regarding parsing of WS-only continuation lines), may
  be empty, may begin or end with any obs-utext string or with FWS,
  any CRLF pair is followed by WS (as part of FWS)

And I don't think we want two adjacent FWS. Your revised grammar went to a
great deal of trouble to avoid adjacent CFWS (or FWS in some cases), and
that was seen as a great improvement. Now they seem to have come back in.
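
To make the adjacency concrete, here is a rough Python sketch. It assumes
only the non-obsolete form FWS = [*WSP CRLF] 1*WSP (obs-FWS is left out) and
a greedy left-to-right scan; the names FWS and adjacent_fws are made up for
the illustration and are not part of anybody's proposed grammar.

```python
import re

# Assumption: simplified FWS = [*WSP CRLF] 1*WSP, ignoring obs-FWS.
FWS = re.compile(r"(?:[ \t]*\r\n)?[ \t]+")

def adjacent_fws(body):
    """Return True if two greedily matched FWS runs touch each other."""
    spans = [m.span() for m in FWS.finditer(body)]
    # Two runs are adjacent when one ends exactly where the next begins.
    return any(a[1] == b[0] for a, b in zip(spans, spans[1:]))
```

Under those assumptions, a body with a WS-only continuation line such as
"a\r\n \r\n b" yields two FWS runs back to back, whereas "a\r\n b" and
"a  b" each yield a single run.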

May I suggest you take another look at the grammar I originally suggested.

Note that an unstructured field body begins with [FWS], explicitly at
least in the non-obs cases.  Therefore, in
  Subject: foo
the field body is " foo", not "foo", and
  Subject: Re: foo
begins with " Re:", not with "Re:", so the wording of section 3.6.5
should be revised (or the field name/field body delimiter formally
redefined to include any [FWS] or [CFWS] (as the case may be) following
the colon). [And A.2 needs to warn about line length limits, including
those in effect when encoded-words are present; "prepending" is
inadequate as a means of implementation.]

Yes, I regard that as a problem. What people usually mean by a "subject"
starts at the first non-blank character, and a subject consisting of just
FWS would normally be described as "empty". An instruction in some
protocol that "the subject of the reply SHOULD be the same as the subject
of the original" should not extend to having exactly the same number of
initial SPs. So that needs thinking about.
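
As a rough illustration of the distinction, the following Python sketch
separates the field body as the grammar sees it from the "subject" as people
usually mean it. It assumes the field has already been separated from its
terminating CRLF and uses the simplified FWS = [*WSP CRLF] 1*WSP (no
obs-FWS); split_field, subject_text, same_subject and is_empty_subject are
hypothetical helper names, not anything defined in 2822.

```python
import re

# Assumption: zero or more leading FWS, each being [*WSP CRLF] 1*WSP.
_LEADING_FWS = re.compile(r"(?:(?:[ \t]*\r\n)?[ \t]+)*")

def split_field(line):
    """Split at the first colon; the body keeps any leading white space."""
    name, _, body = line.partition(":")
    return name, body

def subject_text(body):
    """Drop leading FWS, so the subject starts at the first non-blank."""
    return body[_LEADING_FWS.match(body).end():]

def same_subject(a, b):
    """Compare two field bodies while ignoring their leading FWS."""
    return subject_text(a) == subject_text(b)

def is_empty_subject(body):
    """A body consisting of nothing but FWS counts as empty."""
    return subject_text(body) == ""
```

With these helpers, split_field("Subject: foo") gives the body " foo", while
subject_text(" foo") gives "foo", and two bodies that differ only in the
number of initial SPs compare as the same subject.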

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl(_at_)clerew(_dot_)man(_dot_)ac(_dot_)uk      Snail: 5 Clerewood Ave, 
CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5