ietf-822

Re: Interpretation of RFC 2047

2002-12-18 02:58:08

On Tue, 17 Dec 2002, Bruce Lilly wrote:
> One cannot recognise a comment unless the header field syntax is known.

One can recognise a comment from lexical analysis alone.  This was true
in RFC 822, and should still be true in RFC 2822 unless something went
wrong.

>    Content-Features: (& (Type="text/plain") (charset=US-ASCII) )
>
> contains no comments.

RFC 822 was absolutely clear that it contains a comment.  By my reading
of RFC 2822 section 3.2.3, it still contains a comment.
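
To illustrate recognising a comment by lexical analysis alone, here is a
minimal Python sketch.  It assumes the RFC 822 lexical rules for
structured fields: parentheses nest, a backslash quotes the next
character, and parentheses inside a quoted-string do not start a
comment.  The name find_comments is hypothetical, and domain-literals
and header folding are ignored for brevity.

```python
def find_comments(value):
    """Return (start, end) spans of top-level comments in a field body.

    Purely lexical: no knowledge of the particular header field is used.
    Unterminated comments are not reported.
    """
    spans = []
    depth = 0          # current nesting depth of parentheses
    start = None       # index where the current top-level comment began
    in_quotes = False  # inside a quoted-string (only tracked at depth 0)
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and (depth > 0 or in_quotes):
            i += 2  # quoted-pair: skip the backslash and the escaped char
            continue
        if depth == 0 and ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch == "(":
                if depth == 0:
                    start = i
                depth += 1
            elif ch == ")" and depth > 0:
                depth -= 1
                if depth == 0:
                    spans.append((start, i + 1))
        i += 1
    return spans


value = '(& (Type="text/plain") (charset=US-ASCII) )'
# Compare the reported spans with a single span covering the whole value.
print(find_comments(value) == [(0, len(value))])
```

Under these assumptions the scanner treats the entire Content-Features
value as one comment with two nested comments inside it, which is the
reading argued for here.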

RFC 2912 suggests that the above Content-Features header field contains
no comments.  But RFC 2912 was published before RFC 2822, so cannot use
any sophistry about RFC 2822 perhaps having unintentionally changed the
definition of a comment.  Instead, RFC 2912 claims to depend on RFC 822,
where the definition of a comment is absolutely clear, so RFC 2912 would
have had no excuse at all for trying to modify it.

RFC 822 and 2822 did not deliberately leave open the possibility for
future header fields to redefine the comment syntax.  RFC 2912 does not
even discuss the fact that it attempts to redefine the comment syntax.
This is a fatal flaw in RFC 2912, and it's somewhat surprising that it
was not noticed before.

>    Foobar: (& (Type="text/plain") (charset=US-ASCII) )
>
> might or might not contain comments depending on the definition of
> the Foobar header field.  I submit that
>
>    Foobar: file:(=?us-ascii?q?=3D?=)
>
> does not contain a comment.  It does have matched parentheses.  It
> does not contain an RFC 2047 encoded-word and does not encode any
> 8-bit characters.  It does contain a syntactically valid absolute URI.

I submit that the RFC 2822 section 3.2.3 definition of a comment was
intended to apply to all header fields, including those defined in RFC
2822's future; that it was a mistake for RFC 2912's "Content-Features:"
header field to contain stuff that looks like a comment but is not
intended as a comment; and that it would be a mistake for the definition
of the Foobar: header field to try to say that the above example does
not contain a comment.

> Other examples could be given, but the above show that it is necessary
> to fully parse header field content in order to determine whether
> or not there is an encoded-word; use of regular expressions (or the
> equivalent) is inadequate.

I agree on this point.  However, lexical analysis plus some guessing
will often be good enough.
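
A rough Python sketch of "lexical analysis plus some guessing" follows.
It assumes that any parenthesised run is a comment and that anything
inside one matching the RFC 2047 encoded-word shape really is an
encoded-word; whether a given field's grammar agrees is exactly the
point in dispute.  The names comment_texts and guess_encoded_words are
hypothetical, quoted-strings are ignored, and decoding relies on the
standard library's email.header.decode_header.

```python
import re
from email.header import decode_header

# RFC 2047 encoded-word shape: =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=")


def comment_texts(value):
    """Yield the inner text of each top-level parenthesised comment."""
    depth, start, i = 0, 0, 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and depth > 0:
            i += 2  # skip a quoted-pair inside a comment
            continue
        if ch == "(":
            if depth == 0:
                start = i + 1
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                yield value[start:i]
        i += 1


def guess_encoded_words(value):
    """Decode anything inside a comment that looks like an encoded-word."""
    found = []
    for text in comment_texts(value):
        for match in ENCODED_WORD.finditer(text):
            raw, charset = decode_header(match.group(0))[0]
            found.append(raw.decode(charset))
    return found


print(guess_encoded_words("file:(=?us-ascii?q?=3D?=)"))
```

On the Foobar example this guesses that the parenthesised text is a
comment holding one encoded-word.  A field definition that makes the
whole value a URI would make that guess wrong, which is why this is
only "often good enough".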

--apb (Alan Barrett)