ietf-822

Re: Interpretation of RFC 2047

2002-12-18 10:23:09

Alan Barrett wrote:
On Tue, 17 Dec 2002, Bruce Lilly wrote:

One cannot recognise a comment unless the header field syntax is known.


One can recognise a comment from lexical analysis alone.  This was true
in RFC 822, and should still be true in RFC 2822 unless something went
wrong.
[...]
Other examples could be given, but the above show that it is necessary
to fully parse header field content in order to determine whether
or not there is an encoded-word; use of regular expressions (or the
equivalent) is inadequate.
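
To make the point above concrete, here is a minimal Python sketch. It is not one of the examples elided from the quoted message. The pattern `ENCODED_WORD`, the helper `regex_finds_encoded_word`, and both sample strings are hypothetical illustrations. The pattern is a simplified approximation of the RFC 2047 encoded-word shape, and it matches equally well inside a quoted-string, where RFC 2047 says an encoded-word must not appear.

```python
import re

# Simplified (assumed) approximation of the RFC 2047 shape
# =?charset?encoding?encoded-text?= ; not the full grammar.
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[BbQq]\?[^?\s]*\?=")


def regex_finds_encoded_word(field_body: str) -> bool:
    """Report whether the pattern matches anywhere in the field body."""
    return ENCODED_WORD.search(field_body) is not None


# Hypothetical field bodies, made up for illustration.
unstructured = "=?ISO-8859-1?Q?caf=E9?="
in_quoted_string = '"=?ISO-8859-1?Q?caf=E9?=" <user@example.com>'

# The pattern cannot tell the two contexts apart: it matches both,
# although only the first is a position where an encoded-word may occur.
print(regex_finds_encoded_word(unstructured))
print(regex_finds_encoded_word(in_quoted_string))
```

Telling the two cases apart requires knowing that the second string is an address field body and that the match lies inside a quoted-string, which is information the pattern alone does not have.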


I agree on this point.  However, lexical analysis plus some guessing
will often be good enough.

Lexical analysis is equivalent to using regular expressions (and hence
insufficiently powerful) -- indeed many lexical analyzers are built by
constructing a finite automaton from a set of regular expressions.
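
One way to see the limitation is with nested comments, which RFC 822 permits. This is a minimal Python sketch and not something taken from the thread: the names `FLAT_COMMENT` and `comment_end` and the sample string are hypothetical. A pattern with no memory finds only an innermost comment, while a scanner that keeps a depth counter (state that a pure finite automaton lacks for unbounded nesting) finds the end of the outer one.

```python
import re

# A regular expression without recursion can only describe a comment
# that contains no further parentheses.
FLAT_COMMENT = re.compile(r"\([^()]*\)")


def comment_end(text: str, start: int) -> int:
    """Return the index just past the comment opening at text[start].

    Returns -1 if text[start] is not "(" or the comment is unterminated.
    Nesting is tracked with a counter; a backslash quotes the next character.
    """
    if start >= len(text) or text[start] != "(":
        return -1
    depth = 0
    i = start
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2  # skip the quoted-pair
            continue
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return i + 1
        i += 1
    return -1


sample = "(outer (inner) tail) rest"  # hypothetical input

print(FLAT_COMMENT.search(sample).group())  # only the innermost comment
print(sample[:comment_end(sample, 0)])      # the whole outer comment
```

Practical lexer generators often bolt on counters or start states for cases like this, so the sketch illustrates the theoretical limit of regular expressions rather than what any particular tool can do.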