ietf-822

Re: Interpretation of RFC 2047

2002-12-18 10:30:15


On Wed, 18 Dec 2002, Keith Moore wrote:
One can recognise a comment from lexical analysis alone.

comments are only valid in structured fields.  so in order to
recognize a comment you have to know the set of structured fields.

Yes, that's true (both in RFC 822 section 3.1.3 and RFC 2822 section
2.2.1).
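
To make that dependency concrete, here is a minimal Python sketch. The
names STRUCTURED_FIELDS, strip_comments and field_body are hypothetical,
and the field set is deliberately incomplete; it only illustrates that
comment removal has to be gated on knowing which fields are structured.

```python
# Hypothetical, deliberately incomplete set of fields known to allow comments.
STRUCTURED_FIELDS = {"from", "to", "cc", "date", "received"}


def strip_comments(value: str) -> str:
    """Remove RFC 822 style comments, honouring nesting, quoted-pairs
    and quoted-strings."""
    out = []
    depth = 0          # current comment nesting level
    in_quotes = False  # inside a quoted-string (only entered outside comments)
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            # quoted-pair: the next character is literal, never a delimiter
            if depth == 0:
                out.append(value[i:i + 2])
            i += 2
            continue
        if in_quotes:
            out.append(ch)
            if ch == '"':
                in_quotes = False
        elif depth > 0:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
        elif ch == "(":
            depth = 1
        else:
            if ch == '"':
                in_quotes = True
            out.append(ch)
        i += 1
    return "".join(out)


def field_body(name: str, value: str) -> str:
    """Strip comments only when the field is known to be structured."""
    if name.lower() in STRUCTURED_FIELDS:
        return strip_comments(value)
    return value
```

With this sketch, field_body("Subject", "Meeting (rescheduled)") is
returned untouched, while the same parenthesised text in a To field
would be dropped as a comment.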

Does anybody claim that RFC 2912 "Content-Encoding" is an
unstructured field?

I think you meant content-features, not content-encoding. In any case, I
suppose one of the ways out of this is indeed to call this an unstructured
field as far as RFC 2822 is concerned; it certainly has been done before.

and it is (perhaps unfortunately) the case that some fields have
a syntax that uses parentheses as other than comment delimiters.
if I'm not mistaken this has been the case ever since rfc 987,
which used constructs like (a) in order to encode things like @ in
PrintableString fields.

By my reading of RFC 987, if a PrintableString is used in a context
where something like unquoted "(a)" could be misinterpreted as a
comment, then the entire PrintableString must be further encoded in an
RFC 822 quoted-string.  See the second paragraph on page 58 of RFC 987,
where it says "word may be encoded as 822.atom (which has a restricted
character set) or as 822.quoted-string, which can handle all ASCII
characters."

Right idea, wrong specific item and wrong RFC. RFC 987 is obsolete; the current
version is RFC 2156. And the case in RFC 2156 where parentheses are used
isn't the encoding of PrintableStrings; it is the encoding of object
identifiers, defined in section 3.3.7:

    joint-iso-ccitt(2) mhs (6) ipms (1) ep (11) ia5-text (0) 

RFC 2156 attempts to finesse the conflict with RFC 822 comments by claiming this
is only used in unstructured fields as far as RFC 822 is concerned. This always
struck me as a bit of a stretch, however; several of the fields, such as
original-encoded-information-types, clearly contain structured information.
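
The conflict is easy to demonstrate. The Python sketch below applies a
naive comment stripper (a hypothetical helper, not taken from any RFC or
library) to the object identifier shown above. If the field were lexed as
an RFC 822 structured field, every parenthesised arc number would be
discarded as a comment.

```python
def strip_comments(value: str) -> str:
    """Naive stripper: drop parenthesised text, tracking nesting depth only."""
    out = []
    depth = 0
    for ch in value:
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
    return "".join(out)


oid = "joint-iso-ccitt(2) mhs (6) ipms (1) ep (11) ia5-text (0)"
# Collapse the whitespace left behind so the loss is easy to see.
stripped = " ".join(strip_comments(oid).split())
print(stripped)  # joint-iso-ccitt mhs ipms ep ia5-text
```

Only the symbolic names survive; the numeric arcs that actually identify
the object are gone.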

I submit that the RFC 2822 section 3.2.3 definition of a comment was
intended to apply to all header fields

I don't think so - that would break too many things already in
existence.

OK, all structured header fields.  The entire lexical analyser described
in RFC 822 section 3.1.4, and RFC 2822 section 3.2 (read in conjunction
with section 2.2.2), seems to be intended to apply to all structured
header fields.  I have always assumed that this included any structured
fields that might be defined in the future.

Apart from RFC 2912 Content-Encoding, what else violates this assumption?

I don't know of any offhand, but given that there are already several examples,
it seems wise to assume there might be more. I therefore question the
appropriateness of making this assumption and certainly do not make it
in the code I write.

                                Ned