Re: Compatibility with Netnews

2007-04-29 16:11:07

Charles Lindsey wrote:

 [canonical Message-ID]
The essential feature required for Netnews was that, in order to
compare two <msg-id>s for equality, a simple octet-by-octet
comparison would always suffice. Such comparisons are at the very
heart of the Netnews protocol, though they hardly arise at all in
email except for the purpose of threading articles in MUAs using
the <msg-id>s that occur in References header fields. Even there,
I suspect, many threaders have simply implemented an octet-by-octet

+1.  And it's not only about "comparing" Message-IDs; it's also about
using something that's allowed in NNTP, i.e. no raw control characters
(= NO-WS-CTL) and no ">", even within <no-fold-quote> (quoted LHS) or
<no-fold-literal> (domain literal as RHS).
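
As a rough illustration of these two constraints, here is a minimal
Python sketch.  The function names are made up for this example, and the
check covers only the two restrictions mentioned above (no NO-WS-CTL and
no ">" before the closing bracket); it is not a full <msg-id> parser.

```python
# NO-WS-CTL as defined in RFC 2822: %d1-8 / %d11 / %d12 / %d14-31 / %d127
NO_WS_CTL = set(range(1, 9)) | {11, 12} | set(range(14, 32)) | {127}


def is_nntp_friendly(msg_id: bytes) -> bool:
    """Hypothetical helper: reject control characters and an inner '>'."""
    if len(msg_id) < 3:
        return False
    if not (msg_id.startswith(b"<") and msg_id.endswith(b">")):
        return False
    inner = msg_id[1:-1]
    if b">" in inner:
        # Not acceptable even inside a quoted LHS or a domain literal.
        return False
    return not any(octet in NO_WS_CTL for octet in inner)


def same_message(a: bytes, b: bytes) -> bool:
    """With a canonical form, equality is a plain octet-by-octet comparison."""
    return a == b
```

With such a form, <abc@example.org> passes, while an ID with a quoted
LHS like <"a>b"@example.org> is rejected because of the inner ">".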

Other applications are Message-IDs in mailing list archives used as
Archived-At URIs, or Message-IDs in NetNews used in news: URIs.  It's
possible to percent-encode any NO-WS-CTL, but it would be a pain.

Percent-encoding is always a kludge and harms interoperability as
soon as applications try to percent-encode something that is already
partially percent-encoded for local purposes, and then let another
application try to undo this.  The PURL resolver is an example of how
that can all too easily get out of hand.
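
A small Python sketch of how this goes wrong, using the standard
urllib.parse module.  The Message-ID content is invented for the
example: it contains one raw control character and a literal "%41".

```python
from urllib.parse import quote, unquote

# Hypothetical Message-ID content with a raw control character (0x01)
# and a literal "%41" that is part of the ID itself.
raw = "a\x01b%41@example.org"

once = quote(raw, safe="@")    # first application encodes it
twice = quote(once, safe="@")  # second application encodes it again

# Decoding exactly as often as it was encoded restores the ID.
round_trip = unquote(once)

# Decoding once too few or once too often yields a different ID.
under_decoded = unquote(twice)
over_decoded = unquote(unquote(once))
```

Here `once` is "a%01b%2541@example.org".  Neither application can tell
from the string alone how many layers are present, so `under_decoded`
and `over_decoded` both differ from `raw`.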

 [magic SP]
The other principal difference is that, in each header field, an
SP is obligatory after the ":". Since this is already the universal
practice within email, I would suggest that there should be no
problem in REQUIRING the same in the case of email, and to banish
the possibility of omitting that SP to the obs-syntax.

I'd support a mandatory FWS after the colon.  But I can't support a
"magic SP" in mail as well; IMO any significant "trailing white space"
is harmful.  Let's take an example header field:

   foo = "foo:" [CFWS] bar [CFWS] CRLF

Adding a magic SP results in:

   foo = "foo:" SP [CFWS] bar [CFWS] CRLF

Now the following folding would be syntactically wrong:

| foo:<CRLF>
| <WSP>bar<CRLF>

The NetNews RFC guarantees that it's never allowed to fold a header
field before the first non-blank character in the header field body,
here <bar>.  But 2822 doesn't guarantee this; it only says that a
folded line containing _only_ trailing white space isn't allowed.

And it's okay that 2822 doesn't guarantee this; sometimes it's very
convenient to fold a header field directly at the colon.  Extremely
long 2047-encoded words are an example: it would be odd to have a
special rule for the first line of a folded header field depending
on the length of the field name.

So I'd propose to add a mandatory FWS to unstructured fields and the
<date-time>, i.e. all header field bodies not starting with [CFWS].

All other header field bodies start with [CFWS], and we could make
it mandatory, for the example above:

   foo = "foo:" CFWS bar [CFWS] CRLF

A magic SP matches CFWS, so you can do what you claim everybody does
already.  An initial folding also matches CFWS; therefore I can
fold header fields directly after the colon.  Admittedly this
still allows silly constructs:

| foo:(ugh)bar<CRLF>

OTOH it's no new ugliness, 2822 already allows this.  In essence
the only ugliness allowed by 2822 which would be "eliminated" by
a mandatory CFWS is this:

| foo:bar<CRLF>
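
To make the comparison concrete, here is a rough Python sketch of the
three variants of <foo> as regular expressions.  It is only an
approximation under stated assumptions: comments are simplified (no
nesting, no quoted-pair), <bar> is taken to be the literal string "bar",
and the names FWS, COMMENT, CFWS, GRAMMARS and accepts() are chosen for
this example only.

```python
import re

# FWS = ([*WSP CRLF] 1*WSP), as in RFC 2822
FWS = r"(?:[ \t]*\r\n)?[ \t]+"
# Simplified comment: no nested comments, no quoted-pair (assumption)
COMMENT = r"\([^()\\]*\)"
# CFWS = *([FWS] comment) (([FWS] comment) / FWS)
CFWS = rf"(?:(?:{FWS})?{COMMENT})*(?:(?:{FWS})?{COMMENT}|{FWS})"

GRAMMARS = {
    "optional CFWS": rf"foo:(?:{CFWS})?bar(?:{CFWS})?\r\n",
    "magic SP": rf"foo: (?:{CFWS})?bar(?:{CFWS})?\r\n",
    "mandatory CFWS": rf"foo:(?:{CFWS})bar(?:{CFWS})?\r\n",
}


def accepts(field: str) -> dict:
    """Report which grammar variant matches the complete header field."""
    return {
        name: re.fullmatch(pattern, field) is not None
        for name, pattern in GRAMMARS.items()
    }


samples = [
    "foo: bar\r\n",      # the usual practice
    "foo:\r\n bar\r\n",  # folded directly after the colon
    "foo:(ugh)bar\r\n",  # comment directly after the colon
    "foo:bar\r\n",       # nothing at all after the colon
]
for sample in samples:
    print(repr(sample), accepts(sample))
```

In this sketch the magic SP variant accepts only the first sample, the
mandatory CFWS variant rejects only the last one, and the optional
variant accepts all four.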

Life would be much simpler if Email could now be brought into
line with Netnews in these regards (the present practices would
just move to the obs-syntax, of course).

USEFOR had compelling reasons why it didn't copy all 2822 ideas.
I think the opposite is also true.