ietf-822

Re: RFC 2047 and gatewaying

2003-01-03 10:15:20

In <01KQR2TXIGII009OMN(_at_)mauve(_dot_)mrochek(_dot_)com> 
ned+ietf-822(_at_)mrochek(_dot_)com writes:

Clearly, strict RFC [2]822 compliance is not a requirement, and use for
News articles is explicitly encouraged (which is why Usefor makes it the
official way to encapsulate News).

Strict conformance is not a requirement, but restriction of the headers
to US-ASCII is most certainly a requirement. See the third paragraph
of the section.

It is a requirement for a "message/rfc822" that is to be conveyed as an
email in accordance with RFC 204[56]. No quarrel with that, and the
wording I proposed will bring it about.

Time to stand back and give a broader picture of what Usefor thinks it is
trying to do.

1. Netnews is to be regarded fundamentally as an 8-bit clean medium. It is
a _different_ medium than Email, although there are strong links that must
be preserved.

2. But in fact, if you look at the headers we have defined so far, you
will see that the only places where 8-bit freedom is allowed are comments,
phrases, parameters and newsgroup-names. That may change as new headers
are defined in future extensions, but it is where we are now.

3. There is a choice, for comments, phrases and parameters. Either they
may be presented as raw UTF-8 _or_ they may be encoded using RFC
2047/2231. Conformant reading agents MUST support both. The document takes
a neutral position as to which should be used (ultimately market forces
will decide). It does point out that some existing software will break
whichever is used.

4. The canonical form of newsgroup-names MUST be UTF-8. A special encoding
is provided for transient use during moderation and for other specific
purposes.

5. Insofar as these choices conflict with practice in the Email world,
there was a consciously taken decision to compress any resultant messiness
into the gateways. The number of these is small compared to the vast
number of Usenet servers worldwide and the even vaster number of clients.

6. MIME is a set of protocols defined for the Email world, as is
explicitly stated in RFC 2045. Officially speaking, there is no such thing
as MIME within Netnews (which accounts for its extremely poor uptake
within Usenet).

7. Usefor rectifies this by defining the usage of the MIME protocols
within Netnews.

8. Insofar as MIME introduces the use of headers within the bodies of
articles/messages, it is only to be expected that within Netnews the
general conventions regarding Netnews headers should apply to them. Again,
any resulting messiness is to be confined to the gateways.
 
9. In fact, there are only two places where MIME in Netnews will differ
from MIME in Email. One is in allowing full UTF-8 within quoted-strings
within parameters in the Content-* headers (and probably in comments
within those headers too). The other is in "message/rfc822". Both these
cases therefore require attention by gateways.

10. RFC 1036 is now so far behind the times that it provides but a poor
description of the Usenet of today. Considerable developments in practice
have taken place since that time, all without the backup of formal
standards. Some of these developments have worked well. Others have been
haphazard and are going to be hard to undo. This is why the WG sought, and
was given in its charter, permission to include extensions to the existing
protocols.

11. Insofar as our extensions introduce new and desirable functionality
into Usenet, it is accepted that the benefits will be seen only as upgraded
user agents come into use. Those users who care for the new functionality
will take the trouble to upgrade. Those who do not will struggle on as they
are. Nevertheless, we have taken trouble to ensure that the present
transport backbone of Usenet will work as it stands (though it will work
better with a little tweaking).
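
To make points 3 and 5 concrete, here is a minimal Python sketch. It is an
illustration only and is not taken from the Usefor draft. The function names
read_header_value and to_email_header are hypothetical. The first shows a
reading agent accepting either raw UTF-8 or RFC 2047 encoded-words; the
second shows a gateway turning raw UTF-8 into an RFC 2047 form before handing
the text to Email. It assumes the text is an unstructured value or a single
phrase; a real gateway would have to parse structured headers and encode
only the places where encoded-words are permitted.

```python
import base64
import quopri
import re
from email.header import Header

# One RFC 2047 encoded-word: =?charset?encoding?payload?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")
# Whitespace separating two adjacent encoded-words is not displayed.
BETWEEN_WORDS = re.compile(r"(?<=\?=)\s+(?==\?[^?\s]+\?[bBqQ]\?)")


def _decode_word(match: re.Match) -> str:
    charset, encoding, payload = match.groups()
    charset = charset.split("*", 1)[0]  # drop an RFC 2231 language suffix
    try:
        if encoding.upper() == "B":
            octets = base64.b64decode(payload)
        else:
            octets = quopri.decodestring(payload.encode("ascii"), header=True)
        return octets.decode(charset)
    except (ValueError, LookupError):
        return match.group(0)  # leave an undecodable word as it was


def read_header_value(raw: bytes) -> str:
    """Reading agent (hypothetical): accept raw UTF-8, encoded-words, or both."""
    text = raw.decode("utf-8", errors="replace")  # ASCII is a subset of UTF-8
    text = BETWEEN_WORDS.sub("", text)
    return ENCODED_WORD.sub(_decode_word, text)


def to_email_header(text: str) -> str:
    """Gateway (hypothetical): make the text 7-bit before passing it to Email."""
    if text.isascii():
        return text  # nothing to do for plain US-ASCII
    return Header(text, "utf-8").encode()  # RFC 2047 encoded-word(s)
```

Under these assumptions, read_header_value gives the same string for the raw
UTF-8 octets of a phrase and for its encoded-word form, and the output of
to_email_header is pure US-ASCII that read_header_value turns back into the
original text.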



Second, to the extent that the article format is extended to allow utf-8 in
headers, you have to explain how the many gateways of the world between netnews
and email, gateways which are good-faith implementations of various RFCs, are
going to be able to accommodate the changes you have made and not cause damage
to the infrastructure when they are presented with something they do not
expect. If you cannot come up with a satisfactory explanation of how this
damage will be minimized, past history indicates that you will not be allowed to
make such changes in an RFC.

It is already the case that 2.5% [1] of Usenet articles use raw Non-ASCII
characters in their headers. I have not heard of any damage or chaos
arising from gateways that cannot cope. The sky has not fallen in. No
doubt some very odd-looking headers have emerged from the email side of
those gateways and, by and large, users have ignored them and Just Hit
DELETE.

The Bad News, of course, is that these headers are mostly not in UTF-8.
They are in every character set imaginable (and often with the explicit
approval of the administrators of their hierarchies). So such articles
will not be compliant with Usefor, but Usefor has had to take note of the
problem and to suggest a workaround.
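
As an aside, the distinction at issue can at least be detected mechanically.
The sketch below is NOT the workaround that Usefor suggests; it is only an
illustration, with a hypothetical function name, of how an agent or gateway
might sort header octets into US-ASCII, valid UTF-8, and "something else".

```python
def classify_header_octets(raw: bytes) -> str:
    """Hypothetical helper: classify the raw octets of one header line."""
    if raw.isascii():
        return "ascii"
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        # 8-bit octets that are not valid UTF-8: some other character set.
        return "other-8bit"
    return "utf-8"
```

A header that lands in the last category cannot be told apart from its
neighbours without outside knowledge of the character set, and a short
string in another character set can occasionally pass as valid UTF-8, so
this check is only a heuristic.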

And that problem is not limited to Netnews. 30% of the Email that I
receive has raw Non-ASCII in its headers. I have to do a lot of Hitting
DELETE :-( .

[1] That 2.5% figure is from memory. If you want the exact figure (the
count has indeed been done) I will have to delve back into the Usefor
archives. I suspect it was actually higher than 2.5%.


The question you need to be asking is whether or not coming up with something
that is extremely unlikely ever to be approved as-is, and which will very
likely have to be completely redesigned, is a useful way for the WG to
spend its time.

I do not expect getting Usefor past the IETF to be an easy task. But
then getting Usefor to make up its collective mind was not an easy task
either :-( .

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl(_at_)clw(_dot_)cs(_dot_)man(_dot_)ac(_dot_)uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5
