ietf-822
Re: RFC 2047 and gatewaying

2003-01-03 11:15:39


In <01KQR2TXIGII009OMN(_at_)mauve(_dot_)mrochek(_dot_)com> 
ned+ietf-822(_at_)mrochek(_dot_)com writes:

Clearly, strict RFC [2]822 compliance is not a requirement, and use for
News articles is explicitly encouraged (which is why Usefor makes it the
official way to encapsulate News).

Strict conformance is not a requirement, but restriction of the headers
to US-ASCII is most certainly a requirement. See the third paragraph
of the section.

It is a requirement for a "message/rfc822" that is to be conveyed as an
email in accordance with RFC 204[56]. No quarrel with that, and the
wording I proposed will bring it about.

More sophistry. The type is clearly defined and the rules for its content
exist independent of the transport being used to carry it around.

Time to stand back and give a broader picture of what Usefor thinks it is
trying to do.

1. Netnews is to be regarded fundamentally as an 8-bit clean medium. It is
a _different_ medium than Email, although there are strong links that must
be preserved.

2. But in fact, if you look at the headers we have defined so far, you
will see that the only places 8bit freedom is allowed are in comments,
phrases, parameters and newsgroup-names. That may not be so as new headers
are defined in future extensions, but it is where we are at now.

3. There is a choice, for comments, phrases and parameters. Either they
may be presented as raw UTF-8 _or_ they may be encoded using RFC
2047/2231. Conformant reading agents MUST support both. The document takes
a neutral position as to which should be used (ultimately market forces
will decide). It does point out that some existing software will break
whichever is used.
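
To illustrate point 3, here is a minimal Python sketch of a reading agent
that accepts either form. It is not taken from the Usefor draft: the function
name read_header_value is hypothetical, and turning undecodable octets into
replacement characters is an assumption of this sketch.

```python
from email.header import decode_header


def read_header_value(raw: bytes) -> str:
    """Return readable text for a header value in either permitted form."""
    # Assumption: octets outside ASCII are meant to be UTF-8; anything that
    # does not decode becomes U+FFFD instead of raising.
    text = raw.decode("utf-8", errors="replace")
    pieces = []
    for chunk, charset in decode_header(text):
        if isinstance(chunk, str):
            # No encoded-words were found: the value is already plain text.
            pieces.append(chunk)
            continue
        try:
            # Runs without a charset are the unencoded text between
            # encoded-words, which decode_header hands back as bytes.
            pieces.append(
                chunk.decode(charset or "raw-unicode-escape", errors="replace")
            )
        except LookupError:
            # Unknown charset label: keep whatever can still be shown.
            pieces.append(chunk.decode("ascii", errors="replace"))
    return "".join(pieces)
```

The sketch covers unstructured text and phrases only; RFC 2231 parameter
values would need separate handling.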

This is going to be a tough sell in and of itself. People are going to ask why
you didn't pick a single scheme and stick to it.

4. The canonical form of newsgroup-names MUST be UTF-8. A special encoding
is provided for transient use during moderation and for other specific
purposes.

5. Insofar as these choices conflict with practice in the Email world,
there was a consciously taken decision to compress any resultant messiness
into the gateways. The number of these is small compared to the vast
number of Usenet servers worldwide and the even vaster number of clients.

Yawn. Heard all these arguments before back with MIME. This doesn't
address the backwards compatibility issue, and like it or not this is
an issue you are going to have to address.

6. MIME is a set of protocols defined for the Email world, as is
explicitly stated in RFC 2045. Officially speaking, there is no such thing
as MIME within Netnews (which accounts for its extremely poor uptake
within Usenet).

Wrong on several counts. First, MIME is a format in addition to being a
protocol. This has serious implications in regards to how tightly you can draw
boundaries around a particular use of MIME in a specific application transport.

Second, while MIME was originally defined and targeted at email, it lost that
focus long ago -- certainly before RFCs 2045-2049 came out. The current MIME
documents try to make it clear what parts are email-specific and what parts are
more generally applicable. And media types are one of the things that are more
generally applicable than just to email.

7. Usefor rectifies this by defining the usage of the MIME protocols
within Netnews.

And that's a perfectly legitimate thing to do. But you cannot break the
definition of an existing media type as part of this process. Pick another media
type, for heaven's sake!

And to the extent that netnews is not a world unto itself, you have to take the
effects of the sort of MIME you select on other things into account.

8. Insofar as MIME introduces the use of headers within the bodies of
articles/messages, it is only to be expected that within Netnews the
general conventions regarding Netnews headers should apply to them. Again,
any resulting messiness is to be confined to the gateways.
 
And that's fine. Just don't reuse an existing media type in the process.

9. In fact, there are only two places where MIME in Netnews will differ
from MIME in Email. One is in allowing full UTF-8 within quoted-strings
within parameters in the Content-* headers (and probably in comments
within those headers too). The other is in "message/rfc822". Both these
cases therefore require attention by gateways.
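
As a rough sketch of the first of those two cases, a News-to-Email gateway
could rewrite raw UTF-8 into the forms Email already defines: RFC 2047
encoded-words for phrases and comments, and RFC 2231 for parameter values.
The function names phrase_for_email and parameter_for_email are hypothetical,
and the "message/rfc822" case is not attempted here.

```python
from email.header import Header
from email.utils import encode_rfc2231


def phrase_for_email(value: str) -> str:
    """Encode a raw UTF-8 phrase or comment as RFC 2047 encoded-words."""
    if value.isascii():
        return value  # already legal in Email, nothing to do
    return Header(value, charset="utf-8").encode()


def parameter_for_email(name: str, value: str) -> str:
    """Render a Content-* parameter, using RFC 2231 for non-ASCII values."""
    if value.isascii():
        # Assumption: the value holds no quotes or backslashes needing escape.
        return f'{name}="{value}"'
    return f"{name}*={encode_rfc2231(value, 'utf-8')}"
```

For example, parameter_for_email("name", "café") yields
name*=utf-8''caf%C3%A9, which an RFC 2231-aware mail reader can decode.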

10. RFC 1036 is now so far behind the times that it provides but a poor
description of the Usenet of today. Considerable developments in practice
have taken place since that time, all without the backup of formal
standards. Some of these developments have worked well. Others have been
haphazard and are going to be hard to undo. This is why the WG sought, and
was given in its charter, permission to include extensions to the existing
protocols.

I would refrain from arguing on the basis of what your charter currently says
if I were you. The position in the IETF of a WG whose current charter includes
no goals and milestones can only be described as "precarious". At a minimum a
rechartering exercise could be in order, and given recent experience with
charters and the IESG, the charter that emerged would very likely constrain
you in ways you really would not like.

11. Insofar as our extensions introduce new and desirable functionality
into Usenet, it is accepted that the benefits will be seen only as upgraded
user agents come into use. Those users who care for the new functionality
will take the trouble to upgrade. Those who do not will struggle on as they
are. Nevertheless, we have taken trouble to ensure that the present
transport backbone of Usenet will work as it stands (though it will work
better with a little tweaking).

Second, to the extent that the article format is extended to allow utf-8 in
headers, you have to explain how the many gateways of the world between
netnews and email, gateways which are good faith implementations of various
RFCs, are going to be able to accommodate the changes you have made and not
cause damage to the infrastructure when they are presented with something they
do not expect. If you cannot come up with a satisfactory explanation of how
this damage will be minimized, past history indicates that you will not be
allowed to make such changes in an RFC.

It is already the case that 2.5% [1] of Usenet articles use raw Non-ASCII
characters in their headers. I have not heard of any damage or chaos
arising from gateways that cannot cope. The sky has not fallen in. No
doubt some very odd-looking headers have emerged from the email side of
those gateways and, by and large, users have ignored them and Just Hit
DELETE.

Strawman argument since nobody is saying the sky will fall in. This was a
fairly serious issue at the time MIME was developed but I see no evidence it is
one any longer. The existence of vast amounts of illegally formatted mail has
more or less ensured that everything out there can cope with 8bit material in
every possible part of a message. Of course coping doesn't imply that things
work correctly; clearly they frequently do not.

The issue is instead that your approach in effect declares a large body of
software as being no longer compliant with long established standards. This
is something we try very hard not to do.

The Bad News, of course, is that these headers are mostly not in UTF-8.
They are in every character set imaginable (and often with the explicit
approval of the administrators of their hierarchies). So such articles
will not be compliant with Usefor, but Usefor has had to take note of the
problem and to suggest a workaround.
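
The distinction drawn here (pure ASCII, raw UTF-8, or raw octets in some other
character set) can be approximated mechanically. The following is only a
heuristic sketch with a hypothetical function name, classify_header; it is not
the workaround Usefor suggests, and some legacy-charset byte sequences happen
to be valid UTF-8 and would be misclassified.

```python
def classify_header(raw: bytes) -> str:
    """Classify the octets of one header line by a simple decode test."""
    try:
        raw.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        # Heuristic: well-formed UTF-8 is assumed to be intended as UTF-8.
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "other-8bit"
```

Running this over the headers of a sample of articles is one way a count like
the one mentioned above could be reproduced.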

And that problem is not limited to Netnews. 30% of the Email that I
receive has raw Non-ASCII in its headers. I have to do a lot of Hitting
DELETE :-( .

It is a useful criterion for detecting spam, isn't it? That alone should
give you pause...

[1] That 2.5% figure is from memory. If you want the exact figure (the
count has indeed been done) I will have to delve back into the Usefor
archives. I suspect it was actually higher than 2.5%.

The argument of "but everyone breaks the rules so we can too" has been made
countless times. I cannot recall a single case, however, where such an argument
made it past the IESG.

The question you need to be asking is whether coming up with something that
is extremely unlikely ever to be approved as-is, and which will very likely
have to be completely redesigned, is a useful way for the WG to spend its
time.

I am not expecting getting Usefor past the IETF to be an easy task. But
then getting Usefor to make up its collective mind was not an easy task
either :-( .

Well, you're clearly set on this course and it seems unlikely that anything I
say is going to change it. So this will be my final response on this general set
of issues.

                                Ned