ietf-822

Re: RFC 2047 and gatewaying

2003-01-06 20:14:43

In <138AA78F80DCE84B8EE424399FFBF9C904F9CB(_at_)exchange(_dot_)ad(_dot_)skymv(_dot_)com> "Dan Kohn" <dan(_at_)dankohn(_dot_)com> writes:

Ned said:

Yawn. Heard all these arguments before back with MIME. This doesn't
address the backwards compatibility issue, and like it or not this is
an issue you are going to have to address.

This is perhaps Ned's most important statement.  At countless times in
the last 20 years, the IETF and its predecessors have chosen backward
compatibility over the "efficient" solution.  The (IESG-approved) IDNA
drafts epitomize this approach, by promulgating an "ugly-looking"
transfer encoding (punycode) rather than the "elegant" solution of
UTF-8.  The economic calculation was that clients that care about i18n
can implement IDNA, but that the IETF would not break the "social
contract" made with DNS software that was (and now still will be)
compatible with pre-IDNA standards.

Yes, it's the tension between the "elegant" solution and the "ugly but
compatible solution". But there is the added problem that, every time you
adopt an "ugly" solution you make it yet harder again to use the "elegant"
solution next time, until eventually you reach a saturation point from
which there is no escape. That is the essence of the lecture Kai has been
giving us. From the view of the outsider, Email seems already to have
reached that point, and they want to protect Netnews from following that
same path. Again, they perceive the recent IDNA as the final nail in that
coffin.

However, the ills (perceived or real) of IDNA and Email are not my
problem, except insofar as Usefor have committed me to ensuring that any
Email generated by Netnews is compliant, at some cost to gateways if
necessary.

Usefor wants to go the "elegant" route, insofar as constraints will allow.
There is a downside to this (e.g. in extra work for gateways); the
question is whether it is a better compromise. And what is a bad
compromise for one protocol is not necessarily so for another. Breaking
compatibility for Email could be a serious matter. Breaking compatibility
for Usenet is far far less so, and hence the elegant solution is in with a
better chance.

Similarly, a usefor solution that does not immediately cause existing
software (*including* gateways) to "break" on usefor messages, seems
1000 times more likely to pass IESG muster.  To quote Ned again:

The issue is instead that your approach in effect declares a large
body of software as being no longer compliant with long established
standards. This is something we try very hard not to do.

Which shows that people have a fundamental misunderstanding of the role of
gateways in Usenet. In the direction Email->News there are indeed lots of
general purpose gateways that expect to work with arbitrary emails and
convert them into articles in arbitrary groups. But gatewaying in that
direction is easy (relatively speaking).

But in the reverse direction News->Email, such general-purpose gateways
just Do Not Exist. Nobody is trying to take the whole of Usenet and
convert it into email. Instead, what you have is a large number of small
gateways each serving a very specific community of users - usually by
converting a single newsgroup into a mailing list. They tend to be written
by ad hoc means to serve their particular communities, which usually means
that they are concerned only with a single language and a single character
set. If the rest of Usenet suddenly starts using UTF-8, they will neither
notice nor care. If they do care, then they will lean on the administrator
of their gateway and he will fix it.

The great myth that is being put around is that, overnight, myriads of
articles in strange character sets will suddenly descend on every Usenet
group. They won't. They will appear gradually, in groups where the
customary language will benefit from them. Users on the far side of
gateways will notice strange gobbledegook appearing. Either they will
ignore it, or they will flame the senders of it, or they will ask the
gateway owner to fix it. Said gateway owner will observe that there is a
new standard out, and will take steps to comply with it.

That is how change takes place on Usenet. We have been through it all
before, for example when RFC 2047 started appearing in headers, and then
when quoted-printable started appearing in bodies. No, that approach would
not work for Email (well, not as well) but on Usenet things are much more
relaxed. Moreover, Usenet has a built-in mechanism for propagating
information about the system itself (no, it does not work instantaneously,
but ultimately everybody on Usenet is connected to everybody else - things
get around).


Kai's argument is quite similar to the arguments made in IDN for "just
use UTF-8".  A strong consensus was reached that the elegance of UTF-8
was a far lower priority than backward compatibility.  For RFC 2822, the
value of backward compatibility over elegance was even more obvious.  If
you care about wasting bits in RFC 2047 encoding of UTF-8, start
worrying about the ATM cell tax of every DSL line, or the obnoxious
SONET overhead.  But if you want to deploy a standard in the IETF, focus
on backward compatibility.

As you say, it is a matter of balance. My point is that the balance on
Usenet is quite different, because it is a different sort of place.

I have a simple question.  What can a UTF-8 subject header communicate
that an RFC 2047 one can't?  Other than inelegance, what's the downside
of 2047, when the upside is a huge increase in backward compatibility?

No, it's just the inelegance. Plus the fact that the backward
compatibility issue is nowhere near as huge as you imagine. In fact, it
is rather small.
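
For readers who have not seen the two forms side by side, here is a
minimal sketch of what is being compared. It uses Python's standard
email.header module; the sample subject text is an assumption chosen
purely for illustration and does not come from this thread. The same
subject is carried once as raw UTF-8 and once as an RFC 2047
encoded-word: the information is identical, and the only differences
are the extra octets and the fact that the encoded form is pure ASCII.

```python
from email.header import Header, decode_header, make_header

# Hypothetical subject text, chosen only for illustration.
subject = "Grüße aus Köln"

# What a "just use UTF-8" header would put on the wire.
raw_octets = subject.encode("utf-8")

# The RFC 2047 form: an ASCII-only encoded-word carrying the same text.
encoded = Header(subject, charset="utf-8").encode()

# A receiver that understands RFC 2047 recovers the original text.
decoded = str(make_header(decode_header(encoded)))

print("raw UTF-8 octets: ", len(raw_octets))
print("RFC 2047 form:    ", encoded)
print("RFC 2047 octets:  ", len(encoded))
print("round trip intact:", decoded == subject)
```

Software that knows nothing of RFC 2047 will display the encoded-word
verbatim, which is ugly but harmless; software that assumes 7-bit
headers may mishandle the raw UTF-8 form. That is the whole of the
trade-off under discussion.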

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl(_at_)clw(_dot_)cs(_dot_)man(_dot_)ac(_dot_)uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5
