
Re: [idn] Re: 7 bits forever!

2002-03-22 10:30:04
Valdis.Kletnieks@vt.edu writes:
you could *NOT* trust that all the systems
between here and there were 8-bit-clean

I'm not saying that Quoted-Printable had no short-term benefits for its
short-term costs. I'm saying that, viewed from our long-term perspective
eleven years later, the failure to require 8-bit transparency was an
amazingly stupid decision.
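
For concreteness, here is a minimal sketch (Python; the sample text and
the choice of UTF-8 are purely illustrative) of what the Quoted-Printable
kludge does to 8-bit data: every 8-bit byte gets blown up into a
three-character =XX escape so that it can survive a 7-bit path.

   # Illustrative only: Quoted-Printable escapes each 8-bit byte as "=XX".
   import quopri

   body = "Klētnieks".encode("utf-8")                # 8-bit bytes
   print(quopri.encodestring(body).decode("ascii"))  # Kl=C4=93tnieks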

If Keith Moore, Dave Crocker, Paul Vixie, et al. hadn't been so blind,
they would have required 8-bit transparency for the long term, and then
evaluated the costs and benefits of short-term 7-bit kludges in that
framework.

Probably the result would have been long-term 8-bit with no short-term
kludges. Conceivably it would have been long-term 8-bit plus optional
short-term Quoted-Printable. Either way, it would have been vastly
better than what actually happened.

Do you want to be having this discussion again in twenty years, with
8-bit problems still not fixed?

there were a *LARGE* number of systems that broke badly if they
were handed 8 bit data.

Let's look at the facts. John Klensin claimed in an ietf-smtp message
dated 26 Feb 91 08:40:04-EST that there were mail servers ``not robust
against that particular form of misbehavior.'' Robert Ullmann publicly
asked for proof of this claim. Klensin dodged the question.

Similarly, Keith Moore claimed in a comp.mail.mime message, message ID
199710102108.RAA13469@spot.cs.utk.edu, that ``core-dumping was a
commonly observed failure mode in the early 1990s.'' I publicly asked
for proof of this claim. Moore dodged the question.

Mail servers discarding characters? Yes. Mail servers stripping the 8th
bit? Yes. Mail servers crashing? Not a single shred of evidence.
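
To be concrete about what those observed failures look like, here is a
purely illustrative Python sketch of 8th-bit stripping: the text is
silently mangled, but nothing crashes.

   # Illustrative only: clearing the top bit of every byte, as some old
   # relays did, corrupts non-ASCII text; it does not crash anything.
   data = "Klētnieks".encode("utf-8")        # b'Kl\xc4\x93tnieks'
   stripped = bytes(b & 0x7F for b in data)  # b'KlD\x13tnieks'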

Similarly, expanding from mail to all protocols: Rick Wesson claimed in
an IDN WG message dated Sun, 24 Dec 2000 16:44:39 -0800 that ``there is
a lot of embedded systems out there that would crash-and-burn if they
received a reply in utf8.'' I asked for proof:

   Can you please identify the systems, explain how they use domain
   names, and say what exactly you mean by ``crash-and-burn''? We need
   this information if we're going to accurately assess the cost of
   upgrading the world to support IDNs.

Naturally, Wesson dodged the question.

I will readily agree that there has been an unverified report of a UTF-8
crash of an obsolete version of the Netscape mailer under Solaris. If
that report is accurate then those users will have to upgrade.

BIND, which by default restricts it
  [ ... ]
Why does it get restricted?
  [ bogus rationalization snipped ]

The actual history, as I mentioned in another message, is as follows.

People discovered several years ago that sendmail would blindly feed DNS
PTR results to the shell, so attackers could take over the computer by
putting some special characters, such as |<>, into PTR records. The BIND
people panicked and disabled all non-letter-digit-hyphen characters at
every spot they could think of in their DNS client library.

This isn't an 8-bit issue; it does just as much damage to underscores.
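
For anyone who missed the original incident, the unsafe pattern looked
roughly like the following Python sketch. This illustrates the class of
bug, not sendmail's actual code; the command and the PTR value are made
up.

   # Illustration of the class of bug: handing attacker-controlled PTR data
   # to a shell lets characters such as | < > run arbitrary commands.
   import subprocess

   ptr_name = "evil.example.com|rm -rf /"    # hypothetical attacker-controlled PTR result
   # Unsafe: a shell parses the metacharacters in ptr_name:
   #   subprocess.run("echo resolved " + ptr_name, shell=True)
   # Safe: pass the untrusted name as one argument; no shell is involved:
   subprocess.run(["echo", "resolved", ptr_name])

The fix belongs in the program that hands untrusted data to a shell; a
blacklist in the DNS client library also catches perfectly innocent names
containing underscores.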

Let's take as an example the "native language" encoding of my name:
From: Valdis Kl=?iso8859-4?Q?=BA?=tnieks <Valdis.Kletnieks@vt.edu>

Wow. How do you pronounce that? ``Hi, I'm Valdis Klee-kwals-question-
mark-iso-eighty-eight-fifty-nine-dash-four-question-mark-kyoo-question-
mark-equals-bah-question-mark-equal-stun-ieks''? Have you considered
changing your name?

In all seriousness: Wouldn't you like to see a world where the same
character encoding is used for the name and the address and the message
body and so on, so that simple copying doesn't screw up the display?
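
For reference, the =?iso8859-4?Q?=BA?= fragment above decodes to the
single letter ``ē'', so the header is meant to display as ``Valdis
Klētnieks''. Here is a toy decoder for one Q-encoded word (Python,
standard library only; not a full RFC 2047 parser):

   # Toy decoder for a single RFC 2047 Q-encoded word, e.g. =?iso8859-4?Q?=BA?=
   import quopri

   def decode_q_word(word):
       _, charset, encoding, text, _ = word.split("?")
       assert encoding.upper() == "Q"
       # Q-encoding: "_" stands for a space, "=XX" is a byte in hex.
       return quopri.decodestring(text, header=True).decode(charset)

   print(decode_q_word("=?iso8859-4?Q?=BA?="))   # ē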

---D. J. Bernstein, Associate Professor, Department of Mathematics,
Statistics, and Computer Science, University of Illinois at Chicago


