Re: [idn] WG last call summary

2002-03-18 00:10:02
Valdis(_dot_)Kletnieks(_at_)vt(_dot_)edu writes:
> Hmm.. so you're saying that *ALL* that code out there that
> double-checked that things that claimed (possibly implicitly) to be
> USASCII were in fact in the 0-127 range are "crusty" code?
> Damn.  Sendmail 8.12.3.Beta1 is crusty - it actually bothers checking.

Time for some facts.

Sendmail, by default, does _not_ enforce the 0-127 restriction for mail
message headers. It allows bytes 160-255. Otherwise European users would
be dumping Sendmail even more quickly than they are today; ISO 8859-1
Subject lines are extremely popular.

Sendmail _does_ discard bytes 128-159 in mail message headers, because
it uses those bytes internally for its macro handling. Those
bytes aren't used in ISO 8859-1, but they are used in UTF-8. See
http://pi.cr.yp.to for a concrete example.
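
To make the 128-159 point concrete, here is a minimal Python sketch. The
helper name bytes_in_c1_range is hypothetical, and the sample characters
are my own choice for illustration rather than anything taken from the
page cited above. It lists which encoded bytes of a string land in the
128-159 range:

```python
def bytes_in_c1_range(text, encoding):
    """Return the encoded bytes of text that fall in the range 128-159."""
    return [b for b in text.encode(encoding) if 128 <= b <= 159]

# U+03C0 (Greek small letter pi) is two bytes in UTF-8: 0xCF 0x80.
# The second byte, 128, is in the range that gets discarded.
print(list("\u03c0".encode("utf-8")))              # [207, 128]
print(bytes_in_c1_range("\u03c0", "utf-8"))        # [128]

# U+00E9 (e with acute accent) is the single byte 233 in ISO 8859-1,
# so nothing falls in 128-159.
print(bytes_in_c1_range("\u00e9", "iso-8859-1"))   # []
```

Any UTF-8 continuation byte can take values 128-191, so non-ASCII UTF-8
text will often contain bytes in the affected range.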

I sent Allman some email in February 1999 suggesting that he convert

   128 -> 255 160
   129 -> 255 161
   ...
   159 -> 255 191
   255 -> 255 255

with the opposite conversion on output. There have been several
security-fix releases of sendmail since then, so we could have had the
128-159 problem fixed on a huge number of machines. But he ignored the
suggestion. Apparently he doesn't care about international users.
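
For readers who want to see the suggested conversion spelled out, here
is a minimal Python sketch. It is not sendmail code; the names ESC,
escape, and unescape are hypothetical, and it assumes the elided rows of
the table follow the same pattern as the rows shown (each byte b in
128-159 becomes 255 followed by b + 32):

```python
ESC = 255  # escape byte used by the mapping above

def escape(data: bytes) -> bytes:
    """Input conversion: afterwards no byte 128-159 remains in the data."""
    out = bytearray()
    for b in data:
        if 128 <= b <= 159:
            out += bytes((ESC, b + 32))   # 128 -> 255 160 ... 159 -> 255 191
        elif b == ESC:
            out += bytes((ESC, ESC))      # 255 -> 255 255
        else:
            out.append(b)
    return bytes(out)

def unescape(data: bytes) -> bytes:
    """Output conversion: the exact inverse of escape()."""
    out = bytearray()
    it = iter(data)
    for b in it:
        if b != ESC:
            out.append(b)
            continue
        nxt = next(it, None)              # byte following the escape byte
        if nxt == ESC:
            out.append(ESC)
        elif nxt is not None and 160 <= nxt <= 191:
            out.append(nxt - 32)
        else:
            raise ValueError("malformed escape sequence")
    return bytes(out)

header = "Subject: \u03c0".encode("utf-8")
assert unescape(escape(header)) == header
```

The point of the scheme is that the escaped form never contains bytes
128-159, so those values stay free for internal use while the original
header bytes are restored unchanged on output.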

People proposed more than a decade ago that the IETF require 8-bit-clean
mail software. (See, for example, Andre Pirard's ietf-smtp message dated
Tue, 19 Feb 91 12:08:00 +0100.) The only objection to this requirement
was the claim that 8-bit support would take a long time to be deployed.
Paul Vixie said that he had some seven-year-old sendmail binaries, for
example, and concluded ``with near-certainty'' that ``any changes to the
SMTP spec will take at least a decade to reach 90% of the critical
server population.''

In fact, it took less than a decade for every critical server to add
support for 8-bit message bodies, even though the IETF _still_ doesn't
require this. If the SMTP specification had been changed in 1991 to
require transparent 8-bit handling in both the header and the body, we
wouldn't have Sendmail's UTF-8 problems today.

Sendmail's continued data corruption is an embarrassment to the Sendmail
company. The fact that RFC 2821 and RFC 2822 allow this garbage is an
embarrassment to the IETF.

---D. J. Bernstein, Associate Professor, Department of Mathematics,
Statistics, and Computer Science, University of Illinois at Chicago


