Transmission issues for transition to UTF-8 Headers
1999-02-12 04:27:48
Folks,
I've changed the subject field so that matters about UA display, and the
like, can be treated separately from the meta-question of interacting email
services, namely what is the basic approach that should be used when a
UTF-8 host wants to send to some other host.
This subject line is intended to focus just on the question of coordination
between sending and receiving systems, in other words on the exchange
protocol, SMTP, and the object being exchanged, in this case 822 or 822bis
since headers are the issue.
This particular community has quite a large amount of experience
making changes to a very large installed base. It has upgraded that base more
than once. The recent changes involving MIME and ESMTP represent an
extraordinarily successful piece of work, and we should not have to
re-invent, experiment, or otherwise spend a great deal of time on a topic
that is quite clearly a repeat of these earlier times.
The transition to UTF-8 headers has all of the characteristics of the
transition to MIME and ESMTP and we should use what we've learned.
What we've learned:
1. If you insist on "just sending" the new stuff, without getting a
statement of support by the receiver, then you MUST send it in a fashion
which is safe for old recipients. That is the model used for MIME and that
is the reason that MIME has such a horrendous appearance, as well as such a
safe record of deployment. A computer scientist shudders at MIME's
ugliness. An engineer marvels at its successful deployment over an
existing, global base of users.
2. If there is any real chance of impact on the recipient, such as
breaking their software, the sender MUST get permission before initiating a
'new' behavior. That is what ESMTP options are for. They work
dandy. Since 8-bit encoding is well-known to be a source of problems for
recipients not ready for it, it is clear that sending 8-bit headers needs
either to have 7-bit encoding or explicit recipient go-ahead. We fought
the "just send 8 bits" wars years ago. We should not have to fight them again.
3. Unlabeled defaults tend to be problematic, since there is no detecting
when they are changed. While it offends one's sense of "efficiency", it's
worth the extra bits to label a long-lived object explicitly.
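As a rough illustration of point 1, here is a minimal Python sketch of the "safe for old recipients" form: a non-ASCII header value is wrapped as an RFC 2047 encoded-word, so only 7-bit characters go on the wire and the charset is labeled explicitly (point 3). The function name encode_header_7bit and the sample subject are hypothetical; Header, decode_header and make_header come from Python's standard email package.

```python
from email.header import Header, decode_header, make_header


def encode_header_7bit(value):
    """Return a header value that is safe for 7-bit-only recipients.

    Pure ASCII text is passed through untouched; anything else is
    wrapped as an RFC 2047 encoded-word labeled with its charset.
    """
    if value.isascii():
        return value
    return Header(value, charset="utf-8").encode()


# Hypothetical sample subject containing non-ASCII characters.
original = "Grüße aus Zürich"
encoded = encode_header_7bit(original)

# A MIME-aware reader can recover the original text from the label.
decoded = str(make_header(decode_header(encoded)))
```

The encoded form is ugly to the eye, which is exactly the MIME trade-off described above: an old recipient sees harmless ASCII, and a new one decodes it.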
At 01:50 AM 2/5/99 +0000, D. J. Bernstein wrote:
The short-term goal is to allow messages with unencoded UTF-8 in the
4. When making a transition for a global installed base (and actually the
rule applies equally for a much smaller base) there is no semantic content
to the term "near-term".
Even the smallest change is measured in years. The longest in
decades. Hence, serious efforts at transition must design a "transition"
mode which is capable of being used permanently.
MIME's 7-bit encoding is an example. While one might claim that 8bitmime
is a disaster, that claim misses the larger fact that we exchange binary
data with each other today quite comfortably but could not do so before the
MIME effort was started. The fact that we send it around "inefficiently"
is irritating to some, but is clearly secondary to the fact that we can and
do use the exchange mechanism successfully.
The long-term goal is to eliminate the implementation burden of multiple
character sets and =??=. This takes one extra step:
(4) Mail writers have to convert all outgoing messages from the local
character set to unencoded UTF-8.
Eventually all character-set markers can be removed.
5. Eventually, COBOL will disappear. Eventually, no one will ever program
in assembly language. Eventually, we will have world peace...
So how do we factor in "eventually" to practical planning efforts?
Chris Newman writes:
* Create UTF8HEADER SMTP extension. Provides RFC 2047 downgrading for
both top level headers and nested message/rfc822 headers.
That doesn't survive a cost-benefit analysis.
Right. Instead it just survives the test of global, practical demonstration.
In other words, the alternative doesn't survive a credibility analysis,
unless you want to declare the world's movement to X.400-based UAs a success.
Your conversion is safe in a fantasy world where all message readers
"fantasy"?
gosh.
why would anyone find it distasteful to interact with such a
participant? I just can't imagine.
ANYHOW...
For moving to UTF-8 headers, we need an ESMTP option and, I suppose, an
822(bis) header.
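
A minimal sketch of that negotiation in Python, under stated assumptions: the option keyword is taken to be UTF8HEADER (the name from Chris Newman's proposal quoted above, not a registered extension), and the fallback for receivers that do not advertise it is RFC 2047 encoding. The function names and the sample EHLO replies are hypothetical.

```python
from email.header import Header


def parse_ehlo_keywords(ehlo_reply):
    """Collect the extension keywords from a multi-line EHLO reply."""
    keywords = set()
    lines = ehlo_reply.splitlines()
    # The first reply line carries the server's domain, not an extension.
    for line in lines[1:]:
        if len(line) < 4 or not line.startswith("250"):
            continue
        rest = line[4:].strip()  # skip "250-" or "250 "
        if rest:
            keywords.add(rest.split()[0].upper())
    return keywords


def choose_header_form(value, keywords, option="UTF8HEADER"):
    """Send raw UTF-8 only with the receiver's explicit go-ahead."""
    if value.isascii():
        return value  # nothing to negotiate
    if option in keywords:
        return value  # receiver advertised the (assumed) option
    # No permission given: downgrade to the 7-bit safe form.
    return Header(value, charset="utf-8").encode()


# Hypothetical EHLO replies from a new and an old receiver.
reply_new = "250-mail.example.org\r\n250-8BITMIME\r\n250 UTF8HEADER"
reply_old = "250-mail.example.org\r\n250 8BITMIME"

sent_to_new = choose_header_form("Zürich", parse_ehlo_keywords(reply_new))
sent_to_old = choose_header_form("Zürich", parse_ehlo_keywords(reply_old))
```

The sender never guesses: an old receiver gets the encoded form indefinitely, which is what makes the "transition" mode usable permanently.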
d/
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
PRESIDENTS DAY OPEN HOUSE, 2/13
<http://www.brandenburg.com/misc/presday/presday-invite.gif>
Dave Crocker Tel: +60 (19) 3299 445
<mailto:dcrocker(_at_)brandenburg(_dot_)com> Post Office Box 296,
U.P.M.
Serdang, Selangor 43400 MALAYSIA
Brandenburg Consulting
<http://www.brandenburg.com> Tel: +1 (408) 246 8253
Fax: +1(408)273 6464 675 Spruce Dr., Sunnyvale, CA 94086 USA