ietf-822

Re: IDN (was Did anyone tell Microsoft yet?)

2002-04-29 13:45:11

In <01KH0Y7S8GO200019Q(_at_)mauve(_dot_)mrochek(_dot_)com> 
ned+ietf-822(_at_)mrochek(_dot_)com writes:

>> Has anybody looked into the transitional issues that would arise if Email
>> was to be moved into allowing UTF-8 headers?

> Yes. There are a number of sticking points, but on the whole it is doable.
> The real question is on what scale the benefits outweigh the costs.

AIUI there are two sets of problems.
        Transport Agents
        User Agents

The problem for transport agents (which include POP servers and the
like) is to ensure that the UTF-8 characters are correctly delivered. Such
agents are not usually concerned with the content of the headers except
for addresses and maybe dates. They are going to have to be fixed in any
case so that addresses use IDNA/whatever. Dates should be left strictly in
ASCII. So the real question is munging, truncation of the 8th bit, etc.
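
To make the 8th-bit concern concrete, here is a minimal sketch of what a
hop that clears the high bit of every octet would do to a UTF-8 header.
The function name and the sample Subject line are made up for
illustration; no particular transport is being modelled.

```python
def strip_eighth_bit(raw: bytes) -> bytes:
    """Clear the high bit of every octet, as a 7-bit-only hop might."""
    return bytes(b & 0x7F for b in raw)


# Hypothetical header, encoded as raw UTF-8 on the wire.
header = "Subject: Grüße".encode("utf-8")
mangled = strip_eighth_bit(header)

# The result is valid ASCII, but the non-ASCII characters are gone
# and cannot be recovered by the receiver.
print(header)
print(mangled)
```

The damage is silent: what arrives is still legal ASCII, so nothing
downstream can tell that anything was lost.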

Now, AIUI, most transports are munge free. All 8 bits get through. The
only exception I am aware of is sendmail, which is said to use the 8th bit
of some headers for internal flags. Does anyone know more detail of this
(it certainly seems to be able to accommodate 8 bits in the Subject line
AFAICS)?

However, there already exist downgrading mechanisms for use if needed: RFC
2047 for unstructured headers and comments, RFC 2231 for parameters, and
IDNA/whatever for addresses. What else is there? Clearly, you define that
msg-ids, tokens and header-names remain in ASCII (that is what Usefor has
done). What problems remain?
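
The three downgrading mechanisms named above can be sketched with the
Python standard library. This is only an illustration under assumptions:
the sample subject, filename and domain are invented, only the domain
part of an address is handled, and Python's built-in "idna" codec follows
the IDNA 2003 rules, which may differ from whatever is finally adopted.

```python
from email.header import Header, decode_header, make_header
from email.utils import encode_rfc2231

subject = "Grüße aus Köln"     # unstructured header text (sample value)
filename = "Grüße.txt"         # a MIME parameter value (sample value)
domain = "bücher.example"      # domain part of an address (sample value)

# RFC 2047: wrap non-ASCII header text in encoded-words.
subject_ascii = Header(subject, "utf-8").encode()

# RFC 2231: charset''percent-encoded-value, for use as "filename*=...".
filename_ascii = encode_rfc2231(filename, "utf-8")

# IDNA: convert each label of the domain to its ASCII-compatible form.
domain_ascii = domain.encode("idna").decode("ascii")

print(subject_ascii)
print("filename*=" + filename_ascii)
print(domain_ascii)

# An upgraded receiver can reverse the RFC 2047 step.
restored = str(make_header(decode_header(subject_ascii)))
```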

OTOH, expecting transports to downgrade may be a forlorn hope. Transports
which advertise 8BITMIME are supposed to downgrade when forwarding to one
that doesn't, but few actually do it. This does not appear to have caused
the sky to fall in, however.
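
For illustration, the body downgrade that an 8BITMIME relay is supposed
to perform before handing a message to a 7-bit-only peer might look
roughly like this. It is a sketch, not a description of any real MTA:
downgrade_body is a hypothetical helper, quoted-printable is just one
possible target encoding, and the matching Content-Transfer-Encoding
header rewrite and line-length limits are left out.

```python
import quopri
from typing import Tuple


def downgrade_body(body: bytes) -> Tuple[str, bytes]:
    """Return (content-transfer-encoding, body) safe for a 7-bit next hop."""
    if all(b < 0x80 for b in body):
        # Nothing to do: the body is already 7-bit clean.
        return "7bit", body
    # Otherwise re-encode so that only ASCII octets go on the wire.
    return "quoted-printable", quopri.encodestring(body)


original = "Grüße\n".encode("utf-8")
cte, encoded = downgrade_body(original)
print(cte, encoded)
```

Unlike stripping the 8th bit, this is reversible: the receiver can decode
the body and get back exactly the octets that were sent.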


User agents have different problems, since they need to display headers,
and also to understand them for various purposes. Clearly, existing user
agents are going to be around for much longer than existing transport
agents. OTOH, one can take the view (as Usefor does) that, if you expect
to receive communications in Chinese (whatever), you make sure you upgrade
to a User Agent with the necessary capability. If you don't expect to
receive such communications, then you don't bother. The worst that then
happens is that you see some gibberish from time to time. Well, actually,
I see such gibberish all the time from Korean spam, but I don't regard
that as a bug in my User Agent, but rather as a bug in the Internet in
Korea :-( .

As to when the time would be ripe to make such a change, I think we need
to realize that we are going to HAVE to make some changes to accommodate
IDNA pretty soon now. For sure, as soon as IDNA becomes an RFC, people
will expect these domain names to suddenly start working in email (and
they will, of course, be disappointed, but that won't stop all sorts of ad
hoc fixes from appearing).

So we need to know now where Email is going on the IDNA front, and if
there are going to be changes to accommodate IDNA, then one might as well
look into the UTF-8 change at the same time. One upheaval is always better
than two.

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl(_at_)clw(_dot_)cs(_dot_)man(_dot_)ac(_dot_)uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5