
Re: [idn] Re: 7 bits forever!

2002-04-03 13:54:33

From: "Keith Moore" <moore(_at_)cs(_dot_)utk(_dot_)edu>
the old headers will not "only" contain ACE - they will contain a mixture
of plain ASCII, addresses with ACE domains and ASCII local parts,
addresses with ACE local parts and ASCII domains, and pure ACE.
Even if only the ASCII components are changed this still changes the
information in the header.

But technically, ACE names are ASCII names.  How are they supposed to be
"changed"?  Please enlighten me.

In this context I'm using ASCII to mean names that don't use ACE.
(sorry, I thought this would be obvious)

to give a concrete example, many MTAs can rewrite some addresses in 
message headers - e.g. to hide internal domain names or to rewrite
obsolete domain names to correct ones.

Somebody sends a mail with the following header field:

To: joe(_at_)one(_dot_)example(_dot_)com, 
ACE-1(_at_)two(_dot_)example(_dot_)com, sue(_at_)ACE-2, ACE-3(_at_)ACE-4

(I'm using ACE-x as a shorthand notation for an IDN or an email local-part
that's encoded using ACE)

now say that along with that To field the sender's MUA includes some other 
X-UTF-8-To field that spells everything out in raw utf-8.

the message then passes through an MTA that's configured to rewrite into  The to field is then:

To: joe(_at_)example(_dot_)com, ACE-1(_at_)example(_dot_)com, sue(_at_)ACE-2, 

but the MTA doesn't know about the X-UTF-8-To field, so it doesn't get
rewritten.

if the recipient MUA looks at the X-UTF-8-To field, and tries to reply
to those addresses, the reply fails because the addresses
aren't reachable outside of's enterprise network.

the only way for the recipient's MUA to be able to use the X-UTF-8-To field 
is to compare it with the To field (decoding the ACE components) and see 
whether the two domains are consistent.
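
to make the rewriting problem and the consistency check concrete, here is a
minimal sketch in Python.  everything in it is an assumption for
illustration: the rewrite rule (fold any host under example.com into
example.com), the helper names, the naive comma splitting, and the use of
Python's built-in "idna" codec (xn-- names) as a stand-in for whatever ACE
is finally chosen.

```python
# Hypothetical sketch: an MTA rewrites the To field but not X-UTF-8-To,
# and a recipient MUA checks whether the two fields still agree.

def split_addresses(header_value):
    # Naive split on commas; real header parsing is more involved.
    return [a.strip() for a in header_value.split(",") if a.strip()]

def rewrite_domain(domain):
    # Assumed rewrite rule: hide internal hosts under example.com.
    if domain.endswith(".example.com"):
        return "example.com"
    return domain

def rewrite_to(header_value):
    # What the MTA does to the field it knows about.
    out = []
    for addr in split_addresses(header_value):
        local, _, domain = addr.rpartition("@")
        out.append(local + "@" + rewrite_domain(domain))
    return ", ".join(out)

def domains(header_value, decode_ace=False):
    # Collect the domain of each address, optionally decoding ACE labels.
    result = []
    for addr in split_addresses(header_value):
        domain = addr.rpartition("@")[2]
        if decode_ace:
            domain = domain.encode("ascii").decode("idna")
        result.append(domain.lower())
    return result

def consistent(to_value, utf8_value):
    # The MUA-side check: do both fields name the same domains?
    return domains(to_value, decode_ace=True) == domains(utf8_value)

to_field = "joe@one.example.com, sue@xn--bcher-kva.example"
utf8_field = "joe@one.example.com, sue@b\u00fccher.example"

before = consistent(to_field, utf8_field)    # fields agree as sent
rewritten = rewrite_to(to_field)             # MTA touches only To
after = consistent(rewritten, utf8_field)    # fields now disagree
```

once the MTA has rewritten To, the check fails and the MUA has to fall back
to the To field anyway, which is the point: the second field buys nothing
and adds a way to go wrong.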

If I understand MIME correctly, even "alternate sections" are allowed.  

mime provides a means of including alternate representations of contents
(essentially, attachments) - not headers.

How is that permitted and not a UTF8 name? 

first, having alternate representations of contents (say documents) is
clearly useful and doesn't cause as many problems as having alternate 
representations of addresses in message headers.  

second, MIME has no provision to encode the same contents in different 
ways (say, one in base64 and one in quoted-printable) - since all MIME
implementations have to support both base64 and quoted-printable, there
would be absolutely no point to this.  similarly, since all MUAs that
supported IDNs would have to support ACE addresses for reasons of 
compatibility there's absolutely no point to having an alternate 
representation of the same string in pure utf-8.
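
as a small illustration of why a second encoding of the same contents adds
nothing, the sketch below (Python standard library only; the sample string
is made up) encodes one byte string both ways and shows that either one
decodes back to the identical bytes.

```python
import base64
import quopri

# Made-up sample contents containing non-ASCII text.
original = "caf\u00e9 na\u00efve".encode("utf-8")

# Two different transfer encodings of the same contents.
as_base64 = base64.b64encode(original)
as_qp = quopri.encodestring(original)

# A receiver that supports both gets the same bytes from either one.
from_base64 = base64.b64decode(as_base64)
from_qp = quopri.decodestring(as_qp)
same = from_base64 == from_qp == original
```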

third, the assumption behind multipart/alternative is that the 
alternatives are actually different in some way (say one is plain 
text, the other PDF) and that there's a clear notion of which one 
is "better" indicated by the order of the alternatives within the
multipart.  with ACE vs. utf-8 neither one is "better" - they both
contain the same information.
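
for reference, a minimal sketch of how ordering carries the preference in
multipart/alternative, using Python's standard email package.  the message
contents and the pick_best helper are hypothetical; the rule it applies
(take the last alternative the recipient can handle) is the one from the
MIME multipart/alternative definition.

```python
from email.message import EmailMessage

# Build a message with two alternatives: plain text first, HTML last.
msg = EmailMessage()
msg["Subject"] = "example"
msg.set_content("Hello, plain text.")
msg.add_alternative("<p>Hello, <b>HTML</b>.</p>", subtype="html")

def pick_best(message, supported):
    # Hypothetical helper: later parts are preferred, so keep the last
    # part whose content type the recipient supports.
    best = None
    for part in message.iter_parts():
        if part.get_content_type() in supported:
            best = part
    return best

text_only = pick_best(msg, {"text/plain"})
rich = pick_best(msg, {"text/plain", "text/html"})
```

with ACE and utf-8 forms of the same address there is no such ordering to
apply, because neither form carries more information than the other.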

fourth, at the time we wrote MIME there were relatively few MTAs that 
would rewrite message bodies in comparison to those that would rewrite
message headers.  (these days there are firewalls that do modify 
message bodies, some of which can modify a component of a multipart).
so the rewriting argument didn't seem relevant at the time; it might
be more relevant today.

frankly, even multipart/alternative has turned out to be more problematic
than we anticipated - for example, it's hard to represent a multi-dimensional
choice in such a way that the recipient's MUA can make the right decision.
and its principal use seems to be to include HTML in email which when 
displayed produces exactly the same information as plain text.  if when we 
were writing MIME we had known what we know today, we might not have 
bothered with multipart/alternative at all.  

Seems to me like using a UTF8-only
name makes a lot of sense, just like a multipart mail that has both a text
version and an html version.

it only seems that way if you ignore how they're going to interact with
existing software.

no, dual representations are a bad idea.

Then MIME content type for multiparts is also a bad idea according to you I

bottom line: they were made to solve a very different problem than the one 
you're talking about.  

