Re: Dual names, IDN and ASCII, in e-mail addresses?


Bruce Lilly wrote:

A named group may suffice where the syntax permits an address
rather than a single mailbox, e.g.:

  Reply-To: Bruce Lilly: <blilly(_at_)erols(_dot_)com>,
   <blilly(_at_)verizon(_dot_)net>;

[obviously the phrase could use RFC 2047 encoded-words, and
an IDN could be used in any of the group mailboxes]


Unfortunately this tells the recipient to send the reply to both
addresses, which is not what the sender really wants.
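
To illustrate, here is a minimal Python sketch that parses a
group-syntax Reply-To with the standard library's email package.  The
addresses are placeholders (assumed for illustration), not the ones
quoted above.  The parser reports two ordinary mailboxes inside the
group, and nothing marks them as alternatives:

```python
from email.policy import default

# Placeholder addresses, assumed for illustration only.
raw = "Bruce Lilly: <blilly@example.com>, <blilly@example.net>;"

# Parse the value as a structured address header.
reply_to = default.header_factory("Reply-To", raw)

# The group keeps its display name ...
group = reply_to.groups[0]

# ... but every mailbox inside it is an equally valid recipient.
recipients = [addr.addr_spec for addr in reply_to.addresses]

print(group.display_name)
print(recipients)
```

Software that builds a reply from this field has no basis for picking
one mailbox over the other, so it would address both.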

IDNA and IMAA will allow an address to be compatible with both
ASCII-only software and Unicode-enabled software.  But they won't
allow an address to be user-friendly for both English-only users and
non-English users.  As messages get copied, forwarded, and replied to,
addresses get copied from one header field to another, and there's no
telling who will end up looking at them.  That is the motivation for
trying to think of a way for two addresses to be paired up and copied
together, so that both remain available for display.

On the other hand, the body of the message will typically not be
user-friendly to both English-only users and non-English users,
so any effort to support dual addresses might be mostly wasted
anyway.  (Although the body could be non-textual, or could be a
multipart/alternative with translations into multiple languages.)

Anyway, here's yet another idea:

From: "=?iso-8859-1?q?G=F6ran=20M=FCller?= / Goran Muller
 <Goran(_at_)Muller(_dot_)de>" <Gran0iesg15qa(_at_)xn--Mller-kva(_dot_)de>

Ancient software will display it exactly as above, while MIME-enabled
IMAA-enabled software will display it as:

From: "Göran Müller / Goran Muller
 <Goran(_at_)Muller(_dot_)de>" <Göran(_at_)Müller(_dot_)de>

I think this will survive replies, and will survive getting copied
into a message body.  In the latter case, the ASCII parts will get
copied verbatim, and the non-ASCII parts will get copied either in
their display form or their encoded form.  In any case, both forms will
be available to the recipient to view and to paste back into message
headers.
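
As a rough sketch of what MIME-enabled, IDN-aware software would do with
such a field, the Python below decodes the RFC 2047 encoded-word in the
phrase and the ACE form of the domain.  The helper name decode_phrase is
made up for this example, and the domain is written in lower case as an
assumption.  The IMAA-encoded local part is left alone, since the Python
standard library has no IMAA codec:

```python
from email.header import decode_header


def decode_phrase(phrase: str) -> str:
    """Decode any RFC 2047 encoded-words in a display phrase."""
    pieces = []
    for data, charset in decode_header(phrase):
        if isinstance(data, bytes):
            # Encoded-words carry their own charset; plain runs are ASCII.
            pieces.append(data.decode(charset or "ascii"))
        else:
            pieces.append(data)
    return "".join(pieces)


# The phrase from the example, without the surrounding quotes.
phrase = "=?iso-8859-1?q?G=F6ran=20M=FCller?= / Goran Muller"
display_name = decode_phrase(phrase)

# The ASCII-compatible (Punycode) domain, lower-cased for this sketch.
ace_domain = "xn--mller-kva.de"
unicode_domain = ace_domain.encode("ascii").decode("idna")

print(display_name)
print(unicode_domain)
```

Software that does neither step simply shows the raw ASCII, which is
the "ancient software" case described above.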

Keith Moore wrote:

Nothing stops Jacob or anyone else from defining a new type of
body part that supports text of multiple character encodings.  The
challenge is in making it technically sound and enough of a win over
HTML (which is already widely supported) that people want to implement
it and use it.


Note that HTML doesn't allow mixing of multiple character encodings.  It
allows arbitrary Unicode characters (written as numeric character
references) to be mixed into text that uses *one* character encoding.
That's more than you can do with text/plain, but less than the
multi-charset support that Jacob was hoping for.

Jacob Palme <jpalme(_at_)dsv(_dot_)su(_dot_)se> replied:

Are not Japanese people already doing this in order to allow Japanese
e-mail to contain a mixture of English and Japanese text, by
using ISO 2022 character set switch marks in the text?


They use iso-2022-jp, which is a single character encoding (charset)
that uses a few ISO 2022 escape sequences to switch between exactly four
coded character sets (ASCII and three Japanese sets).
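
The switching can be seen with Python's built-in iso2022_jp codec.  This
is only a sketch: the escape table follows RFC 1468, and the helper name
charset_switches and the sample text are made up for the example.

```python
import re

# Escape sequences that RFC 1468 defines for iso-2022-jp.
ESCAPES = {
    b"\x1b(B": "ASCII",
    b"\x1b(J": "JIS X 0201-1976 (Roman)",
    b"\x1b$@": "JIS X 0208-1978",
    b"\x1b$B": "JIS X 0208-1983",
}

ESCAPE_RE = re.compile(rb"\x1b(?:\([BJ]|\$[@B])")


def charset_switches(encoded: bytes) -> list[str]:
    """List the coded character sets selected, in order of appearance."""
    return [ESCAPES[esc] for esc in ESCAPE_RE.findall(encoded)]


text = "Hello 日本語 world"
encoded = text.encode("iso2022_jp")

# The stream starts in ASCII, so only the later switches show up here.
switches = charset_switches(encoded)

print(encoded)
print(switches)
```

The whole byte stream is still labelled with the single charset name
iso-2022-jp; the escape sequences are part of that one encoding.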

If you want to support a great number of coded character sets, I guess
one approach would be to define a new media type that doesn't use a
charset parameter, and instead provides its own internal charset tagging
(so it would not merely support many coded character sets, but many
character encodings).  Another approach is to create a general iso-2022
charset that includes all registered 2022-compatible coded character
sets.  Either way there would be a deployment hurdle to overcome.

AMC