Re: Comment on the draft MIME Part 1 document

To:  ietf-822(_at_)dimacs(_dot_)rutgers(_dot_)edu
Subject:  Re: Comment on the draft MIME Part 1 document
Date:  Wed, 28 Apr 93 18:21:41 +0200

Excerpts from Keith Moore's message Tue, 27 Apr 1993 21:57:48 -0400:

We can't prevent people from using 8-bit characters as login names.
However, if they use that login name for an email address, they
won't be able to receive mail at all from many sites around the
world.  My perception is that this practice is not nearly as
widespread as that of sending 8-bit body parts.

Some mail systems for PC LANs will gladly accept non-ASCII
characters for user names and mailboxes, e.g. Microsoft Mail for
Macintosh.  In non-English-speaking countries such capabilities
will be used.


Yes, but the vast majority of the world can't send them email.

A great deal of the power of email is that everybody can talk to
everybody else...this is why we tend to connect dissimilar systems
together in the first place.  We lose this as soon as we allow
addresses that aren't usable by everyone.

That's not to say that we cannot recommend a way of encoding
addresses as ASCII characters.  (No, the RFC 1342 encoding is NOT
appropriate.)


The RFC 1342 _approach_ is appropriate, though: Define a new
interpretation of legal but seldom-used character sequences in
addresses as _representing_ non-ASCII characters.

+  Old mail software can handle addresses chosen to represent
   non-ASCII characters with no problems.  They will however be
   cryptic to human users.

+  New mail software will display such addresses using the
   intended non-ASCII characters.  Human users will be able to
   read and, in important cases, remember such addresses with
   the same ease as pure ASCII addresses.


I don't think this is a good idea.

It's important that the "displayed" form of an address be identical to
the way you spell it when you type it in.  I need to be able to give
the address to a friend on paper or a business card.  What happens if
his system doesn't support the same character set mine does?

RFC 1342-style encodings aren't appropriate because the local part of an
address needs to be (a) unique and (b) opaque.


A third requirement is that addresses should be
(c) as short as possible.


Agreed.  My current list (not in any particular order):

(a) Uniqueness - There should not be several different ways to spell an
electronic mail address.  Likewise, a "mapped" address must identify at
most one mailbox.  Ideally, two addresses can be easily compared to see
if they identify the same mailbox.

(b) Opacity - The format of the local-part of an address is specific to
the mail domain.  Mail handling software should avoid making
assumptions about this format or applying any transformations to it.  A
user name encoding scheme should not change this rule.  (Note that this
requirement already exists: For example, only the domain named on the
right side of the '@' sign may interpret a '%' that appears in the
local-part. [RFC 822, section 6.3; RFC 1123, section 5.2.16])

(c) Restricted character set - The mapped address should fit within a
sufficiently small character set that it need not be encoded again, for
example, using the techniques defined in RFC 1137.  Furthermore, it
should survive translation into the addresses used by other message
handling systems such as X.400(84).  (Basically, this restricts the mapped
address to the PrintableString character set.)

(d) Terseness - Email addresses should be easy to type without errors.

(e) Obviousness - Ideally, the mapping to ASCII should be "obvious",
i.e., easily guessable by a human who knows the recipient's user name.
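
Requirement (c) is the easiest of these to check mechanically.  As a
minimal sketch, assuming the ASN.1 PrintableString repertoire (letters,
digits, space, and ' ( ) + , - . / : = ?), the following tests whether
a candidate local-part stays inside it.  The function name is
hypothetical and is not taken from any RFC or library.

```python
import string

# PrintableString repertoire as defined for ASN.1:
# letters, digits, space, and the punctuation ' ( ) + , - . / : = ?
PRINTABLE_STRING = set(string.ascii_letters + string.digits + " '()+,-./:=?")


def fits_printable_string(local_part: str) -> bool:
    """Return True if every character is in the PrintableString repertoire."""
    return all(ch in PRINTABLE_STRING for ch in local_part)


print(fits_printable_string("j.smith"))    # True
print(fits_printable_string("user%host"))  # False: '%' is outside the set
print(fits_printable_string("björn"))      # False: 'ö' is not ASCII at all
```

A mapped address that fails this check would need a further encoding
step, which is what requirement (c) is meant to avoid.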

It's fairly easy to see that no mapping of any large character set to
ASCII will meet all of these requirements, but it might be possible for
some character sets and languages (say, by translating "ö" to "#oe#").
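
To make the "#oe#" idea concrete, here is a rough Python sketch of such
a mapping.  Only the "ö" entry comes from the text above; the other
table entries and both function names are hypothetical, chosen purely
for illustration.

```python
# Hypothetical mapping table; only the "ö" entry appears in the message.
TO_ASCII = {"ö": "#oe#", "ä": "#ae#", "ü": "#ue#"}
FROM_ASCII = {token: ch for ch, token in TO_ASCII.items()}


def map_to_ascii(name: str) -> str:
    """Replace each mapped character with its ASCII token."""
    return "".join(TO_ASCII.get(ch, ch) for ch in name)


def map_from_ascii(mapped: str) -> str:
    """Reverse the mapping.

    This only round-trips if the original name contained no '#',
    which is an assumption of this sketch.
    """
    for token, ch in FROM_ASCII.items():
        mapped = mapped.replace(token, ch)
    return mapped


print(map_to_ascii("björn"))       # bj#oe#rn
print(map_from_ascii("bj#oe#rn"))  # björn
```

A table this small is easy to guess and each name has exactly one
mapped spelling, but it grows the address by three characters per
mapped letter and covers only a handful of characters.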

(As for your proposal, see whether you agree with the above
"requirements" and if so, how well your proposal fits...)

I was going to suggest something like an extension of RFC 1137 (since
it already exists and is already used for local-part encoding) but the
more I looked at it the more I decided that the "obviousness"
requirement is pretty compelling.  Any proposal that severely penalizes
someone who uses his correctly-spelled name as a login id (say, by
encoding it in base64) will not meet wide acceptance...we want to make
his life more pleasant, not less.
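
For contrast, a small sketch of what a base64-encoded login id looks
like.  It assumes UTF-8 as the underlying character encoding, which
nothing above specifies, and the function name is hypothetical.

```python
import base64


def base64_local_part(name: str) -> str:
    """Encode a user name as base64 (UTF-8 is an assumption of this sketch)."""
    return base64.b64encode(name.encode("utf-8")).decode("ascii")


print(base64_local_part("björn"))  # YmrDtnJu
```

Nobody who knows the name could guess the result, and it is
case-sensitive and awkward to type from a business card.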

Keith Moore