
Re: IDN (was Did anyone tell Microsoft yet?)

2002-05-01 09:13:04

In <200204302032(_dot_)18912(_at_)sendmail(_dot_)mutz(_dot_)com> Marc Mutz 
<mutz(_at_)kde(_dot_)org> writes:


To: Jürgen Schmidt <jürgen(_at_)tu-münchen(_dot_)de>

It might be interesting to know how existing mailers handle this. Typing the 
above into the TO: fields of various mailers gives:

<Interesting examples snipped>

Dtmail (as supplied with Solaris 7)

To: =?ISO-8859-1?Q?J=FCrgen?= Schmidt
        <=?ISO-8859-1?Q?j=FCrgen?=(_at_)=?ISO-8859-1?Q?tu-m=FCnchen?=(_dot_)de>

But, to my surprise, it then sent it to localhost via SMTP, and localhost
invoked sendmail, which accepted it without demur and queued it for later
delivery. Here are some interesting extracts from the queue:

RPFD:<=?ISO-8859-1?Q?j=FCrgen?=(_at_)=?ISO-8859-1?Q?tu-m=FCnchen?=(_dot_)de>
HReceived: from localhost (localhost [127.0.0.1])
        by clw.cs.man.ac.uk (8.9.1b+Sun/8.9.1) with SMTP id KAA16419
        for <=?ISO-8859-1?Q?j=FCrgen?=(_at_)=?ISO-8859-1?Q?tu-m=FCnchen?=(_dot_)de>;
        Wed, 1 May 2002 10:08:29 +0100 (BST)
HX-Mailer: dtmail 1.3.0 CDE Version 1.3 SunOS 5.7 sun4m sparc

I am curious as to what bounces I shall get when the queue is run. I shall
report if there is anything interesting.

OK, the queue was run as I typed this, and my sendmail reported

501 <=?ISO-8859-1?Q?j=FCrgen?=(_at_)=?ISO-8859-1?Q?tu-m=FCnchen?=(_dot_)de>: domain missing or malformed

which seems fair enough. I am just surprised that it got as far as it did.
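For anyone who wants to check what address those encoded-words actually
stand for, here is a minimal sketch in Python using the standard
email.header module. The helper name decode_words is my own invention, and
the archive's "(_at_)" and "(_dot_)" have been written back as "@" and "."
so that the string is a usable addr-spec. It only illustrates the decoding
step; it is not what Dtmail or sendmail do internally.

```python
from email.header import decode_header


def decode_words(text):
    """Decode any RFC 2047 encoded-words in text and return a str.

    Hypothetical helper for illustration only.
    """
    parts = []
    for chunk, charset in decode_header(text):
        # decode_header yields bytes when encoded-words were found,
        # and a plain str when there were none.
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii")
        parts.append(chunk)
    return "".join(parts)


# The recipient as it sat in the queue, with the archive's
# address obfuscation undone (an assumption about the original text).
queued = "=?ISO-8859-1?Q?j=FCrgen?=@=?ISO-8859-1?Q?tu-m=FCnchen?=.de"
print(decode_words(queued))
```

Note that RFC 2047 itself does not permit encoded-words inside an
addr-spec, which is why the 501 response above is reasonable.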

From this, one might conclude that simply extending rfc2047 might be the 
easiest upgrade path ;-)

It might well be a useful notation for the interface between the User
Agent and the MTA (but not on any wire).

I think that the local-part is the smallest of the problems. local-parts are 
uninterpreted and can thus carry any encoding that would be a valid 
local-part in rfc2822, e.g. rfc2047. The burden to make sense out of this is 
on the receiving end: If someone allows mailboxes with non-US-ASCII 
characters on her machine, then she needs to make sure that she installs 
tools that can handle the new encoding. For everyone else, the local-part 
just looks funny.

The main problem is therefore the domain part. I must admit that I haven't yet 
read the IDN/punycode draft(s?), but what is the problem with using that for 
the domain?

The chief problem with using IDNA in any User Agent is the Nameprep
operation (see my message to this list a couple of days ago). It requires
the presence of large tables of Unicode transformations and forbidden
characters. These tables may well change over time as Unicode evolves. I
just don't trust implementors of User Agents to get it right (and
especially I don't trust Bill Gates to get it right). There is a much
better chance of getting it right if it is done in the transport agents.
For example, if sendmail was REQUIRED by some standard to translate any
name-addr containing RFC 2047 encodings into IDNA, then my message to
Jürgen above might well get delivered.
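To make that suggestion concrete, here is a rough sketch in Python of the
translation a transport agent might perform. It is only an illustration
under stated assumptions: the function names are hypothetical, the split on
the last "@" is naive, and Python's built-in "idna" codec (which applies
Nameprep and the ASCII-compatible encoding as later published in RFC 3490)
stands in for whatever the final IDNA documents require. The local-part is
deliberately left untouched, since only the receiving end can interpret it.

```python
from email.header import decode_header


def decode_words(text):
    """Decode RFC 2047 encoded-words in text (hypothetical helper)."""
    parts = []
    for chunk, charset in decode_header(text):
        # bytes when encoded-words were present, str otherwise
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii")
        parts.append(chunk)
    return "".join(parts)


def domain_to_idna(addr_spec):
    """Rewrite only the domain of addr_spec into its IDNA ASCII form.

    Assumes the last "@" separates local-part from domain, which is a
    simplification of the real RFC 2822 grammar.
    """
    local, at, domain = addr_spec.rpartition("@")
    if not at:
        raise ValueError("no domain in address: %r" % addr_spec)
    unicode_domain = decode_words(domain)
    # The "idna" codec runs Nameprep on each label, then encodes it.
    ascii_domain = unicode_domain.encode("idna").decode("ascii")
    return local + "@" + ascii_domain


queued = "=?ISO-8859-1?Q?j=FCrgen?=@=?ISO-8859-1?Q?tu-m=FCnchen?=.de"
print(domain_to_idna(queued))
```

The point of doing this in the MTA is that the Nameprep tables then live in
one place per site rather than in every User Agent.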

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl(_at_)clw(_dot_)cs(_dot_)man(_dot_)ac(_dot_)uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5
