In <200204302032(_dot_)18912(_at_)sendmail(_dot_)mutz(_dot_)com> Marc Mutz
<mutz(_at_)kde(_dot_)org> writes:
To: Jürgen Schmidt <jürgen(_at_)tu-münchen(_dot_)de>
It might be interesting to know how existing mailers handle this. Typing the
above into the TO: fields of various mailers gives:
<Interesting examples snipped>
Dtmail (as supplied with Solaris 7):
To: =?ISO-8859-1?Q?J=FCrgen?= Schmidt
<=?ISO-8859-1?Q?j=FCrgen?=(_at_)=?ISO-8859-1?Q?tu-m=FCnchen?=(_dot_)de>
But, to my surprise, it then sent it to localhost via SMTP, and localhost
invoked sendmail, which accepted it without demur and queued it for later
delivery. Here are some interesting extracts from the queue:
RPFD:<=?ISO-8859-1?Q?j=FCrgen?=(_at_)=?ISO-8859-1?Q?tu-m=FCnchen?=(_dot_)de>
HReceived: from localhost (localhost [127.0.0.1])
        by clw.cs.man.ac.uk (8.9.1b+Sun/8.9.1) with SMTP id KAA16419
        for <=?ISO-8859-1?Q?j=FCrgen?=(_at_)=?ISO-8859-1?Q?tu-m=FCnchen?=(_dot_)de>;
        Wed, 1 May 2002 10:08:29 +0100 (BST)
HX-Mailer: dtmail 1.3.0 CDE Version 1.3 SunOS 5.7 sun4m sparc
I am curious as to what bounces I shall get when the queue is run. I shall
report if there is anything interesting.
OK, the queue was run as I typed this, and my sendmail reported
501 <=?ISO-8859-1?Q?j=FCrgen?=(_at_)=?ISO-8859-1?Q?tu-m=FCnchen?=(_dot_)de>:
    domain missing or malformed
which seems fair enough. I am just surprised that it got as far as it did.
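To illustrate why that rejection is reasonable, here is a minimal sketch of a letters-digits-hyphen check on a domain. It is not sendmail's actual test, which I have not inspected; the function name is_ldh_domain and the length limits are my own assumptions, chosen to match the usual hostname rules.

```python
import re

# One DNS label: letters, digits and hyphens only, 1 to 63 characters,
# with no hyphen at either end. (Assumed rule set, not sendmail's own.)
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_ldh_domain(domain: str) -> bool:
    """Return True if every dot-separated label is a plain ASCII LDH label."""
    if not domain or len(domain) > 253:
        return False
    return all(_LABEL.fullmatch(label) for label in domain.split("."))


print(is_ldh_domain("clw.cs.man.ac.uk"))                  # True
print(is_ldh_domain("=?ISO-8859-1?Q?tu-m=FCnchen?=.de"))  # False
```

The "=" and "?" characters of the encoded-word are enough to make the first label fail, so any check along these lines would refuse the address.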
From this, one might conclude that simply extending rfc2047 might be the
easiest upgrade path ;-)
It might well be a useful notation for the interface between the User
Agent and the MTA (but not on any wire).
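As a rough illustration of how cheaply the receiving side of such an interface could recover the original text, the sketch below decodes the display name that Dtmail produced, using Python's standard email.header module. The helper name decode_rfc2047 is hypothetical, and the fallback to ASCII for unlabelled chunks is an assumption.

```python
from email.header import decode_header


def decode_rfc2047(value: str) -> str:
    """Decode any RFC 2047 encoded-words in value and return plain text."""
    parts = []
    for chunk, charset in decode_header(value):
        # decode_header yields bytes for decoded chunks, str for untouched input
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii")  # assumed fallback
        parts.append(chunk)
    return "".join(parts)


print(decode_rfc2047("=?ISO-8859-1?Q?J=FCrgen?= Schmidt"))
```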
I think that the local-part is the smallest of the problems. local-parts are
uninterpreted and can thus carry any encoding that would be a valid
local-part in rfc2822, e.g. rfc2047. The burden to make sense out of this is
on the receiving end: If someone allows mailboxes with non-US-ASCII
characters on her machine, then she needs to make sure that she installs
tools that can handle the new encoding. For everyone else, the local-part
just looks funny.
The main problem is therefore the domain part. I must admit that I haven't yet
read the IDN/punycode draft(s?), but what is the problem with using that for
the domain?
The chief problem with using IDNA in any User Agent is the Nameprep
operation (see my message to this list a couple of days ago). It requires
the presence of large tables of Unicode transformations and forbidden
characters. These tables may well change over time as Unicode evolves. I
just don't trust implementors of User Agents to get it right (and
especially I don't trust Bill Gates to get it right). There is a much
better chance of getting it right if it is done in the transport agents.
For example, if sendmail was REQUIRED by some standard to translate any
name-addr containing RFC 2047 encodings into IDNA, then my message to
Jürgen above might well get delivered.
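A minimal sketch of such a translation, for the domain only, might look like the following. It assumes Python's built-in "idna" codec, which applies Nameprep internally, and it uses a simplified pattern for spotting encoded-words; the names domain_to_idna and _ENCODED_WORD are hypothetical, and nothing here reflects what sendmail does today.

```python
import re
from email.header import decode_header

# Simplified encoded-word pattern: =?charset?B-or-Q?text?=  (an assumption,
# looser than the full RFC 2047 grammar).
_ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[BbQq]\?[^?\s]*\?=")


def _decode_word(match: "re.Match[str]") -> str:
    """Turn one encoded-word into the Unicode text it stands for."""
    raw, charset = decode_header(match.group(0))[0]
    return raw.decode(charset) if isinstance(raw, bytes) else raw


def domain_to_idna(domain: str) -> str:
    """Replace RFC 2047 encoded-words in a domain, then convert it to ASCII."""
    unicode_domain = _ENCODED_WORD.sub(_decode_word, domain)
    # The codec runs Nameprep on each non-ASCII label before Punycode.
    return unicode_domain.encode("idna").decode("ascii")


ascii_domain = domain_to_idna("=?ISO-8859-1?Q?tu-m=FCnchen?=.de")
print(ascii_domain)
print(ascii_domain.encode("ascii").decode("idna"))  # back to the Unicode form
```

Doing this once in the transport agent means the Nameprep tables live in one place per host rather than in every User Agent.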
--
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133 Web: http://www.cs.man.ac.uk/~chl
Email: chl(_at_)clw(_dot_)cs(_dot_)man(_dot_)ac(_dot_)uk
Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9 Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5