ietf-822

Re: Dual names, IDN and ASCII, in e-mail addresses?

2003-10-07 20:55:18

Adam M. Costello wrote:

> It would be no good to have language tags be part of the identifier;
> we wouldn't want the very same string of characters to refer to two
> different mailboxes depending on whether it was tagged as Chinese or
> Japanese.  Language tags would have to be markup around the identifier,
> which would have to be done at a higher layer than IDNA (for example, by
> extending the message header syntax).

After some reflection, I believe I disagree.  The string transmitted via
DNS is a 'name' (in the sense used in RFCs 1958 and 2277), and need not
(should not) be language-tagged -- it is not 'text' in the sense used in
RFC 2277.  There is no need for language identification at the level of
DNS; the DNS 'name' is simply an LDH tag that serves as the key for a
database lookup.  That is, there should *not* be language-tagging markup
around the LDH 'name', since natural language is irrelevant to the DNS
protocol.

But as the idea behind IDN is to provide a displayable bit of 'text',
i.e. some string of characters in some charset (UTF-8) and presumably in
some language and intended for human consumption, it would be appropriate
for the displayable form to include a language tag. [Of course, there are
portability issues with the Unicode 3.1 language-tagging codes (in
addition to the fact that they are non-standard).]  Such a tag should not
be separate from the IDN; it should be part of the IDN so that it travels
with and is processed with the text which is presented to a human (using a
client with appropriate IDN support). One wouldn't want the language tag to
be stripped or otherwise mangled in transit.

Presumably the holder of a Japanese-language domain name would wish the
displayed text to be identifiable *as* a Japanese name to users of
IDN-capable client software (w/ or w/o text-to-speech capability), and
would almost certainly not wish to tag it as Chinese. Also, if something
like the Unicode 3.1 scheme were used (with suitable changes to nameprep),
the DNS-compatible form for Japanese- and Chinese-tagged text would lead
to different strings of LDH characters, since nameprep+punycode would be
applied to the tagged text, which would differ by virtue of the different
language tags.  [But perhaps I misunderstand what you mean by "the very
same string of characters".]  Also, I see no reason why the same text
sequence in different languages should not encode to different DNS names,
just as "boot" in German and "boot" in English refer to two very different
things (indeed, there are differences even between en-us and en-gb) -- in
fact it seems highly desirable that they *should* encode to different DNS
names.
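
To make the point about differing LDH strings concrete, here is a rough
Python sketch.  It is illustrative only and rests on assumptions: it uses
the Unicode 3.1 tag characters (U+E0001 followed by tag letters), and it
skips nameprep altogether, since nameprep as currently specified prohibits
those characters (hence the "suitable changes" mentioned above).  The
helper names tag_text and to_ace are hypothetical.

```python
LANGUAGE_TAG = 0xE0001   # U+E0001 LANGUAGE TAG
TAG_BASE = 0xE0000       # tag characters mirror ASCII at U+E0020..U+E007E


def tag_text(lang: str, text: str) -> str:
    """Prefix text with a Unicode 3.1 language tag for an ASCII lang code."""
    tag = chr(LANGUAGE_TAG) + "".join(chr(TAG_BASE + ord(c)) for c in lang)
    return tag + text


def to_ace(tagged: str) -> str:
    """Punycode-encode one label and add the IDNA "xn--" prefix.

    Assumption: nameprep is not applied here, because the current profile
    would reject the tag characters.
    """
    return "xn--" + tagged.encode("punycode").decode("ascii")


# The same two Han characters, usable in both Japanese and Chinese text.
name = "\u6771\u4eac"

ace_ja = to_ace(tag_text("ja", name))
ace_zh = to_ace(tag_text("zh", name))

print(ace_ja)
print(ace_zh)
print(ace_ja != ace_zh)
```

The two printed labels consist only of LDH characters and differ from one
another, because the tag characters are part of the input to Punycode.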