Bruce Lilly <blilly(_at_)verizon(_dot_)net> wrote:
> The string transmitted via DNS is a 'name' (in the sense used in RFCs
> 1958 and 2277), and need not (should not) be language-tagged -- it is
> not 'text' in the sense used in RFC 2277.

Agreed.

> But as the idea behind IDN is to provide a displayable bit of 'text',
> i.e. some string of characters in some charset (utf-8) and presumably
> in some language and intended for human consumption, it would be
> appropriate for the displayable form to include a language tag.

IDNs are not designed to have different semantics from ASCII domain
names ('text' versus 'name'); they are designed to have the same
semantics and merely support a larger character set.

> Such a tag should not be separate from the IDN; it should be part of
> the IDN so that it travels with and is processed with the text which
> is presented to a human (using a client with appropriate IDN support).
> One wouldn't want the language tag to be stripped or otherwise mangled
> in transit.

But the "transit" for domain names is often the telephone, billboards,
etc. It's hard to see how invisible language tags would survive such
transit.

> I see no reason why some text sequence in different languages should
> not encode to different DNS names, just as "boot" in German and
> "boot" in English refer to two very different things (indeed, there
> are differences between en-us and en-uk) -- in fact it seems highly
> desirable that they *should* encode to different DNS names.

So josé.com and josé.com should be two different domain names? (One is
Spanish and the other is Portuguese.) If I see josé.com on paper, how
do I know which of those two domains it is? And even if I know, how do
I type the language tag into my browser?

This is why I think language tagging, if used at all, would need to be
non-essential markup, which could be retained when feasible, but could
also be lost with no worse result than a degradation in the quality of
presentation, not a failure to find the domain.
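
To make the josé.com point concrete, here is a minimal sketch (an
illustration only, not part of any proposal in this thread) using
Python's built-in "idna" codec, which implements the IDNA 2003 ToASCII
conversion. The helper name to_ace is hypothetical. The conversion
takes only the characters as input, so there is nowhere for a language
to influence the result:

```python
def to_ace(name: str) -> str:
    """Return the ASCII-compatible (ACE) form of a domain name.

    Uses Python's built-in "idna" codec (IDNA 2003). Note that the
    function has no language parameter: only code points go in.
    """
    return name.encode("idna").decode("ascii")


# The same characters, as read by a Spanish or a Portuguese speaker.
spanish = "josé.com"
portuguese = "josé.com"

# Identical characters give an identical DNS name.
assert to_ace(spanish) == to_ace(portuguese)
print(to_ace(spanish))        # xn--jos-dma.com

# A plain ASCII name passes through unchanged.
print(to_ace("example.com"))  # example.com
```

For the two readings to resolve to different names, the language would
have to be supplied as extra input alongside the characters, which is
exactly what a reader looking at a billboard does not have.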
AMC