Re: UTF8 vs. Punycode

2007-08-14 07:56:09

At 11:30 AM +0200 8/14/07, Simon Josefsson wrote:
> One risk is that the specification cannot use Unicode code points from a
> newer Unicode version than IDNA ToASCII supports; right now that means
> Unicode 3.2.

That is not necessarily true. The current version of IDNA supports Unicode version 3.2. A future version of IDNA may support later versions of Unicode.
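
To make the Unicode 3.2 limit concrete, here is a minimal Python sketch of how one might find the code points in a string that Unicode 3.2 leaves unassigned. It relies on the standard-library `stringprep` module, whose table A.1 lookup is pinned to the Unicode 3.2 database; the helper name `unassigned_in_unicode_3_2` is made up for this illustration and is not part of any specification.

```python
import stringprep


def unassigned_in_unicode_3_2(text):
    """Return the characters of `text` that are unassigned in Unicode 3.2.

    stringprep.in_table_a1() implements RFC 3454 table A.1, which lists
    the code points that Unicode 3.2 does not assign.
    """
    return [ch for ch in text if stringprep.in_table_a1(ch)]


# U+07CA (NKO LETTER A) belongs to a script added in Unicode 5.0,
# so a Unicode 3.2 based profile treats it as unassigned.
print(unassigned_in_unicode_3_2("abc"))
print(unassigned_in_unicode_3_2("a\u07ca"))
```

A check like this only reports what the 3.2 tables say; whether a later IDNA revision accepts such code points is exactly the open question discussed here.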

> For some time now, we have had Unicode 5.0, which includes many
> important code points for a variety of languages.

Not to get into a flame war here, but I think that "important" is a gross overstatement. There are a few minor scripts and personal-name characters that are not included in Unicode 3.2, but there has been essentially no public pressure on the IETF to update IDNA to include them.

> Newer versions of
> Unicode will be released in the future.  Having to update this
> specification for every IDNA/ToUnicode release seems sub-optimal to me.

Fully agree. It is the responsibility of the IETF to make sure that is not needed.

> I believe it is better to teach protocols how to deal with non-ASCII
> data, rather than relying on IDNA idiosyncrasies in every IETF protocol.

We disagree here, particularly for security protocols.

> The choice to remain with ASCII has been made for the DNS protocol,
> where it makes some sense due to backwards compatibility reasons, but
> that does not mean we have to make the same choice in every IETF

This makes no sense here. The protocol in question is representing email addresses. The right side of an email address is a domain name.

> Some IETF protocols can easily negotiate support for UTF-8 on
> both sides, and using UTF-8 rather than Punycode seems more robust and
> like better engineering to me.

Un-normalized UTF-8 fails miserably when exact matching is needed, as it is in IBE.
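
As a rough illustration of the matching problem, the Python sketch below compares two spellings of the same domain name: one using the precomposed U+00E9 and one using "e" followed by the combining accent U+0301. The domain `café.example` and the helper `to_ascii` are invented for this example. The sketch assumes Python's built-in `idna` codec, which implements the IDNA ToASCII operation of RFC 3490, including Nameprep normalization.

```python
import unicodedata

# Two visually identical spellings of the same (hypothetical) domain.
composed = "caf\u00e9.example"      # precomposed e-acute
decomposed = "cafe\u0301.example"   # "e" + COMBINING ACUTE ACCENT


def to_ascii(domain):
    """Apply IDNA ToASCII (RFC 3490) to each label via the stdlib codec."""
    return domain.encode("idna")


# The raw UTF-8 byte strings differ, so a byte-for-byte match fails.
print(composed.encode("utf-8") == decomposed.encode("utf-8"))

# After explicit normalization (NFC here) the two strings compare equal.
print(unicodedata.normalize("NFC", composed)
      == unicodedata.normalize("NFC", decomposed))

# ToASCII normalizes as part of Nameprep, so both spellings
# produce the same ASCII form.
print(to_ascii(composed) == to_ascii(decomposed))
```

Either approach can give a stable string to match on; the difference is that ToASCII builds the normalization step in, while a raw UTF-8 protocol has to require it separately.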

--Paul Hoffman, Director
--VPN Consortium