--On 2002-04-04 10.31 -0600 "Eric A. Hall" <ehall(_at_)ehsco(_dot_)com> wrote:
For binary fields (ASN.1 and DNS labels) we can use some binary form
(like Punycode without the last "base-32").
Why would you do that?
To get as many characters as possible into the field.
As I said, to use the most efficient encoding possible.
I see two different paths:
A Use one encoding for a specific "thing", and the encoding
has to be whatever is needed by the protocol which uses
this "thing" and has the fewest features.
B Use different encodings for the same "thing" in different
protocols, and possibly use the same encoding within the
same protocol for all "things".
IDNA is [A] where the encoding of Unicode is the minimal possible (only LDH
characters) but the same for domain names in all protocols. This minimizes
the leakage (note, I am not saying it goes away).
A different solution is path [B], and if that is chosen, UTF-8 is definitely
not the most efficient encoding of Unicode in a binary field which has a
length specifier, like a label in the DNS protocol.
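
As a rough illustration of the size argument, the sketch below counts the
octets one label needs under a few encodings. The sample label, the helper
name encoded_sizes, and the choice of UTF-16 as a comparison point are
assumptions made for this example only. Python's built-in "punycode" codec
produces the ASCII form, not the binary variant mentioned above, so it
is shown only for reference.

```python
# Sketch: count the octets a single label occupies under several encodings.
# The label and the set of encodings compared are illustrative assumptions.

DNS_LABEL_MAX = 63  # maximum length of one DNS label, in octets


def encoded_sizes(label: str) -> dict[str, int]:
    """Return the encoded length in octets of `label` for each encoding."""
    return {
        "utf-8": len(label.encode("utf-8")),
        "utf-16-be": len(label.encode("utf-16-be")),
        # ASCII Punycode as produced by Python's codec (no "xn--" prefix).
        "punycode": len(label.encode("punycode")),
    }


if __name__ == "__main__":
    label = "日本語"  # three CJK characters, chosen only as an example
    for name, size in encoded_sizes(label).items():
        print(f"{name}: {size} octets (limit {DNS_LABEL_MAX})")
```

For this label UTF-8 needs 9 octets where UTF-16 needs 6, so even an
off-the-shelf encoding can fit more such characters into a length-limited
field. This does not show which encoding is optimal, only that UTF-8 is
not always the smallest.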
paf