From: Stephane Bortzmeyer [mailto:bortzmeyer(_at_)nic(_dot_)fr]
Sent: Monday, October 22, 2007 4:03 AM
Also, "a further encoding of the encoding form" isn't going to be
clear to readers.
It is a reference to a bad practice (used in URLs, for instance) to
encode twice (for instance in UTF-8, then in %xx escapes of the
bytes).
The discussion in that section is about references to characters in general
human-readable content, not in URLs. If that is what the wording is referring
to, it's extremely opaque. If that's really what the authors intend to talk
about, it should be explained -- and the section should be organized better so
that it makes sense why that particular thing is being discussed.
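
For concreteness, the double encoding described above (UTF-8 first, then %xx
escapes of the resulting bytes) can be sketched in Python roughly as follows.
This is only an illustration: the sample character and the variable names are
assumptions chosen for the example, not anything taken from the draft.

```python
from urllib.parse import quote, unquote

# Sample character, chosen only for illustration.
ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# First encoding: the character becomes a sequence of UTF-8 bytes.
utf8_bytes = ch.encode("utf-8")

# Second encoding: each UTF-8 byte is written as a %xx escape,
# as is done in URLs. quote() uses UTF-8 by default.
escaped = quote(ch, safe="")

# The plain code point reference, for comparison.
code_point = f"U+{ord(ch):04X}"

print(utf8_bytes)   # the two UTF-8 bytes
print(escaped)      # the percent-escaped form of those bytes
print(code_point)   # the code point itself

# Decoding reverses both steps and recovers the original character.
assert unquote(escaped) == ch
```

A reader who sees only the escaped form has to undo two layers to find out
which character is meant; the code point identifies it directly.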
"However, when information about characters is to be processed by
people, reference to the Unicode code point is preferable to
encoded representations of the code point."
That's not more clear to me.
How can it not be clear? Human-readable content is discussing a Unicode
character and needs to refer to the character in some way. The whole point of
this document is about how to refer. Since Unicode character identity is
established by the name, the code point and the reference glyph, reference can
be made using one of those three things. It appears to me that this document
focuses on references based in some way on the code point. Isn't the key
distinction, then, the one between the code point itself and some encoded
representation of the code point?
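
As a rough illustration of that distinction, the Python sketch below shows one
character referred to by its name, by its code point, and by two encoded
representations of that code point. The sample character and the helper name
hex_bytes are assumptions made for the example only.

```python
import unicodedata

def hex_bytes(data: bytes) -> str:
    """Format a byte string as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)

# Sample character, chosen only for illustration.
ch = "\u20ac"

# Two of the three things that establish character identity
# (the reference glyph cannot be shown in code).
name = unicodedata.name(ch)
code_point = f"U+{ord(ch):04X}"

# Encoded representations of the code point: these depend on the
# encoding form, whereas the code point does not.
utf8_form = hex_bytes(ch.encode("utf-8"))
utf16_form = hex_bytes(ch.encode("utf-16-be"))

print(name)
print(code_point)
print(utf8_form)
print(utf16_form)
```

The name and the code point stay the same whatever encoding is in use, while
the byte sequences differ between UTF-8 and UTF-16.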
Peter Constable