RE: Last Call: draft-klensin-unicode-escapes (ASCII Escaping of Unicode Characters) to BCP

From: Stephane Bortzmeyer [mailto:bortzmeyer(_at_)nic(_dot_)fr]
Sent: Monday, October 22, 2007 4:03 AM

Also, "a further encoding of the encoding form" isn't going to be
clear to readers.


It is a reference to a bad practice (used in URLs, for instance) to
encode twice (for instance in UTF-8, then in %xx escapes of the
bytes).


The discussion in that section is about references to characters in general 
human-readable content, not in URLs. If that is what the wording is referring 
to, it's extremely opaque. If that's really what the authors intend to talk 
about, it should be explained -- and the section should be organized better so 
that it makes sense why that particular thing is being discussed.

  "However, when information about characters is to be processed by
  people, reference to the Unicode code point is preferable to
  encoded representations of the code point."


That's not more clear to me.


How can it not be clear? Human-readable content that discusses a Unicode
character needs to refer to that character in some way, and the whole point
of this document is how to make such references. Since Unicode character
identity is established by the name, the code point and the reference glyph,
a reference can be made using any one of those three things. It appears to
me that this document focuses on references based in some way on the code
point: is the key distinction not between the code point itself and some
encoded representation of the code point?
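
To make the distinction concrete, here is a minimal sketch in Python (not
taken from the draft) that shows one character referenced three ways: by its
code point, by its UTF-8 encoding form, and by the %xx escaping of those
UTF-8 bytes, as is done in URLs. The helper names code_point_ref, utf8_hex
and percent_escaped are hypothetical, chosen only for this illustration.

```python
from urllib.parse import quote


def code_point_ref(ch: str) -> str:
    """Reference the character by its code point, in U+XXXX notation."""
    return f"U+{ord(ch):04X}"


def utf8_hex(ch: str) -> str:
    """Show the UTF-8 encoding form of the character as hex bytes."""
    return " ".join(f"{b:02X}" for b in ch.encode("utf-8"))


def percent_escaped(ch: str) -> str:
    """Escape the UTF-8 bytes as %xx: an encoding of the encoding form."""
    return quote(ch, safe="")


ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE
print(code_point_ref(ch))   # U+00E9
print(utf8_hex(ch))         # C3 A9
print(percent_escaped(ch))  # %C3%A9
```

Only the first form names the code point directly. The other two can be
mapped back to it only by a reader who knows which encoding was applied, and
in the last case that two layers were applied.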



Peter Constable

