Charles Lindsey <chl(_at_)clw(_dot_)cs(_dot_)man(_dot_)ac(_dot_)uk> writes:
> Keith Moore <moore(_at_)cs(_dot_)utk(_dot_)edu> writes:
>> which is why you can't expect all Unicode to be normalized, because it
>> may have been translated from another charset.
> Rubbish. Read what I wrote.
> I was talking of code C -> *normalized* Unicode -> code C.
> So the Unicode in question was normalized by definition. And the
> transformation from C to normalized Unicode is NOT invertible (in
> general).
Which, as Keith said, is why people don't do that transformation and
instead transform code C into unnormalized Unicode so that they can later
translate back into code C if they want to.
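
To make the round-trip point concrete, here is a minimal sketch in Python.
It is only an illustration: the two-character "code C" table and the
decode/encode helpers are hypothetical, and NFC stands in for whichever
normalization form is meant. The only facts it relies on are that ANGSTROM
SIGN (U+212B) and A WITH RING ABOVE (U+00C5) are canonically equivalent,
and that NFC maps the former to the latter.

```python
import unicodedata

# Hypothetical legacy charset "C" (an assumption for illustration only):
# two distinct byte values whose Unicode counterparts are canonically
# equivalent to each other.
TOY_CHARSET = {
    0x01: "\u00C5",  # LATIN CAPITAL LETTER A WITH RING ABOVE
    0x02: "\u212B",  # ANGSTROM SIGN
}


def decode(data, normalize):
    """Map bytes in code C to Unicode, optionally applying NFC."""
    text = "".join(TOY_CHARSET[b] for b in data)
    return unicodedata.normalize("NFC", text) if normalize else text


def encode(text):
    """Map Unicode back to code C using the reversed table."""
    reverse = {ch: b for b, ch in TOY_CHARSET.items()}
    return bytes(reverse[ch] for ch in text)


original = bytes([0x01, 0x02])

# code C -> unnormalized Unicode -> code C: the bytes survive.
raw_round_trip = encode(decode(original, normalize=False))

# code C -> normalized Unicode -> code C: both characters collapse to
# U+00C5, so the distinction between 0x01 and 0x02 is lost.
normalized_round_trip = encode(decode(original, normalize=True))

print(raw_round_trip == original)
print(normalized_round_trip == original)
```

The unnormalized path gives back the original bytes; the normalized path
cannot, because two inputs now share one output.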
--
Russ Allbery (rra(_at_)stanford(_dot_)edu)
<http://www.eyrie.org/~eagle/>