Mark,
(Another jibe, citing the fact that utf-8 is, itself, a modification to "raw"
unicode is probably worth repeating, here.)
MD> When Unicode is expressed as a series of bytes, there are a number of equally
MD> valid encoding schemes (aka serializations). UTF-8 is one of those schemes, and
MD> is no more or less a "modification", and no more or less "Unicode" than any
MD> other of these schemes.
That's right. It is an "encoding". Raw Unicode takes more than 8 bits. Lots
more. UTF-8 is a method of encoding those raw bits into a non-raw form.
So is the ACE approach.
My point was that folks tend to talk about UTF-8 as if it were the raw
representation, rather than a derivative encoding. In fact, UTF-8 is exactly
parallel to the ACE approach.
It might be a more efficient encoding, but it is no more "native" or "direct"
or "raw" than ACE.
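
To make the parallel concrete, here is a minimal sketch in Python. It assumes
Punycode as a stand-in for "the ACE approach" (this thread does not name a
specific ACE), and the sample string is only an illustration. Both codecs start
from the same list of code points and both decode back to it; neither output is
the code points themselves.

```python
# Illustrative sketch: Punycode is used here as one example of an ACE.
# It is an assumption, not necessarily the ACE under discussion.

# "bücher": six code points, one of them (U+00FC) outside ASCII.
text = "b\u00fccher"

# The "raw" form: a sequence of integers, independent of any serialization.
code_points = [ord(ch) for ch in text]

# Two different encodings of those same code points.
utf8_bytes = text.encode("utf-8")     # multi-byte sequence for U+00FC
ace_bytes = text.encode("punycode")   # ASCII-only output

# Both are reversible: each decodes back to the identical code points.
assert utf8_bytes.decode("utf-8") == text
assert ace_bytes.decode("punycode") == text

print(code_points)
print(utf8_bytes, len(utf8_bytes))
print(ace_bytes, len(ace_bytes))
```

For this string the UTF-8 form happens to be shorter than the ACE form, which
is the "more efficient" part. The round trip works the same way for both.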
d/
--
Dave Crocker <dcrocker-at-brandenburg-dot-com>
Brandenburg InternetWorking <www.brandenburg.com>
Sunnyvale, CA USA <tel:+1.408.246.8253>