On Wed, 13 Sep 2000, Matt Sergeant wrote:
> Until someone extends the Unicode character set beyond the current range,

This has "already" happened. Have a look at
http://www.unicode.org/unicode/alloc/Pipeline.html , the Unicode
allocation pipeline of proposed new characters and scripts. It lists
quite a few scripts beyond U+FFFF, several of which are "Accepted" by the
Unicode Technical Committee (some as much as three years ago) and some in
various stages of the ISO pipeline. While it may take a while before these
become canon (and some may get thrown out along the way), it's not as if
everything after U+FFFF is empty as far as the eye can see, with nothing
on the horizon.

> UCS-2 and UTF-16 currently have a one-to-one mapping. I assume that's the
> point being made. An excerpt from the book I'm currently tech reviewing:
>
>   Nonetheless Unicode does provide a means of representing code points
>   beyond 65,535 by recognizing certain two-byte sequences as half of a
>   surrogate pair. A Unicode document that uses UCS-2 plus surrogate
>   pairs is said to be in the UTF-16 encoding. Since no software
>   currently supports or produces surrogate pairs, and since no scripts

I'll grant you that, especially considering the sort of things proposed
for Plane 1 (Deseret Alphabet, Musical Symbols, etc.). And, of course, not
many people will be taking advantage of these code points since they're
not finalised.

>   are encoded in Unicode with code points above 65,535 the
>   distinction between UCS-2 and UTF-16 is mostly academic at this
>   point in time.

At this point in time, yes. I suppose I just wanted to point out that this
*may* change, at some unspecified (and maybe even distant) point in the
future.
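
To make the surrogate mechanism from the excerpt a bit more concrete, here
is a minimal Python sketch of the pairing arithmetic as I understand it
from the UTF-16 definition. The function names are my own, made up purely
for illustration, and this is not taken from any particular library:

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into a (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF need a surrogate pair")
    offset = code_point - 0x10000        # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits -> U+D800..U+DBFF
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits -> U+DC00..U+DFFF
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into a single code point."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("high surrogate out of range")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("low surrogate out of range")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


if __name__ == "__main__":
    high, low = to_surrogate_pair(0x10000)   # first code point past U+FFFF
    print(hex(high), hex(low))
    print(hex(from_surrogate_pair(high, low)))
```

Anything at or below U+FFFF never goes through this path, which is why
UCS-2 and UTF-16 produce identical output for such text.
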
Cheers,
Philip
--
Philip Newton <newton(_at_)newton(_dot_)digitalspace(_dot_)net>