perl-unicode

Re: [Encode] UCS/UTF mess and Surrogate Handlings

2002-04-05 09:24:59
Dan Kogai <dankogai(_at_)dan(_dot_)co(_dot_)jp> writes:
> P.S.  Does utf8 support surrogates?  Surrogate pair is definitely the
> ugliest SOB of Unicode but without it, we can't print
> \x{8000}-\x{10ffffff} to the stream....

UTF-8 does not _need_ to support surrogates - it can cover the full
range without them.  What you have to beware of is UTF-8 encoding code
points in the surrogate range, rather than de-surrogating and encoding
the real code point.
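
To make the distinction concrete, here is a minimal sketch in Python (not the Encode implementation, and the function names combine_surrogates and utf8_encode are made up for illustration). It first folds a UTF-16 surrogate pair back into the code point it stands for, and only then encodes that code point as UTF-8, refusing to encode a bare surrogate:

```python
def combine_surrogates(high: int, low: int) -> int:
    """Fold a UTF-16 surrogate pair into the code point it represents."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def utf8_encode(cp: int) -> bytes:
    """Encode a single code point as UTF-8, rejecting surrogates."""
    if cp < 0 or cp > 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= cp <= 0xDFFF:
        # This is the case to beware of: the surrogate must be
        # combined with its partner first, not encoded on its own.
        raise ValueError("surrogates must be de-surrogated before encoding")
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])
```

For example, the pair 0xD801 0xDC00 combines to U+10400, which encodes as the four bytes F0 90 90 80. Encoding each half separately would instead give two three-byte sequences, which is the mistake described above.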

The fixed UCS-2BE works for Tk - but is still a little slower than 
it could be. I suggest we do UTF-16XE properly as XS code. 
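
As a rough illustration of what a UTF-16 encoder for either byte order has to do (a Python sketch only, not the proposed XS code; the name utf16_encode and its big_endian flag are assumptions for this example): code points below U+10000 go out as one 16-bit unit, and anything above is split into a surrogate pair.

```python
import struct


def utf16_encode(text: str, big_endian: bool = True) -> bytes:
    """Encode text as UTF-16 in the requested byte order, without a BOM."""
    fmt = ">H" if big_endian else "<H"
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if 0xD800 <= cp <= 0xDFFF:
            raise ValueError("lone surrogate in input")
        if cp < 0x10000:
            # Fits in a single 16-bit unit.
            out += struct.pack(fmt, cp)
        else:
            # Split into a high and a low surrogate.
            cp -= 0x10000
            out += struct.pack(fmt, 0xD800 | (cp >> 10))
            out += struct.pack(fmt, 0xDC00 | (cp & 0x3FF))
    return bytes(out)
```

The only difference between the two byte orders is the pack format, which is why a single routine parameterised on endianness seems sufficient here.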

-- 
Nick Ing-Simmons
http://www.ni-s.u-net.com/