perl-unicode

Re: perlunitut - feedback appreciated

2001-11-12 00:08:28
On Sun, 11 Nov 2001 12:57:27 -0800, in perl.unicode you wrote:

> ISO Latin-1 characters encoded as 10-FF in single bytes are not Unicode.

Hm? ISO Latin-1 characters from 00 to 7F encoded in single bytes
represent the same Unicode characters whether those bytes are read as
Latin-1 or as UTF-8, simply because ASCII is a subset of both Latin-1
and UTF-8. 00 to 7F is that common subset.
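
A quick way to check the claim above, as a rough sketch in Python (chosen
here only because its built-in "latin-1" and "utf-8" codecs make the
comparison short; the helper name decodes_same is made up for this
example). It decodes every possible single byte both ways and keeps the
ones that give the same character:

```python
def decodes_same(b):
    """True if the single byte b yields the same character as Latin-1 and as UTF-8."""
    raw = bytes([b])
    try:
        return raw.decode("latin-1") == raw.decode("utf-8")
    except UnicodeDecodeError:
        # A lone byte in 0x80..0xFF is never a complete UTF-8 sequence.
        return False

# Collect every byte value on which the two interpretations agree.
same = [b for b in range(256) if decodes_same(b)]
print(hex(same[0]), hex(same[-1]), len(same))  # 0x0 0x7f 128
```

Only 00 to 7F survive; every lone byte from 80 to FF is rejected by the
UTF-8 decoder.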

> There is no Unicode transformation format or other encoding that permits
> this. The code point range is actually x000010-x0000FF, and the encodings
> are
>
> 0000000010000000  0000000011111111 UTF-16 Big Endian

The first number in your UTF-16 line is 0x80, not 0x10. If you meant
0x80 .. 0xFF, then I agree with you.
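
To double-check the bit patterns, here is a minimal sketch in Python (the
helper name utf16be_bits is invented for this example; "utf-16-be" is the
standard codec name). It prints the UTF-16 Big Endian encoding of a code
point as a bit string, so the quoted values can be compared directly:

```python
def utf16be_bits(cp):
    """Return the UTF-16BE encoding of code point cp as a string of bits."""
    data = chr(cp).encode("utf-16-be")
    return "".join(f"{byte:08b}" for byte in data)

print(utf16be_bits(0x80))  # 0000000010000000
print(utf16be_bits(0xFF))  # 0000000011111111
print(utf16be_bits(0x10))  # 0000000000010000
```

The first two lines match the quoted bit patterns, so the range shown
there starts at 0x80; 0x10 would have given the third pattern instead.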

Cheers,
Philip
