perl-unicode

Re: Some Encode::TW test results.

2002-02-26 07:11:25
Autrijus Tang <autrijus(_at_)autrijus(_dot_)org> writes:

GB2312(CN) is absolutely broken; it rejects any valid GB input I could muster
(including EUC-CN, HZ and GBK encodings); I suppose the original Tcl map
is broken as well, since it lists itself as a type D (double-byte) mapping,
but in practice it's almost always an M type (multi-byte) encoding, with
0xA1-0xFE as 'lead' bytes. GB12345 is similarly broken.

Tcl's maps were originally mainly for Tcl/Tk font use - so if GB2312
fonts are normally 16-bit encoded that may be why the broken maps are
present.
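
To make the mismatch concrete, here is a minimal Python sketch (the
language and the function name are illustrative choices, not anything
from Encode or Tcl). It assumes the 16-bit font form is the 7-bit
row/cell code (0x2121-0x7E7E) and that EUC-CN is the same code with the
high bit set on both bytes, so a table keyed on the font form would
never match real EUC-CN input.

```python
def euc_cn_to_font_codes(data: bytes) -> list:
    """Split EUC-CN bytes into ASCII values and 16-bit GB2312 font codes.

    Hypothetical helper: a lead byte in 0xA1-0xFE starts a two-byte
    character; clearing the high bit of both bytes gives the 7-bit code.
    """
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if 0xA1 <= b <= 0xFE:
            # Lead byte: a trail byte in the same range must follow.
            if i + 1 >= len(data) or not (0xA1 <= data[i + 1] <= 0xFE):
                raise ValueError("truncated or invalid pair at offset %d" % i)
            out.append(((b & 0x7F) << 8) | (data[i + 1] & 0x7F))
            i += 2
        elif b < 0x80:
            # Plain ASCII passes through as a single byte.
            out.append(b)
            i += 1
        else:
            raise ValueError("unexpected byte 0x%02X at offset %d" % (b, i))
    return out


# Python's built-in "gb2312" codec produces the EUC-CN form.
codes = euc_cn_to_font_codes(b"A" + "\u4e2d".encode("gb2312"))
print([hex(c) for c in codes])
```

A pure type D table would try to read the ASCII "A" as half of a pair,
which is one reason mixed input needs the M type treatment.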


I'll see what I can do to regenerate their maps, either from
http://www.unicode.org/Public/MAPPINGS/ or other official sources.
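
As a rough sketch of what regenerating such a map might involve (in
Python, purely for illustration): the snippet below assumes the source
table uses the usual Unicode mapping-file layout of two hex columns
followed by a "#" comment, and that it is keyed on 7-bit GB2312 codes.
The two sample rows and the function name are illustrative only, not
the actual file contents or any tool used for Encode.

```python
# Illustrative excerpt in the assumed layout: code, Unicode, comment.
SAMPLE = (
    "# sample rows only, not the full table\n"
    "0x2121\t0x3000\t# IDEOGRAPHIC SPACE\n"
    "0x5650\t0x4E2D\t# <CJK>\n"
)


def build_euc_table(text: str) -> dict:
    """Return a dict from 16-bit EUC-CN code to Unicode character."""
    table = {}
    for line in text.splitlines():
        # Drop comments and skip blank lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        code, uni = (int(field, 16) for field in line.split()[:2])
        # Set the high bit of both bytes to move from the 7-bit form
        # to the EUC-CN form (lead and trail bytes in 0xA1-0xFE).
        table[code | 0x8080] = chr(uni)
    return table


table = build_euc_table(SAMPLE)
for euc, ch in sorted(table.items()):
    print("0x%04X -> U+%04X" % (euc, ord(ch)))
```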

Thanks,
/Autrijus/

--
Nick Ing-Simmons
http://www.ni-s.u-net.com/


