perl-unicode

Re: ucm/cp???.ucm will be updated

2002-10-18 11:30:04
Autrijus and others,

On Friday, Oct 18, 2002, at 22:21 Asia/Tokyo, Dan Kogai wrote:
[2] http://www.microsoft.com/typography/unicode/cscp.htm
[3] http://www.microsoft.com/typography/unicode/932.txt

[snip]

The URI [2] also has links to other code pages, so I would also like to review them and, if necessary, update them. The 8-bit code pages (CP12??) seem OK, but the other CJK ones (CP9??) need review.

So I did that for 932 (JP), 936 (CN), 949 (KR), and 950 (TW). The new maps generated from http://www.microsoft.com/typography/unicode/9??.txt all seem to pass the roundtrip tests in t/CJKT.t, but 936 and 950 fail in t/at-cn.t and t/at-tw.t.

Aiiiya! It was the fault of my ms2ucm.pl, which forgot to ignore the ";Lead Byte Range" line; that line was mistakenly parsed as part of the mapping. With that fixed, the new mappings look okay and pass all tests.
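For anyone following along, here is a minimal sketch of that kind of fix. It is written in Python for illustration and is not the actual ms2ucm.pl. It assumes mapping lines have two hex columns (code page value, then Unicode code point) and that any line starting with ';' is a comment; the function name parse_mapping and the sample lines are hypothetical, and the real Microsoft tables may be laid out differently.

```python
import re

# Assumed line shape: two hex columns such as "0x8140<TAB>0x3000".
# The real tables may differ; adjust the pattern as needed.
MAPPING_LINE = re.compile(r"^\s*0x([0-9A-Fa-f]+)\s+0x([0-9A-Fa-f]+)")


def parse_mapping(lines):
    """Return {code page value: Unicode code point}, ignoring comments."""
    mapping = {}
    for line in lines:
        # Skip blank lines and ';' comments such as ";Lead Byte Range"
        # before any attempt to read the line as a mapping entry.
        if not line.strip() or line.lstrip().startswith(";"):
            continue
        m = MAPPING_LINE.match(line)
        if m:
            mapping[int(m.group(1), 16)] = int(m.group(2), 16)
    return mapping


# Illustrative input only, not copied from the Microsoft files.
sample = [
    ";Lead Byte Range",
    "0x8140\t0x3000",
    "0x8141\t0x3001",
]
table = parse_mapping(sample)
```

Checking for the comment marker first, rather than relying on the mapping pattern to reject such lines, keeps a comment that happens to contain hex values from slipping into the table.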

Nevertheless, those updated mappings are still subject to review. Should you have any objection, please say so ASAP. Otherwise I will commit the new *.ucm files.

I would like you to review them at the following URL (sorry, the last 't' was missing before):

http://www.dan.co.jp/~dankogai/bleedperl/cp-cjkt/

You can also find the crude script I used for the conversion at

http://www.dan.co.jp/~dankogai/bleedperl/cp-cjkt/ms2ucm.pl

Xie4Xie4Ge3Zuo1 !

Dan the Encode Maintainer
