perl-unicode

Re: [big5-*.ucm] please revise if possible

2002-04-20 15:19:19
On Sunday, April 21, 2002, at 02:32 , Autrijus Tang wrote:
> Updated maps and test:
> http://egb.elixus.org/~autrijus/big5-1.52.tgz
>
> Ucmlint still complains, due to the order issue outlined in the
> previous mail.

As you have intelligently found, the order of duplicate mappings DOES matter; |1 and |3 entries have to come AFTER |0.

So I wrote a quick-and-dirty sort program that does just this:

#!
use strict;
my @lines;
while (<>){
    chomp;
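    m/^<U/ or next;          # keep only mapping lines such as "<UXXXX> \xNN |0"
    push @lines,[ split ];   # [ unicode, encoding, fallback flag ]
}

for (sort {
    $a->[0] cmp $b->[0] # Unicode ascending order
        or $a->[2] cmp $b->[2] # fallback ascending order, so |0 precedes |1 and |3
            or $a->[1] cmp $b->[1] # Encoding ascending order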
        }
              @lines) {
    print join(" " => @$_), "\n";
}
__END__

I then put the sorted text back into the ucm files, and now they all round-trip. This is easy to understand once you know how enc2xs works. It has two hashes, %e2u and %u2e. For a |0 entry it updates both; for |1 or |3 it updates only one of them. On each update the old hash entry is overwritten, so a |3 entry is lost if it is followed by a |0 entry.
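
The overwrite effect can be sketched with a toy model. This is only an illustration of the two-hash description above, not the actual enc2xs code: load_map() is a hypothetical helper, the two entries are made up, and it assumes that |1 touches only %u2e and |3 touches only %e2u.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy model of the two-hash description; NOT the real enc2xs code.
# load_map() and the sample entries below are hypothetical.
sub load_map {
    my @entries = @_;
    my (%u2e, %e2u);
    for my $entry (@entries) {
        my ($uni, $enc, $fb) = @$entry;
        # |0 updates both hashes. Assumption: |1 updates only %u2e and
        # |3 updates only %e2u. A later assignment replaces an earlier one.
        $u2e{$uni} = $enc if $fb eq '|0' || $fb eq '|1';
        $e2u{$enc} = $uni if $fb eq '|0' || $fb eq '|3';
    }
    return (\%u2e, \%e2u);
}

# Two made-up entries that share one byte sequence.
my $bytes       = '\xA1\x40';
my @decode_only = ('<U0041>', $bytes, '|3');
my @round_trip  = ('<U0042>', $bytes, '|0');

# Order 1: |3 first, then |0. The |0 entry overwrites the |3 entry.
my ($u2e_a, $e2u_a) = load_map(\@decode_only, \@round_trip);

# Order 2: |0 first, then |3. Both entries leave a trace.
my ($u2e_b, $e2u_b) = load_map(\@round_trip, \@decode_only);

print "|3 then |0: $bytes decodes to ", $e2u_a->{$bytes}, "\n";
print "|0 then |3: $bytes decodes to ", $e2u_b->{$bytes}, "\n";
print "|0 then |3: <U0042> encodes to ", $u2e_b->{'<U0042>'}, "\n";
```

In this model the |3 entry leaves no trace in the first order, while in the second order the |0 entry still fills %u2e and the |3 entry decides the %e2u value.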

Maybe I should document this in the enc2xs pod.

XinKeLe!

Dan the Encode Maintainer

> perl5.7.3 -Mblib bin/ucmlint -e ucm/big5-eten.ucm
ucm/big5-eten.ucm:warning in line 421: dupe encode map: U2550 => F9,F9 and A2,A4
ucm/big5-eten.ucm:warning in line 436: dupe encode map: U255E => F9,E9 and A2,A5
ucm/big5-eten.ucm:warning in line 440: dupe encode map: U2561 => F9,EB and A2,A7
ucm/big5-eten.ucm:warning in line 450: dupe encode map: U256A => F9,EA and A2,A6
ucm/big5-eten.ucm:warning in line 454: dupe encode map: U256D => A2,7E and F9,FA
ucm/big5-eten.ucm:warning in line 456: dupe encode map: U256E => A2,A1 and F9,FB
ucm/big5-eten.ucm:warning in line 458: dupe encode map: U256F => A2,A3 and F9,FD
ucm/big5-eten.ucm:warning in line 460: dupe encode map: U2570 => A2,A2 and F9,FC
ucm/big5-eten.ucm: no error found
dankogai(_at_)dan-attic[6276]:~/work/Encode> perl5.7.3 -Mblib bin/ucmlint -e ucm/big5-hkscs.ucm
ucm/big5-hkscs.ucm:warning in line 1900: dupe encode map: U301E => A1,AA and C6,DE
ucm/big5-hkscs.ucm:warning in line 2710: dupe encode map: U4EDD => C9,69 and C6,DF
ucm/big5-hkscs.ucm:warning in line 2932: dupe encode map: U50ED => B9,B0 and 9F,CB
ucm/big5-hkscs.ucm:warning in line 2981: dupe encode map: U5159 => A2,59 and 92,AF
ucm/big5-hkscs.ucm:warning in line 2983: dupe encode map: U515B => A2,5A and 92,B0
ucm/big5-hkscs.ucm:warning in line 2986: dupe encode map: U515D => A2,5C and 92,B1
ucm/big5-hkscs.ucm:warning in line 2988: dupe encode map: U515E => A2,5B and 92,B2
ucm/big5-hkscs.ucm:warning in line 4137: dupe encode map: U5C10 => C9,5C and 9C,BC
ucm/big5-hkscs.ucm:warning in line 4384: dupe encode map: U5F0C => 93,61 and 9F,D8
ucm/big5-hkscs.ucm:warning in line 4509: dupe encode map: U6062 => AB,EC and 9E,A9
ucm/big5-hkscs.ucm:warning in line 4765: dupe encode map: U62CE => A9,F0 and A0,77
ucm/big5-hkscs.ucm:warning in line 4767: dupe encode map: U62D0 => A9,E4 and 9D,C4
ucm/big5-hkscs.ucm:warning in line 5935: dupe encode map: U6FB6 => BF,47 and 9B,F6
ucm/big5-hkscs.ucm:warning in line 5974: dupe encode map: U701E => 96,EE and 96,ED
ucm/big5-hkscs.ucm:warning in line 6119: dupe encode map: U71DF => C0,E7 and 9C,62
ucm/big5-hkscs.ucm:warning in line 6165: dupe encode map: U7250 => 94,55 and A0,E4
ucm/big5-hkscs.ucm:warning in line 6337: dupe encode map: U7468 => 94,7A and A0,D5
ucm/big5-hkscs.ucm:warning in line 6659: dupe encode map: U77D7 => C5,F7 and 9B,78
ucm/big5-hkscs.ucm:warning in line 6825: dupe encode map: U79E3 => AF,B0 and 9C,BD
ucm/big5-hkscs.ucm:warning in line 6958: dupe encode map: U7B51 => B5,AE and 9D,5A
ucm/big5-hkscs.ucm:warning in line 6990: dupe encode map: U7BB8 => BA,E6 and 8E,69
ucm/big5-hkscs.ucm:warning in line 7084: dupe encode map: U7CCE => A2,61 and 8E,7E
ucm/big5-hkscs.ucm:warning in line 7195: dupe encode map: U7DD2 => BA,FC and 8E,AB
ucm/big5-hkscs.ucm:warning in line 7227: dupe encode map: U7E1D => BF,A6 and 8E,B4
ucm/big5-hkscs.ucm:warning in line 7368: dupe encode map: U8005 => AA,CC and 8E,CD
ucm/big5-hkscs.ucm:warning in line 7387: dupe encode map: U8028 => BF,AE and 8E,D0
ucm/big5-hkscs.ucm:warning in line 7736: dupe encode map: U83C1 => B5,D7 and 8F,57
ucm/big5-hkscs.ucm:warning in line 7839: dupe encode map: U8503 => 92,42 and 92,44
ucm/big5-hkscs.ucm:warning in line 8047: dupe encode map: U880F => 8F,B6 and A0,63
ucm/big5-hkscs.ucm:warning in line 8181: dupe encode map: U89A6 => BF,CC and 8F,CB
ucm/big5-hkscs.ucm:warning in line 8184: dupe encode map: U89A9 => A0,D4 and 8F,CC
ucm/big5-hkscs.ucm:warning in line 8494: dupe encode map: U8D77 => B0,5F and 8F,FE
ucm/big5-hkscs.ucm:warning in line 8825: dupe encode map: U90FD => B3,A3 and 90,6D
ucm/big5-hkscs.ucm:warning in line 9045: dupe encode map: U936E => A0,5F and 92,C8
ucm/big5-hkscs.ucm:warning in line 9335: dupe encode map: U975C => C0,52 and 90,DC
ucm/big5-hkscs.ucm:warning in line 9337: dupe encode map: U975D => 9C,E4 and 96,44
ucm/big5-hkscs.ucm:warning in line 9392: dupe encode map: U97FF => C5,54 and 90,F1
ucm/big5-hkscs.ucm:warning in line 9858: dupe encode map: U9F17 => 91,BE and 9F,66
ucm/big5-hkscs.ucm: no error found
