perl-unicode

Re: UCM file and combining character sequences

2003-09-21 08:30:05
Hi,

I'm trying to make a UCM file to feed to enc2xs.  The legacy encoding for
Taiwanese romanization *must* have its code points mapped to Unicode
character sequences, for the simple reason that the UCS lacks the
corresponding precomposed characters (and is unlikely to have them in the
future, as they are composable using existing characters from the Latin
script blocks and the Combining Diacritical Marks block).  (See [1] for script
details.)
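
To make the requirement concrete, here is a minimal Perl sketch of such a
one-to-many mapping.  The byte values 0x80 and 0x81 and the function name
decode_legacy are hypothetical placeholders, not taken from any real table;
the point is only that a single legacy byte expands to a base letter plus a
combining mark.  It is a plain hash lookup, not a description of how enc2xs
works.

```perl
use strict;
use warnings;

# Hypothetical legacy bytes; real values would come from the actual
# mapping table for the encoding.
my %legacy_to_unicode = (
    "\x80" => "a\x{030D}",    # a + COMBINING VERTICAL LINE ABOVE
    "\x81" => "o\x{030D}",    # o + COMBINING VERTICAL LINE ABOVE
);

# Decode a single-byte legacy string into Perl characters.
# ASCII bytes pass through unchanged, mapped high bytes expand to a
# two-character sequence, and unmapped high bytes become U+FFFD.
sub decode_legacy {
    my ($bytes) = @_;
    return join '', map {
        ord($_) < 0x80                  ? $_
      : exists $legacy_to_unicode{$_}   ? $legacy_to_unicode{$_}
      :                                   "\x{FFFD}"
    } split //, $bytes;
}
```

With this table, decode_legacy("k\x80") would yield three characters: "k",
"a" and U+030D.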
(snip) 
How does enc2xs deal with (or intend to deal with) such a case?  Is the ICU
specification [2] to be followed rigidly?

Since I am very new to Perl, any insight is appreciated.

[1] http://lomaji.com/poj/chart.html
[2] http://oss.software.ibm.com/icu/userguide/conversion-data.html

--Henry H. Tan-Tenn

In any case, your chart does not give code points for "the legacy encoding".
Is the mapping table you have in mind available anywhere?
(I also wonder why "the legacy encoding" is needed at all,
 given that the UTF encodings of Unicode are available.)

SADAHIRO Tomoyuki