perl-unicode

[Encode] Encode::Perl?

2002-03-25 03:36:20
On Monday, March 25, 2002, at 06:59 , Nick Ing-Simmons wrote:
It should not be too hard to take the .ucm file parsing from 'compile'
and teach Encode::Tcl-like all-perl code to read .ucm-s.
We can then rename it Encode::Perl ;-)

I am considering that kind of option, but I am not sure it should go into the perl dist. Thanks to your compile script, Encode is now smart enough to handle most of the major encodings without the help of Encode::Tcl (ISO-2022 types are so far handled individually by Perl modules, such as Encode::JP::JIS).

We can go even wilder. I am thinking of developing something like Unicode::DataBase to implement full support for ISO-2022-(INT|JP-2). The current obstacle to implementing ISO-2022 is encoding: you have to know which character set a given (Unicode) character maps to, but because of the character unification rule, this is impossible just by looking at the character. The solution is to have a database and look up each character to find which character sets have corresponding codepoints, then pick one according to a given precedence (for instance, "try JIS X 0208, then GB 2312, then KSC 5601" for ISO-2022-JP-2). But we need a database to begin with....
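
To make the lookup idea concrete, here is a rough sketch, written in Python purely for illustration; none of it is existing Encode code. The names CHARSETS_FOR, PRECEDENCE, pick_charset and tag_text are hypothetical, and the table entries are hand-written placeholders, not real mapping data. A real database would presumably be generated from mapping files such as the .ucm sources.

```python
# Hypothetical lookup table: Unicode code point -> character sets assumed
# to contain it. Hand-written placeholders for this sketch only.
CHARSETS_FOR = {
    0x4E00: {"JIS X 0208", "GB 2312", "KSC 5601"},  # ideograph shared by all three
    0x8FD9: {"GB 2312"},                             # assumed GB 2312 only here
    0xAC00: {"KSC 5601"},                            # Hangul syllable
}

# Precedence taken from the ISO-2022-JP-2 example in the message.
PRECEDENCE = ["JIS X 0208", "GB 2312", "KSC 5601"]


def pick_charset(char, precedence=PRECEDENCE, table=CHARSETS_FOR):
    """Return the first charset in `precedence` that contains `char`, else None."""
    candidates = table.get(ord(char), set())
    for charset in precedence:
        if charset in candidates:
            return charset
    return None


def tag_text(text):
    """Pair each character of `text` with the charset chosen for it."""
    return [(ch, pick_charset(ch)) for ch in text]
```

The point of the sketch is that the choice depends on the table plus the precedence list, not on the character alone: the same unified ideograph would be tagged differently if GB 2312 were tried first. An encoder would then emit the matching escape sequence whenever the chosen charset changes, which is not shown here.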

Dan the Man with Too Many Encodings to Support