On Monday, March 25, 2002, at 06:59, Nick Ing-Simmons wrote:
It should not be too hard to take the .ucm file parsing from 'compile'
and teach Encode::Tcl-like all-perl code to read .ucm-s.
We can then rename it Encode::Perl ;-)
I am considering that kind of option, but I am not sure whether it should
go into the perl dist. Thanks to your compile script, Encode is now smart
enough to handle most of the major encodings without the help of
Encode::Tcl (ISO-2022 types are so far handled individually by perl
modules, such as Encode::JP::JIS).
We can go even wilder. I am thinking of developing something like
Unicode::DataBase to implement full support for ISO-2022-(INT|JP-2).
The current obstacle to implementing ISO-2022 is encoding: you have to
know what character set a given (Unicode) character maps to, but thanks
to the character unification rule, this is impossible just by looking at
the character.
The solution is to have a database and look up each character to find
which character sets have corresponding codepoints, then pick one
according to a given precedence (for instance, you go like "try JIS X
0208, then GB 2312, then KSC 5601" for ISO-2022-JP-2). But we need a
database to begin with....
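
To make the idea concrete, here is a minimal sketch of that lookup. It is
in Python only for brevity, not a proposal for the actual module. The
names CHARSETS_FOR, PRECEDENCE and pick_charset are made up, and the tiny
table is hand-filled for illustration; a real one would presumably be
generated from the mapping files.

```python
# Toy reverse index: Unicode character -> character sets that contain it.
# Hand-filled for illustration only (an assumption, not real project data).
CHARSETS_FOR = {
    "\u4e00": {"JIS X 0208", "GB 2312", "KSC 5601"},  # unified ideograph, in all three
    "\u4eec": {"GB 2312"},                            # simplified Chinese form
    "\uac00": {"KSC 5601"},                           # Hangul syllable
}

# Precedence from the example above (ISO-2022-JP-2 style ordering).
PRECEDENCE = ["JIS X 0208", "GB 2312", "KSC 5601"]


def pick_charset(char, precedence=PRECEDENCE, db=CHARSETS_FOR):
    """Return the first character set in `precedence` that contains `char`.

    Returns None when no listed character set has a codepoint for it.
    """
    candidates = db.get(char, set())
    for name in precedence:
        if name in candidates:
            return name
    return None


if __name__ == "__main__":
    for ch in ("\u4e00", "\u4eec", "\uac00", "\u0e01"):
        print("U+%04X -> %s" % (ord(ch), pick_charset(ch)))
```

The unified ideograph U+4E00 shows the problem: it exists in all three
sets, so only the precedence list decides which escape sequence to emit.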
Dan the Man with Too Many Encodings to Support