I've been working on the other end of it, which is the conversion to and from
other character sets - basically, the plan is to derive the data from the
There's also RFC 1345, and
http://anubis.dkuug.dk/cultreg/registrations/chreg.htm
The legacy 8-bit stuff is trivial, and when my copy of CJKV Information
Processing arrives (Hi Jon!), I'll be able to finish off the Unihan business.
Oh, and from_to gets fun if you try and do it by smart-combining the mappings
instead of going through an intermediary character set. :)
Let's do the dumb stuff first. Hmmm, an LRU cache of smart-mapping tables...?
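
To make the two approaches concrete, here is a rough sketch in Python (chosen only for illustration). Everything in it is hypothetical: the names TO_UNICODE, from_to_dumb, combined_table and from_to_smart are made up for this example, and the tables hold just a handful of entries rather than real mapping data. The "dumb" version goes through Unicode as the intermediary on every call; the "smart" version composes the two mappings into one direct byte-to-byte table and keeps the most recently used tables in a small LRU cache.

```python
from functools import lru_cache

# Hypothetical toy tables: byte value -> Unicode code point.
# Only a few entries are listed; real tables would be derived from mapping data.
TO_UNICODE = {
    "latin1": {0x41: 0x0041, 0xE4: 0x00E4, 0xA4: 0x00A4},
    "latin9": {0x41: 0x0041, 0xE4: 0x00E4, 0xA4: 0x20AC},
    "cp437": {0x41: 0x0041, 0x84: 0x00E4},
}


def from_to_dumb(data, src, dst):
    """Convert via the intermediary: src byte -> code point -> dst byte."""
    decode = TO_UNICODE[src]
    encode = {cp: b for b, cp in TO_UNICODE[dst].items()}
    # Raises KeyError for anything the toy tables cannot map.
    return bytes(encode[decode[b]] for b in data)


@lru_cache(maxsize=8)  # keep only the 8 most recently used (src, dst) pairs
def combined_table(src, dst):
    """Compose src->Unicode and Unicode->dst into one byte->byte table."""
    encode = {cp: b for b, cp in TO_UNICODE[dst].items()}
    # Bytes with no counterpart in dst are simply left out of the table.
    return {b: encode[cp] for b, cp in TO_UNICODE[src].items() if cp in encode}


def from_to_smart(data, src, dst):
    """Convert with a single lookup per byte, using the cached combined table."""
    table = combined_table(src, dst)
    return bytes(table[b] for b in data)
```

With these toy tables, both from_to_dumb(b"A\xe4", "latin1", "cp437") and from_to_smart(b"A\xe4", "latin1", "cp437") give b"A\x84". The combined table is built only on the first call for a given pair, and the cache size of 8 is an arbitrary choice. Multi-byte encodings would need more than a flat byte-to-byte dictionary.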
--
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen