For Chinese usage, the following 7 encodings are not here yet, but we
can also add them if desired:
- 'hz' and 'iso-2022-cn', two different 7-bit encodings of the gb2312
described above.
This isn't there? I remember seeing HZ.enc?
- 'gb18030', used in glibc2.2, is a superset of gbk, which is a
superset of gb2312; we should use that instead of 'gbk' if we want gbk
support.
- 'iso-ir-165', a different extension to gb2312, adding gb6345 and
gb8565 support. Not in wide use.
- 'iso-2022-cn-ext', the iso-2022'ized version of all characters in
gb(2312|12345|7589|7590), iso-ir-165, and cns-11643-*. It's a sort
of 'unified chinese code'.
- 'big5p', the Big5+ Traditional Chinese encoding, is similarly a
superset of 'big5' that provides a more complete Unicode mapping and
covers most of Taiwan's uses.
- 'big5-hkscs', a different extension to big5, adding characters used
in Hong Kong, incompatible with big5p.
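
To make the gb2312/gbk/gb18030 superset chain above concrete, here is a
rough sketch. It uses Python's built-in codecs purely as a stand-in, not
the Encode module or the libiconv maps discussed in this thread; the
helper name can_encode and the sample characters are my own choices for
illustration.

```python
def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of text is representable in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Simplified Chinese text that lies entirely within gb2312.
simplified = "中文"
raw = simplified.encode("gb2312")

# Bytes that are valid gb2312 should decode to the same text under both
# supersets, which is what makes gb18030 usable in place of gbk or gb2312.
same_everywhere = all(
    raw.decode(codec) == simplified for codec in ("gb2312", "gbk", "gb18030")
)

# Sample characters chosen for illustration:
#   U+9F8D  traditional form, outside gb2312 but inside gbk
#   U+20000 CJK Extension B, outside gbk but inside gb18030
samples = {"gb2312-level": "中", "gbk-level": "龍", "gb18030-level": "\U00020000"}
coverage = {
    label: [c for c in ("gb2312", "gbk", "gb18030") if can_encode(ch, c)]
    for label, ch in samples.items()
}

if __name__ == "__main__":
    print("same bytes decode identically:", same_everywhere)
    for label, codecs_ok in coverage.items():
        print(label, "->", codecs_ok)
```

Each step up the chain accepts everything the previous one did plus
more, which is the argument for carrying only the gb18030 map if gbk
support is wanted.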
GNU libiconv has most of the above mappings other than big5p; I'm willing
to supply their maps if it's ok with the list.
Okay, as long as they aren't huge (several megabytes). If they are, we
may have to consider a separate CPAN package.
/Autrijus/
--
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen