perl-unicode

Re: 5.8 roadmap and Encode

2002-02-28 09:21:33
For Chinese usage, the following 7 encodings are not here yet, but we can
also add them if desired:

  - 'hz' and 'iso-2022-cn', two different encoding tables for gb2312
    described above.

This isn't there?  I remember seeing HZ.enc?

  - 'gb18030', used in glibc2.2, is a superset of gbk, which is a
    superset of gb2312; we should use that instead of 'gbk' if we want
    gbk support.

  - 'iso-ir-165', a different extension to gb2312, adding gb6345 and
    gb8565 support. Not in wide use.

  - 'iso-2022-cn-ext', the iso-2022'ized version of all characters in
    gb(2312|12345|7589|7590), iso-ir-165, and cns-11643-*. It's a sort
    of 'unified chinese code'.

  - 'big5p', the Big5+ Traditional Chinese encoding, is similarly a
    superset of 'big5' that provides a more complete Unicode mapping
    and covers most of Taiwan's uses.

  - 'big5-hkscs', a different extension to big5, adding characters used
    in Hong Kong, incompatible with big5p.

GNU libiconv has most of the above mappings other than big5p; I'm willing
to supply their maps if it's ok with the list.

Okay, as long as they aren't huge (several megabytes).  If they are, we
may have to consider a separate CPAN package.
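
As a rough way to settle questions like "is HZ already there?", the
following sketch asks the installed Encode which of the names from the
list above it can resolve. It assumes only Encode::find_encoding and the
encoding object's name method; the helper check_encodings is a
hypothetical name invented for this illustration, and the results depend
entirely on the Encode version and add-on modules installed.

```perl
use strict;
use warnings;
use Encode ();

# Names exactly as written in the list above. Whether each one resolves
# depends on the installed Encode and any extra encoding modules.
my @wanted = qw(hz iso-2022-cn gb18030 iso-ir-165
                iso-2022-cn-ext big5p big5-hkscs);

# Hypothetical helper: maps each requested name to the canonical name
# Encode reports for it, or to undef when Encode cannot find it.
sub check_encodings {
    my @names = @_;
    my %status;
    for my $name (@names) {
        my $enc = Encode::find_encoding($name);
        $status{$name} = $enc ? $enc->name : undef;
    }
    return %status;
}

my %status = check_encodings(@wanted);
for my $name (@wanted) {
    printf "%-16s %s\n", $name,
        defined $status{$name}
            ? "available as $status{$name}"
            : "not available";
}
```

A name reported as "not available" only means this particular
installation cannot load it; it says nothing about whether a map exists
elsewhere, for example in a separate CPAN package.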

/Autrijus/

-- 
$jhi++; # http://www.iki.fi/jhi/
        # There is this special biologist word we use for 'stable'.
        # It is 'dead'. -- Jack Cohen