perl-unicode

[Encode] 1.20 released!

2002-04-04 13:23:07
jhi and Perl Porters,

I am relieved to announce that I have uploaded Encode-1.20, the final release before 5.8.0-RC1. It is available at

http://www.dan.co.jp/~dankogai/Encode-1.20.tar.gz

as well as CPAN.

=head1 Summary of Changes

=head2 Fire Extinguishers

The following problems have been fixed. Some fixes are already in perl-current, some are not.

* SvGROW bug found by Andreas
* 8.3 filename bug for _def.h (now .exh) that caused djgpp to smoke
* encoding.t and jperl.t should have skipped testing without PerlIO but did not,
  which caused smoke failures on Solaris and others.
  Here is how they behave now.

ext/Encode/t/Aliases................ok
ext/Encode/t/CN.....................skipping test on this platform
ext/Encode/t/Encode.................ok
ext/Encode/t/JP.....................skipping test on this platform
ext/Encode/t/KR.....................skipping test on this platform
ext/Encode/t/TW.....................skipping test on this platform
ext/Encode/t/encoding...............skipping test on this platform
ext/Encode/t/jperl..................skipping test on this platform

* dirty @INC fiddling cleaned up in bin/enc2xs. */Makefile.PL needed changes.

=head2 Enhancements

* All mapping files under http://www.unicode.org/Public/MAPPINGS/ are now converted to UCM (except, of course, files that are not mappings, such as sgml.txt). All of them are supported except the Mac Indic encodings. So far, Indic support is beyond the capability of encengine.c.

To partially support combined characters, a new feature has been added to UCM that allows a Unicode combined character to be expressed. Here is an excerpt from macJapanese.ucm:

<U2010><UF87E> \xEB\x5D |3 # vertical form for HYPHEN

As you can see, it is marked as |3. It decodes to utf8 but you cannot encode it back. To make it truly round-trip, encengine must be enhanced or a new module with a different algorithm must be used (or both). For most cases, this should be good enough.
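The effect of the |3 flag can be sketched in a few lines. This is only an illustration of the idea, not how enc2xs or encengine.c actually work: the function names `parse_ucm_line` and `build_tables` are hypothetical, and the flag meanings assumed here (|0 round trip, |1 encode-only fallback, |3 decode-only) follow the usual UCM convention.

```python
import re

# One mapping line: one or more <Uxxxx> code points, a byte sequence, a |n flag.
UCM_LINE = re.compile(
    r'^((?:<U[0-9A-Fa-f]{4,6}>)+)\s+((?:\\x[0-9A-Fa-f]{2})+)\s+\|(\d)'
)

def parse_ucm_line(line):
    """Return (unicode_string, byte_string, flag), or None for non-mapping lines."""
    m = UCM_LINE.match(line)
    if m is None:
        return None
    chars = "".join(chr(int(h, 16))
                    for h in re.findall(r'<U([0-9A-Fa-f]+)>', m.group(1)))
    raw = bytes(int(h, 16)
                for h in re.findall(r'\\x([0-9A-Fa-f]{2})', m.group(2)))
    return chars, raw, int(m.group(3))

def build_tables(lines):
    """Build separate decode and encode tables, honouring the fallback flag."""
    decode, encode = {}, {}
    for line in lines:
        entry = parse_ucm_line(line)
        if entry is None:
            continue
        chars, raw, flag = entry
        if flag in (0, 3):      # assumed: round trip or decode-only
            decode[raw] = chars
        if flag in (0, 1):      # assumed: round trip or encode-only fallback
            encode[chars] = raw
    return decode, encode

decode, encode = build_tables(
    [r'<U2010><UF87E> \xEB\x5D |3 # vertical form for HYPHEN']
)
```

With the excerpt above, the byte pair \xEB\x5D lands in the decode table as the two-code-point sequence U+2010 U+F87E, while the encode table stays empty, which is the one-way behaviour described.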

* A new script called unidump has been added. It is not ripe yet, so it just sits in the tarball. But if you have time, please play with it. I would like to know what you guys think.

=head1 TODO

* Indic support. I have found it theoretically possible to implement round-trip support within the limitations of encengine, at least for the Mac Indic encodings. But without testers and time, it was tabled for the time being.

* There are even more wild ideas in my mind but for the time being give the pumpking a break. Thank you so much for your patience. KIITOS!

Dan the Encode Maintainer

1.20  $Date: 2002/04/04 19:50:52 $
+ bin/unidump
  A last-minute addition.  Just give it a try.  Docs remain to be done.
  Not installed by default.
! lib/Encode/Supported.pod
  Enhanced Greatly.
! t/Alias.t
! lib/Encode/Alias.pm
! lib/Encode/utf8.pm
! lib/Encode/10464_1.pm
! lib/Encode/ucs2_le.pm
  Canonical name for "UCS-2le" is now "UTF-16LE".  UCS-2 left
  unchanged but UTF-16BE is added as an alias.  Implicit aliases
  moved to Encode::Alias so init_alias() works more as expected.
  Also, 'utf8' is now canonical with 'UTF-8' being an alias.
  Though pedantically wrong, this should make perl mongers happier.
  t/Alias.t is enhanced to test all these.
  Message-Id: <9C39BD58-47AF-11D6-9D82-00039301D480(_at_)dan(_dot_)co(_dot_)jp>
! Byte/Makefile.PL
  Now all .ucm files are stacked in byte_t; they all share the ASCII
  part, so 50% of the codepoints are common.  CJKT left as is because
  the saving is not significant.
! Byte/Makefile.PL
! CN/Makefile.PL
! EBCDIC/Makefile.PL
! Encode.xs
! Encode/Makefile_PL.e2x
! JP/Makefile.PL
! KR/Makefile.PL
! Makefile.PL
! Symbol/Makefile.PL
! TW/Makefile.PL
! bin/enc2xs
! AUTHORS
  All occurrences of _def.h replaced with .exh so djgpp works happily
  ever after!  To credit this amazing discovery, Laszlo is now in
  the AUTHORS list.
  Message-Id: <20020403181424(_dot_)GA8778(_at_)freemail(_dot_)hu>
  Message-Id: <B5BF0C6F-4732-11D6-B13D-00039301D480(_at_)dan(_dot_)co(_dot_)jp>
! Makefile.PL
! */Makefile.PL
! Encode/Makefile_PL.skel
! bin/enc2xs
  No more @INC fiddling!  Uses $ENV{PERL_CORE} instead
  Message-Id: <20020401222744(_dot_)GX2000(_at_)blackrider>, et al.
! t/encoding.t
  Two more tests added by jhi
  Message-Id: <200204020000(_dot_)DAA25121(_at_)alpha(_dot_)hut(_dot_)fi>
+ t/grow.t
! Encode.xs
  The showstopper fixed -- Memory reallocation bug was causing
  Encode::XS to fall into an infinite loop under certain conditions.
  t/grow.t tests that.
  Message-Id: <9572CAC4-463C-11D6-ABA5-00039301D480(_at_)dan(_dot_)co(_dot_)jp>, et al.
+ bin/txt2ucm
! */Makefile.PL
! */*.ucm
! */XX.pm
! lib/Encode/Supported.pod
  Vendor encodings rebuilt out of original map files at unicode.org.
  Indic encodings such as MacDevanagari remain unsupported due to the
  shortcomings of encengine (they need algorithmic conversion and I
  have no knowledge of that!).  Pods fixed for added encodings.
  Oh, macJapan.ucm renamed to macJapanese.ucm.
  macROMnn is macRomanian and macRUMnn is macRumanian.
  txt2ucm is a crude script that is used to convert them.
! bin/enc2xs
  Unicode Compound Characters (used extensively on Mac) supported
! bin/piconv
  Typo fixes and improvements by jhi
  Message-Id: <200204010201(_dot_)FAA03564(_at_)alpha(_dot_)hut(_dot_)fi>, et al.
