jhi and Perl Porters,
I am relieved to announce that I have uploaded Encode-1.20, the final
release before 5.8.0-RC1. It is available at
http://www.dan.co.jp/~dankogai/Encode-1.20.tar.gz
as well as on CPAN.
=head1 Summary of Changes
=head2 Fire Extinguishers
The following problems have been fixed. Some fixes are already in
perl-current, some are not.
* SvGROW bug found by Andreas
* 8.3 filename bug for _def.h (now .exh) that caused djgpp to smoke
* encoding.t and jperl.t should skip their tests when PerlIO is not
available, but they did not, which caused smoke failures on Solaris
et al.
Here is how it behaves now.
ext/Encode/t/Aliases................ok
ext/Encode/t/CN.....................skipping test on this platform
ext/Encode/t/Encode.................ok
ext/Encode/t/JP.....................skipping test on this platform
ext/Encode/t/KR.....................skipping test on this platform
ext/Encode/t/TW.....................skipping test on this platform
ext/Encode/t/encoding...............skipping test on this platform
ext/Encode/t/jperl..................skipping test on this platform
* dirty @INC fiddling cleaned up in bin/enc2xs. */Makefile.PL needed
changes.
=head2 Enhancements
* All mapping files under http://www.unicode.org/Public/MAPPINGS/ are
now converted to UCM (except, of course, files that are not mappings,
such as sgml.txt). All except the Mac Indic encodings are supported. So
far Indic support is beyond the capability of encengine.c.
To partially support combined characters, a new feature has been added
to UCM that allows a Unicode combined character to be expressed. Here is
an excerpt from macJapanese.ucm:
<U2010><UF87E> \xEB\x5D |3 # vertical form for HYPHEN
As you see, it is marked as |3. It decodes to utf8 but you cannot
encode it back. To make it truly round-trip, encengine must be enhanced
and/or a new module with a different algorithm must be used.
For most cases, this should be good enough.
* A new script called unidump has been added. It is not ripe yet, so it
just sits in the tarball. But if you have time, please play with it. I
would like to know what you guys think.
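To make the effect of the |3 flag in the macJapanese.ucm excerpt concrete, here is a minimal sketch in Python (not enc2xs itself). It parses UCM mapping lines and builds separate decode and encode tables. The names parse_ucm_line and build_tables are hypothetical, and the flag meanings (|0 round trip, |1 encode only, |3 decode only) follow the usual UCM convention, which is an assumption here; enc2xs may differ in detail. The <U0041> line is an illustrative ASCII entry, not taken from any shipped file.

```python
import re

# One mapping line: one or more <Uxxxx> code points, one or more \xHH bytes,
# then a |N fallback flag. Anything after '#' is a comment.
UCM_LINE = re.compile(
    r"^((?:<U[0-9A-Fa-f]{4,6}>)+)\s+((?:\\x[0-9A-Fa-f]{2})+)\s+\|([0-3])"
)

def parse_ucm_line(line):
    """Return (text, raw_bytes, flag) for a mapping line, or None."""
    m = UCM_LINE.match(line.split("#", 1)[0].strip())
    if m is None:
        return None
    text = "".join(
        chr(int(h, 16)) for h in re.findall(r"<U([0-9A-Fa-f]+)>", m.group(1))
    )
    raw = bytes(
        int(h, 16) for h in re.findall(r"\\x([0-9A-Fa-f]{2})", m.group(2))
    )
    return text, raw, int(m.group(3))

def build_tables(lines):
    """Build (decode, encode) dicts; flag semantics are assumed, see above."""
    decode, encode = {}, {}
    for line in lines:
        entry = parse_ucm_line(line)
        if entry is None:
            continue
        text, raw, flag = entry
        if flag in (0, 3):   # round trip, or bytes -> Unicode only
            decode[raw] = text
        if flag in (0, 1):   # round trip, or Unicode -> bytes only
            encode[text] = raw
    return decode, encode

sample = [
    r"<U0041> \x41 |0",                                      # illustrative entry
    r"<U2010><UF87E> \xEB\x5D |3 # vertical form for HYPHEN",
]
decode, encode = build_tables(sample)
```

With this sample, the byte pair EB 5D appears in the decode table as the two-code-point sequence, but that sequence has no entry in the encode table, which is the "decodes but does not encode back" behavior described above.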
=head1 TODO
* Indics support. I have found it theoretically possible to implement a
roundtrip support within the limitation of encengine, at least for Mac
Indics. But without testers and time it was tabled for the time being.
* There are even more wild ideas in my mind but for the time being give
the pumpking a break. Thank you so much for your patience. KIITOS!
Dan the Encode Maintainer
1.20 $Date: 2002/04/04 19:50:52 $
+ bin/unidump
A last-minute addition. Just give it a try. Docs remain to be done.
Not installed by default.
! lib/Encode/Supported.pod
Greatly enhanced.
! t/Alias.t
! lib/Encode/Alias.pm
! lib/Encode/utf8.pm
! lib/Encode/10464_1.pm
! lib/Encode/ucs2_le.pm
Canonical name for "UCS-2le" is now "UTF-16LE". UCS-2 is left
unchanged but UTF-16BE is added as an alias. Implicit aliases
moved to Encode::Alias so init_alias() works more as expected.
Also, 'utf8' is now canonical with 'UTF-8' being an alias.
Though pedantically wrong, this should make perl mongers happier.
t/Alias.t is enhanced to test all these.
Message-Id: <9C39BD58-47AF-11D6-9D82-00039301D480(_at_)dan(_dot_)co(_dot_)jp>
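The naming changes in this entry can be pictured as a small lookup table. The following Python sketch is only an illustration of the alias-to-canonical mapping described above; it is not how Encode::Alias is implemented, and the names ALIASES, CANONICAL and resolve are hypothetical.

```python
# Hypothetical alias table reflecting the naming changes in this entry.
ALIASES = {
    "ucs-2le": "UTF-16LE",   # UCS-2le now resolves to UTF-16LE
    "utf-16be": "UCS-2",     # UTF-16BE added as an alias for UCS-2
    "utf-8": "utf8",         # utf8 is canonical, UTF-8 is the alias
}
CANONICAL = ("UTF-16LE", "UCS-2", "utf8")

def resolve(name):
    """Map an encoding name to its canonical form, ignoring case."""
    lowered = name.lower()
    for canon in CANONICAL:
        if canon.lower() == lowered:
            return canon
    return ALIASES.get(lowered)  # None when the name is unknown
```

Under these assumptions, resolve("UTF-8") gives "utf8" and resolve("UCS-2le") gives "UTF-16LE".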
! Byte/Makefile.PL
Now all .ucm files are stacked in byte_t; they all share the ASCII part,
so 50% of the codepoints are common. CJKT is left as is because the
saving is not significant.
! Byte/Makefile.PL
! CN/Makefile.PL
! EBCDIC/Makefile.PL
! Encode.xs
! Encode/Makefile_PL.e2x
! JP/Makefile.PL
! KR/Makefile.PL
! Makefile.PL
! Symbol/Makefile.PL
! TW/Makefile.PL
! bin/enc2xs
! AUTHORS
All occurrences of _def.h replaced with .exh so djgpp works happily
ever after! To credit this amazing discovery, Laszlo is now in the
AUTHORS list.
Message-Id: <20020403181424(_dot_)GA8778(_at_)freemail(_dot_)hu>
Message-Id: <B5BF0C6F-4732-11D6-B13D-00039301D480(_at_)dan(_dot_)co(_dot_)jp>
! Makefile.PL
! */Makefile.PL
! Encode/Makefile_PL.skel
! bin/enc2xs
No more @INC fiddling! Uses $ENV{PERL_CORE} instead
Message-Id: <20020401222744(_dot_)GX2000(_at_)blackrider>, et al.
! t/encoding.t
Two more tests added by jhi
Message-Id: <200204020000(_dot_)DAA25121(_at_)alpha(_dot_)hut(_dot_)fi>
+ t/grow.t
! Encode.xs
The showstopper fixed -- a memory reallocation bug was causing
Encode::XS to fall into an infinite loop under certain conditions.
t/grow.t tests that.
Message-Id:
<9572CAC4-463C-11D6-ABA5-00039301D480(_at_)dan(_dot_)co(_dot_)jp>, et al
+ bin/txt2ucm
! */Makefile.PL
! */*.ucm
! */XX.pm
! lib/Encode/Supported.pod
Vendor encodings rebuilt out of original map files at unicode.org.
Indic languages such as MacDevanagari remain unsupported due to the
shortcomings of encengine (they need algorithmic conversion and I
have no knowledge of that!). Pods fixed for added
encodings.
Oh, macJapan.ucm renamed to macJapanese.ucm.
macROMnn is now macRomanian and macRUMnn is now macRumanian.
txt2ucm is a crude script that is used to convert them.
! bin/enc2xs
Unicode Compound Characters (used extensively on Mac) supported
! bin/piconv
Typo fixes and improvements by jhi
Message-Id: <200204010201(_dot_)FAA03564(_at_)alpha(_dot_)hut(_dot_)fi>, et
al.