Encode Hackers,
As the title says, Encode-0.99 is now available at
http://www.dan.co.jp/~dankogai/Encode-0.99.tar.gz
or on CPAN. Here are the changes:
0.99 Tue Mar 26 2002
- lib/Encode/JP/Const.pm
+ lib/Encode/CJKConstants.pm
+ lib/Encode/CN/2022_CN.pm
+ lib/Encode/KR/2022_KR.pm
+ t/KR.t
+ t/gb2312.euc
+ t/gb2312.ref
+ t/ksc5601.euc
+ t/ksc5601.ref
+ t/table.euc
+ t/table.ref
+ ucm2table
* Support for ISO-2022-KR and ISO-2022-CN added.
* t/KR.t added!
* More t/*.{euc,ref} added, which were autogenerated by ucm2table.
* ucm2table autogenerates character tables out of UCM files.
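For readers who have not looked inside a UCM file, here is a minimal Perl sketch of reading its CHARMAP section into a list of mappings. This is not the actual ucm2table script: the function name parse_ucm_charmap and the sample data are hypothetical, and the sketch ignores the UCM header fields and state tables.

```perl
use strict;
use warnings;

# Hypothetical helper: parse the CHARMAP section of UCM text into
# a list of [code point, raw byte string, precision flag] records.
sub parse_ucm_charmap {
    my ($text) = @_;
    my @map;
    my $in_charmap = 0;
    for my $line (split /\n/, $text) {
        $line =~ s/#.*//;    # strip comments
        if ($line =~ /^\s*CHARMAP\s*$/)       { $in_charmap = 1; next }
        if ($line =~ /^\s*END\s+CHARMAP\s*$/) { $in_charmap = 0; next }
        next unless $in_charmap;
        # A mapping line looks like: <U0041> \x41 |0
        next unless $line =~ /^\s*<U([0-9A-Fa-f]{4,6})>\s+((?:\\x[0-9A-Fa-f]{2})+)\s*(?:\|(\d))?/;
        my ($ucs, $bytes, $flag) = (hex($1), $2, defined $3 ? $3 : 0);
        # Turn "\xB0\xA1" notation into the actual two bytes.
        my $raw = join '', map { chr(hex($_)) } $bytes =~ /\\x([0-9A-Fa-f]{2})/g;
        push @map, [$ucs, $raw, $flag];
    }
    return \@map;
}

# Illustrative sample, not copied from any shipped *.ucm file.
my $sample = <<'UCM';
<code_set_name> "sample"
<mb_cur_min> 1
<mb_cur_max> 2
CHARMAP
<U0041> \x41 |0
<UAC00> \xB0\xA1 |0
END CHARMAP
UCM

for my $rec (@{ parse_ucm_charmap($sample) }) {
    my ($ucs, $raw, $flag) = @$rec;
    my $hex = join ' ', map { sprintf('%02X', ord($_)) } split //, $raw;
    printf "U+%04X => %s (flag %d)\n", $ucs, $hex, $flag;
}
```

Because every mapping is one readable line of text, checking or patching a single character is a matter of editing that line.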
- engine.c
+ encengine.c
- lib/Encode/Supports.pod
+ lib/Encode/Supported.pod
Names reverted due to popular demand.
8.3 rule applies only when there is a conflict.
Message-Id: <20020325095924.GD44120@not.autrijus.org>
! */Makefile.PL
- Encode/*.enc
+ Encode/*.ucm
- lib/Tcl*
- lib/Encode/Format/Enc.pod
- t/Tcl.t
* Character tables are now 100% ucm.
* All files under Encode/ are now 8.3-compliant.
* Some of the missing encodings added (i.e. gsm0338 and nextstep).
* Vendor mappings aggregated with the appropriate national standard in
Makefile.PL, resulting in smaller *.so files, especially for CJK.
The following is the result on Dan's FreeBSD box.
                                                Now        Then
---------------------------------------------------------------
blib/arch/auto/Encode/Byte/Byte.so          157,279     171,042
blib/arch/auto/Encode/CN/CN.so            1,634,476   1,626,685
blib/arch/auto/Encode/EBCDIC/EBCDIC.so       18,476      18,476
blib/arch/auto/Encode/Encode.so              27,791      27,791
blib/arch/auto/Encode/JP/JP.so            1,408,056   1,832,811
blib/arch/auto/Encode/KR/KR.so            1,156,518   1,329,587
blib/arch/auto/Encode/Symbol/Symbol.so       23,940      20,990
blib/arch/auto/Encode/TW/TW.so*             948,761   1,316,437
---------------------------------------------------------------
Total                                     5,375,297   6,343,819
Saving                                      968,522
* As a result of the ucm transition, Encode::Tcl was dropped because
it requires *.enc.
Encode::Tcl will be supplied in a separate tarball with *.enc.
Message-Id: <C024E294-3FC3-11D6-8347-00039301D480@dan.co.jp>
!compile
-encengine.c
+encode.c
!Encode.pm
-lib/Encode/Supported.pod
+lib/Encode/Supports.pod
-lib/Encode/iso10646_1.pm
+lib/Encode/10646_1.pm
-lib/Encode/EncFormat.pod
+lib/Encode/Format/Enc.pod
Files renamed for 8.3 filename compliance. Affected modules/scripts
revised.
- lib/Encode/JP/Constants.pm
+ lib/Encode/JP/Consts.pm
! lib/Encode/JP/JIS.pm
! lib/Encode/JP/H2Z.pm
Version nit problem and 8.3 rule fix.
> Package namespace installed latest in CPAN file
> Encode::JP::Constants 0.92 1.02
J/JH/JHI/perl-5.7.3.tar.gz
was noted by jhi; Dan then discovered that "Constants.pm" does not
comply with the 8.3 rule. Constants.pm was renamed to Consts.pm and the
affected modules were fixed accordingly. In addition, legacy
"use vars qw()..." declarations were replaced with "our".
Message-Id: <20020325011248.D1561@alpha.hut.fi>
Message-Id: <41023D51-3FB5-11D6-8347-00039301D480@dan.co.jp>
! JP/JP.pm
- lib/Encode/JP/ISO_2022_JP.pm
- lib/Encode/JP/ISO_2022_JP_1.pm
+ lib/Encode/JP/2022_JP.pm
+ lib/Encode/JP/2022_JP1.pm
01234567.012
8.3 naming conflict for vanilla FAT addressed by jhi
Message-Id: <20020324201931.V22596@alpha.hut.fi>
As you can see, the biggest difference is that Encode no longer uses
*.enc, Tcl's encoding table format. Instead it uses IBM's ucm format. As a
result:
* Encode::Tcl is detached from the main Encode. Encode::Tcl will be
made available as a separate package.
* The distribution is significantly larger. The tarball is now over 1 MB.
* In exchange, it compiles faster and the resulting tables are smaller.
* But the best thing about ucm is that it is now much easier to
debug/hack the table! This was nearly impossible with *.enc.
* t/KR.t is added at last.
* ISO-2022-(KR|CN) added. But are there any apps that handle these?
Most browsers and mailers only support EUC-KR and EUC-CN (though their
MIME names are mostly not EUC-* but the charset name, such as gb2312 and
KS_C_5601. Strange!). I am sure my implementation is correct, but I
can't see that for myself....
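If no application is at hand to display ISO-2022-KR, a round trip through Encode itself is at least a quick sanity check. The following is only a sketch: it assumes the encoding is registered under the name "iso-2022-kr" and that the encode()/decode() functions are exported as in current Encode releases, and the sample string is just an illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Two Hangul syllables (U+D55C U+AE00) as a Perl character string.
my $text = "\x{D55C}\x{AE00}";

# Characters -> ISO-2022-KR bytes -> characters again.
my $bytes = encode('iso-2022-kr', $text);
my $back  = decode('iso-2022-kr', $bytes);

# ISO-2022-KR is a 7-bit encoding, so no byte should have the high bit set.
my $seven_bit = $bytes !~ /[^\x00-\x7F]/;

printf "encoded to %d bytes: %s\n", length($bytes), unpack('H*', $bytes);
printf "7-bit clean: %s\n", $seven_bit     ? 'yes' : 'no';
printf "round trip:  %s\n", $back eq $text ? 'ok'  : 'FAILED';
```

A successful round trip only shows that the encoder and decoder agree with each other; checking against bytes produced by another implementation would be a stronger test.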
Yours,
Dan the Encode Maintainer