perl-unicode

Alert: Module Layout Updated

2002-03-21 15:50:45
Encode Hackers,

Here are the changes since 0.95. I decided to let ISO-8859 and the other single-byte encodings be demand-loaded, like Encode::XX.

+ Byte/Byte.pm
+ Byte/Makefile.PL
+ EBCDIC/EBCDIC.pm
+ EBCDIC/Makefile.PL
+ Symbol/Makefile.PL
+ Symbol/Symbol.pm
! Encode.pm
! Encode.xs
  Latin and single byte encodings are reorganized so they are
  demand-loaded like Encode::XX.  Now only ascii is compiled into
  Encode itself.
! lib/Encode/Alias.pm
  for my $k (keys %hash){ delete $hash{$k}; }
   is deprecated; fixed.
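
The entry above does not say what replaced the loop. As a minimal sketch only, here are two common ways to drop hash entries without a key-by-key delete loop. The helper names (clear_hash, delete_matching) and the %alias sample data are made up for illustration and are not taken from Alias.pm.

```perl
use strict;
use warnings;

# Empty a hash in one step instead of deleting key by key.
sub clear_hash {
    my ($href) = @_;
    %$href = ();
    return $href;
}

# Remove only the keys matching a pattern, using a hash slice.
sub delete_matching {
    my ($href, $re) = @_;
    my @doomed = grep { $_ =~ $re } keys %$href;
    delete @{$href}{@doomed};
    return $href;
}

# Hypothetical sample data, for illustration only.
my %alias = (
    latin1 => 'iso-8859-1',
    latin2 => 'iso-8859-2',
    ascii  => 'US-ascii',
);

delete_matching(\%alias, qr/^latin/);
print join(",", sort keys %alias), "\n";

clear_hash(\%alias);
print scalar(keys %alias), "\n";
```

Assigning the empty list is the usual choice when the whole hash goes away. The slice form fits when only some keys should be removed.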

  And here is the outcome.

Then...
> perl5.7.3 -MEncode=encodings -e 'print join(",", sort(encodings())),"\n"'
UCS-2,UCS-2le,US-ascii,UTF-8,cp1250,cp1251,cp1252,cp1253,cp1254,cp1255,
cp1256,cp1257,cp1258,iso-8859-1,iso-8859-10,iso-8859-11,iso-8859-13,
iso-8859-14,iso-8859-15,iso-8859-16,iso-8859-2,iso-8859-3,iso-8859-4,
iso-8859-5,iso-8859-6,iso-8859-7,iso-8859-8,iso-8859-9,koi8-r,
maccenteuro,maccroatian,maccyrillic,macdingbats,macgreek,maciceland,
macroman,macrumanian,macsami,macthai,macturkish,macukraine,viscii

And now...
> perl5.7.3 -Mblib -MEncode=encodings -e \
  'print join(",",  sort(encodings())),"\n"'
UCS-2,UCS-2le,US-ascii,UTF-8

  Yet there is no problem if you do it like this:

> perl5.7.3 -MEncode -e 'print encode("iso-8859-1", "There!"),"\n";' \
  -e 'print join(",", sort(Encode::encodings())), "\n"'
There!
UCS-2,UCS-2le,US-ascii,UTF-8,cp1250,cp1251,cp1252,cp1253,cp1254,cp1255,
cp1256,cp1257,cp1258,iso-8859-1,iso-8859-10,iso-8859-11,iso-8859-13,
iso-8859-14,iso-8859-15,iso-8859-16,iso-8859-2,iso-8859-3,iso-8859-4,
iso-8859-5,iso-8859-6,iso-8859-7,iso-8859-8,iso-8859-9,koi8-r,
maccenteuro,maccroatian,maccyrillic,macdingbats,macgreek,maciceland,
macroman,macrumanian,macsami,macthai,macturkish,macukraine,viscii

In terms of functionality, nothing has changed. CJK users may save 200KB or so when they don't need the ISO-8859 series. But the best thing about it is that the top directory is much cleaner now.
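
The same check can be run from a script rather than a one-liner. This is a rough sketch that assumes a reasonably current Encode, where encodings() with no arguments lists only what is already loaded. The helper name loaded_encodings is made up for illustration, and which names are preloaded depends on the Encode version installed.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Return a hash set of the encoding names Encode currently has loaded.
sub loaded_encodings {
    return map { $_ => 1 } Encode->encodings();
}

my %before = loaded_encodings();

# Asking for a single-byte encoding pulls in the module that provides it.
my $bytes = encode("iso-8859-2", "There!");

my %after = loaded_encodings();
my @new   = sort grep { !$before{$_} } keys %after;

print "$bytes\n";
print "newly loaded: ", join(",", @new), "\n";
```

Nothing has to be loaded explicitly: the first encode() call that names an encoding is enough to make it appear in the list.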

I'm not sure whether the vendor encodings (cp* and mac*) should go into different namespaces as well, but in terms of size I think the current situation is okay.

> ls -sF `find blib/arch/auto/Encode  -name \*.so` # on FreeBSD 4.5-STABLE
 168 blib/arch/auto/Encode/Byte/Byte.so*
1608 blib/arch/auto/Encode/CN/CN.so*
  19 blib/arch/auto/Encode/EBCDIC/EBCDIC.so*
  27 blib/arch/auto/Encode/Encode.so*
1800 blib/arch/auto/Encode/JP/JP.so*
1312 blib/arch/auto/Encode/KR/KR.so*
  21 blib/arch/auto/Encode/Symbol/Symbol.so*
1296 blib/arch/auto/Encode/TW/TW.so*
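
For platforms where the ls/find combination behaves differently, a similar listing can be sketched in Perl. The function name shared_object_sizes is made up for illustration. Sizes are rounded up to whole kilobytes, which is an assumption and may not match the block units ls reports on a given system.

```perl
use strict;
use warnings;
use File::Find ();

# Collect the size (in kilobytes, rounded up) of every plain file under
# $dir whose name ends in the given suffix (".so" by default).
sub shared_object_sizes {
    my ($dir, $suffix) = @_;
    $suffix = '.so' unless defined $suffix;
    my %size;
    File::Find::find(
        {
            no_chdir => 1,    # keep $_ as the full path
            wanted   => sub {
                return unless -f $_ && /\Q$suffix\E\z/;
                $size{$_} = int(((-s _) + 1023) / 1024);
            },
        },
        $dir
    );
    return %size;
}

# Print the listing only if a build tree is present.
my $root = 'blib/arch/auto/Encode';
if (-d $root) {
    my %size = shared_object_sizes($root);
    printf "%5d %s\n", $size{$_}, $_ for sort keys %size;
}
```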

Another thing I am not sure about is whether you like the name Encode::Byte. If you don't, tell me, and I'll rename it if your suggestion looks better.

Dan the Yet Another Encode Hacker