perl-unicode

[Encode] HEADS-UP: NC patch will be in

2002-11-04 04:30:04
NC and porters,

First of all, this is a great patch. Not only does it optimize the resulting shlibs, it also seems to consume less memory during compilation.

On Monday, Nov 4, 2002, at 12:26 Asia/Tokyo, hv(_at_)crypt(_dot_)org wrote:
Nicholas Clark <nick(_at_)unfortu(_dot_)net> wrote:
:I've been experimenting with how enc2xs builds the C tables that turn into the
:shared objects. enc2xs is building tables (arrays of struct encpage_t) which
:in turn have pointers to blocks of bytes.

Great, you seem to be getting some excellent results.

Worked absolutely fine on my PowerBook G4, too.

Before:
  208948 Encode/Byte/Byte.bundle
 1984416 Encode/CN/CN.bundle
   30076 Encode/EBCDIC/EBCDIC.bundle
   33728 Encode/Encode.bundle
 2590420 Encode/JP/JP.bundle
 2208996 Encode/KR/KR.bundle
   39720 Encode/Symbol/Symbol.bundle
 1940288 Encode/TW/TW.bundle
   17892 Encode/Unicode/Unicode.bundle

After:
  178220 Encode/Byte/Byte.bundle
 1085116 Encode/CN/CN.bundle
   25336 Encode/EBCDIC/EBCDIC.bundle
   33604 Encode/Encode.bundle
 1308568 Encode/JP/JP.bundle
 1209804 Encode/KR/KR.bundle
   34896 Encode/Symbol/Symbol.bundle
 1059040 Encode/TW/TW.bundle
   17892 Encode/Unicode/Unicode.bundle
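
To put a number on the difference, the two listings above can be totalled with a few lines of script. This is only a rough sketch: it assumes the figures are file sizes in bytes, and the dictionary and function names are illustrative, not part of Encode or enc2xs.

```python
# Bundle sizes copied from the Before/After listings above
# (assumed to be bytes; keys are the bundle names without the path).
before = {
    "Byte": 208948, "CN": 1984416, "EBCDIC": 30076, "Encode": 33728,
    "JP": 2590420, "KR": 2208996, "Symbol": 39720, "TW": 1940288,
    "Unicode": 17892,
}
after = {
    "Byte": 178220, "CN": 1085116, "EBCDIC": 25336, "Encode": 33604,
    "JP": 1308568, "KR": 1209804, "Symbol": 34896, "TW": 1059040,
    "Unicode": 17892,
}


def savings(before, after):
    """Return {name: (bytes_saved, percent_saved)} for each bundle."""
    return {
        name: (before[name] - after[name],
               100.0 * (before[name] - after[name]) / before[name])
        for name in before
    }


per_bundle = savings(before, after)
total_before = sum(before.values())
total_after = sum(after.values())
total_saved = total_before - total_after

for name, (saved, pct) in per_bundle.items():
    print(f"{name:8s} {saved:9d} bytes saved ({pct:5.1f}%)")
print(f"{'total':8s} {total_saved:9d} bytes saved "
      f"({100.0 * total_saved / total_before:5.1f}%)")
```

With these figures the total drops from 9054484 to 4952476, roughly 45% smaller. The CN, JP, KR and TW bundles account for nearly all of the saving, while Encode.bundle and Unicode.bundle are essentially unchanged.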

I have also wondered whether the .ucm files are needed after these
have been built; if not, we should consider supplying with perl only
the optimised table data if that could give us a space saving in the
distribution - it would cut build time significantly as well as
allowing us to consider algorithms that take much longer over the
table optimisation, since they need be run only once when we
integrate updated .ucm files.

A trivial yet effective patch is to strip all comments from them. That should save space dramatically, but since the *.ucm files are, in a way, source, I am not sure I should go for it....
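
The comment-stripping idea could look something like the sketch below. It is not an existing Encode tool: the function name and the sample text are hypothetical, and it assumes that '#' always starts a comment in a .ucm file and never appears inside a quoted header value, which would need checking against the real files first.

```python
def strip_ucm_comments(text):
    """Drop '#' comments and blank lines from .ucm-style text.

    Assumption: '#' always starts a comment and never occurs inside a
    quoted header value; a real tool should verify that first.
    """
    kept = []
    for line in text.splitlines():
        # Keep only what precedes the first '#', minus trailing spaces.
        code = line.split("#", 1)[0].rstrip()
        if code:
            kept.append(code)
    return "\n".join(kept) + "\n"


# Hypothetical sample for illustration, not taken from a real .ucm file.
sample = """\
# Hypothetical sample header comment
<code_set_name> "example"
CHARMAP
<U0041> \\x41 |0 # LATIN CAPITAL LETTER A
END CHARMAP
"""

stripped = strip_ucm_comments(sample)
print(stripped, end="")
print(len(sample), "->", len(stripped), "characters")
```

Keeping the unstripped files under version control and shipping only the stripped copies would be one way around the "it is source" worry, at the cost of an extra step whenever the .ucm files are updated.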

Anyway, I am very much in favor of integrating NC's patch, not just because it reduces shlib sizes but also because it appears to be safer for compilers (one of the optimizer features, AGGREGATE_TABLES, was dropped during the development phase of perl 5.8 for the sake of djgpp and other low-memory platforms). Unfortunately I am at my parents' place this week (to finish the book I am writing, away from the kids), so I do not have many resources for extensive tests; the FreeBSD box I was using here at my parents' place died (physically) the day before I came :-(

Another concern is that, since it changes the internal structure of the shlibs, CPANized Encode::* modules would need to be rebuilt as well, so the released version would need to print a warning about that -- oh wait! Encode.xs remains unchanged, so Encode::* may still work....

Thank you, NC.

Dan the Encode Maintainer