perl-unicode

Re: [not-yet-a-PATCH] compress Encode better

2002-12-21 17:30:04
On Mon, Nov 04, 2002 at 03:26:16AM +0000, hv(_at_)crypt(_dot_)org wrote:
Nicholas Clark <nick(_at_)unfortu(_dot_)net> wrote:
:I've been experimenting with how enc2xs builds the C tables that turn into
:the shared objects. enc2xs is building tables (arrays of struct encpage_t)
:which in turn have pointers to blocks of bytes.

Great, you seem to be getting some excellent results.

I have also wondered whether the .ucm files are needed after these
have been built; if not, we should consider supplying with perl only
the optimised table data if that could give us a space saving in the
distribution - it would cut build time significantly as well as
allowing us to consider algorithms that take much longer over the
table optimisation, since they need be run only once when we
integrate updated .ucm files.

Hmm, I wonder how distributable an optimal algorithm could be, and
how many SETI-hours it would take to run? :)

Well, the brute force search could take a little while:

perl5.8.0 ../bin/enc2xs -B -Q -O -o experiment.c -f symbol_t.fnm 
Reading AdobeSymbol (AdobeSymbol)
Reading AdobeZdingbat (AdobeZdingbat)
Reading dingbats (dingbats)
Reading MacDingbats (MacDingbats)
Reading MacSymbol (MacSymbol)
Reading symbol (symbol)
Writing compiled form
Preparing for brute force search at Sat Dec 21 23:42:34 2002
There are 167 strings, 1.5e+300 permutations to try, target to beat is 1762
Total length is 1762
Starting brute force search at Sat Dec 21 23:42:34 2002
Depth 152 try 152 
'00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000111111111111111'
 length already 1762, best is 1762, so pruning, at Sat Dec 21 23:42:34 2002

That looks to be one of the faster ones. Most of the rest give things like this:
There are 1263 strings, Inf permutations to try, target to beat is 12764

That string of 0s and 1s is part of the state record, mostly for debugging.
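
For a sense of scale, the permutation counts in the output above are consistent with plain factorials: 167! is roughly 1.5e+300, and 1263! is too large for a double, which would explain the "Inf". A minimal Python sketch of that arithmetic follows; the helper name permutation_count is hypothetical and nothing here is taken from enc2xs itself.

```python
import math


def permutation_count(n):
    """Return n! as a float, or infinity when it overflows a double.

    Hypothetical helper for illustration; not part of enc2xs.
    """
    try:
        return float(math.factorial(n))
    except OverflowError:
        # Python integers are unbounded, but converting to float is not.
        return math.inf


if __name__ == "__main__":
    print(f"{permutation_count(167):.1e}")  # the 167-string case
    print(permutation_count(1263))          # the 1263-string case
```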

I think I need some of Damian's parallel Universes. Else I'm going to wear
this one out. The brute force search can quickly match the result of the
current (non -O) algorithm for small cases, but not that of the current -O
algorithm. So I'm nowhere near beating it. I need better cheats. Er, shortcuts.
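
To illustrate the kind of search being described, here is a rough Python sketch of a brute-force ordering search with pruning. It assumes the goal is to order the strings so that overlapping ends can be shared, and that the "target to beat" starts as the plain concatenated length. Those are assumptions on my part, not a description of what enc2xs actually does. The names overlap and brute_force are hypothetical, and the used-flag list is only a guess at what a 0/1 state record might track.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def brute_force(strings):
    """Try every ordering, pruning branches that cannot beat the best so far.

    Returns (best_length, best_merged_string).
    """
    best_len = sum(len(s) for s in strings)  # target to beat
    best_merged = "".join(strings)
    used = [False] * len(strings)            # one flag per string

    def search(depth, merged):
        nonlocal best_len, best_merged
        # The merged string only ever grows, so once it is as long as the
        # best known result this branch cannot win: prune it.
        if len(merged) >= best_len:
            return
        if depth == len(strings):
            best_len, best_merged = len(merged), merged
            return
        for i, s in enumerate(strings):
            if used[i]:
                continue
            used[i] = True
            if s in merged:
                search(depth + 1, merged)    # already covered, costs nothing
            else:
                search(depth + 1, merged + s[overlap(merged, s):])
            used[i] = False

    search(0, "")
    return best_len, best_merged


if __name__ == "__main__":
    print(brute_force(["abc", "bcd", "cde"]))
```

Even with pruning the worst case is still factorial in the number of strings, which is why this only finishes quickly for small inputs or when the first ordering tried already hits the bound.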

Nicholas Clark
-- 
Brainfuck better than perl?     http://www.perl.org/advocacy/spoofathon/
