perl-unicode

Re: Performance and interface of Encode(3pm) in perl 5.8.0-RC1

2002-07-11 07:30:07
Hi,

On Wed, Jul 10, 2002 at 11:44:47PM +0300, Jarkko Hietaniemi wrote:
> There has been some talk of Encode possibly caching internally
> "fast paths" for small (like one eight-bit cset to another)
> conversions.  But it was decided that we better get it first
> working (and out of the door with 5.8.0) and only then try to
> make it faster.

In my pure Perl implementation the "fast paths" are trivial to
implement and save a lot of time (for 8-bit values, looking up the
UTF-8 codes in an array is a lot faster than calculating them
in Perl).  For a C implementation I agree with you: calculating
the fastest path can be more expensive than a brute-force approach,
and the potential gain is relatively small.
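
A minimal sketch of what such a pure Perl fast path could look like.
It assumes ISO-8859-1 as the 8-bit source charset, and the names
@latin1_to_utf8 and latin1_bytes_to_utf8_bytes() are made up for
illustration; this is not the code from the attachment.

```perl
use strict;
use warnings;

# Assumed fast path: compute the UTF-8 byte sequence for each of the
# 256 ISO-8859-1 code points once, then convert by array lookup.
my @latin1_to_utf8 = map {
    $_ < 0x80
        ? chr($_)                                          # ASCII: one byte
        : chr(0xC0 | ($_ >> 6)) . chr(0x80 | ($_ & 0x3F))  # two bytes
} 0 .. 255;

# Convert a string of ISO-8859-1 bytes into a string of UTF-8 bytes.
sub latin1_bytes_to_utf8_bytes {
    my ($bytes) = @_;
    return join '', map { $latin1_to_utf8[$_] } unpack 'C*', $bytes;
}
```

The bit arithmetic runs only 256 times while the table is built;
after that, each input byte costs one array lookup.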

> I won't comment more on the OO vs non discussion except to note
> that method dispatch is slow in Perl, so that would tug the boat
> the other way...

I'm perfectly happy with a procedural interface; I'm not an OO addict. ;-)

My point was: the information on how to convert a stream from one
encoding to another is valuable and potentially expensive to retrieve.
The usual sequence for using a resource is open, use*, close, and
not open, use, forget, open, use, forget, ...

See the attachment MyEncode.pm for my basic idea.  The function
recode() does exactly the same thing as from_to(), only three
times faster in that particular case.
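
To illustrate the open, use*, close idea without the attachment at
hand, here is a rough sketch. It assumes that looking up the two
encoding objects once with find_encoding() and reusing them is the
kind of saving meant; make_recoder() is a hypothetical name and not
the recode() from MyEncode.pm.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Hypothetical helper: resolve both encoding objects once ("open")
# and return a closure that reuses them on every call ("use*").
sub make_recoder {
    my ($from, $to) = @_;
    my $decoder = find_encoding($from) or die "Unknown encoding: $from";
    my $encoder = find_encoding($to)   or die "Unknown encoding: $to";
    return sub {
        my ($octets) = @_;    # work on a copy, leave the caller's data alone
        return $encoder->encode($decoder->decode($octets));
    };
}

# One lookup, many conversions.
my $latin1_to_utf8 = make_recoder('iso-8859-1', 'utf-8');
my @converted = map { $latin1_to_utf8->($_) } "caf\xE9", "na\xEFve";
```

Whether this gives the same factor as quoted above depends on the
encodings and the input sizes; the sketch only shows the shape of
the interface.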

Stateful conversions are a potential problem with my approach.  I haven't
checked how the encoding modules behave when they are reused (after
a possible failure).

Ciao

Guido
-- 
Imperia AG, Development
Leyboldstr. 10 - D-50354 Hürth - http://www.imperia.de/

Attachment: MyEncode.pm
Description: Text document
