perl-unicode

Re: Performance and interface of Encode(3pm) in perl 5.8.0-RC1

2002-07-11 08:30:04
Hi,

On Thu, Jul 11, 2002 at 12:15:30PM +0100, Nick Ing-Simmons wrote:
> For my Tk application of encode the in-place form causes unnecessary
> copies. e.g. I need the original and the form encoded into the encoding
> required by the font, or I have to copy the input arg to return location.

But whether the caller or the callee makes the copy should make no 
difference in performance.  I personally prefer to make copies as
late as possible.

> Doing in-place is very hard to do when converting between two variable
> length encodings. I suspect your "all perl" version is not _really_
> doing it "in place" but just in same scalar, but in different PV "buffers".

Correct.  But (see your own example below) I could also write
something like

        # n stands for the index of the string argument; y/// does not
        # interpolate variables, so s///g is used for the replacement
        $_[n] =~ s/[\x80-\xff]/$subchar/g;

for many 8-bit to ASCII encodings and leave the decision whether to
keep a copy of the original to the caller.
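
To make the idea concrete, here is a minimal sketch of an in-place
conversion where the caller decides about the copy. The sub name
to_ascii_inplace and the default '?' are made up for illustration and
are not part of Encode; the sketch only relies on @_ aliasing the
caller's variables.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Encode): replace every high-bit byte
# of the first argument, in place, with a substitution character.
# @_ aliases the caller's variables, so modifying $_[0] changes the
# caller's scalar directly.
sub to_ascii_inplace {
    my $subchar = defined $_[1] ? $_[1] : '?';
    $_[0] =~ s/[\x80-\xff]/$subchar/g;
    return;
}

my $original = "na\xefve caf\xe9";

# The caller decides whether the original survives: copy first, then
# convert the copy.
my $converted = $original;
to_ascii_inplace($converted);

print "$converted\n";    # prints "na?ve caf?"
```

A caller that does not need the original simply passes it directly and
no copy is made at all.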

> The Encode API is written to allow the core of encodings to be written in C.
> Keeping return value and source separate is very useful for C.

However, do you need witchcraft to copy a string buffer in C if the
need for it arises?

> I would use Encode that way as well.
>
>   my $enc = find_encoding('cp1250');
>   my $string  = decode($enc,$octets);
>
> That's it. ;-)

Provided that it is safe to call decode() and encode() as many times
as I want, even after an error, that's exactly what I was looking
for.
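
A small sketch of the kind of reuse I have in mind, assuming a current
Encode where the encoding object has a decode() method that takes a
CHECK argument. The wrapper try_decode is a made-up name, and 'ascii'
is used only because a high-bit byte is certain to be malformed for it.
This shows one failing call followed by one successful call on the same
object; it does not prove the general guarantee.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# 'ascii' is chosen only so that the failing input is unambiguous; the
# same pattern would apply to an object from find_encoding('cp1250').
my $enc = find_encoding('ascii') or die "encoding not available";

# Hypothetical wrapper: decode strictly, return undef on malformed input.
sub try_decode {
    my ($encoding, $octets) = @_;
    # FB_CROAK makes decode() die on malformed data; LEAVE_SRC keeps
    # decode() from modifying $octets.
    my $string = eval {
        $encoding->decode($octets,
            Encode::FB_CROAK() | Encode::LEAVE_SRC());
    };
    return $string;
}

my $bad  = try_decode($enc, "caf\xe9");    # fails: 0xE9 is not ASCII
my $good = try_decode($enc, "cafe");       # same object, used again

print defined $bad ? "first call decoded\n" : "first call failed\n";
print "second call gave: $good\n";
```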

Thanks!

Guido
-- 
Imperia AG, Development
Leyboldstr. 10 - D-50354 Hürth - http://www.imperia.de/
