perl-unicode

chop fails on decoded string with trailing nul

2005-07-13 09:17:05
[Apologies if this is a duplicate posting -- still learning my way
around Gnus]

Hi,

I ran into this, and wondered if it is a bug.

I have tested on perl 5.8.4 with Encode.pm version 1.99_01 (from
Debian package) and 2.10 (from CPAN).

Basically, if I take a string with a trailing nul, encode it (to any
encoding, even "ascii"), decode it, then chop it, chop returns undef
and the string still has the trailing nul.  If the string instead has
a trailing newline (for example), the chop works correctly.
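
A smaller reproduction of the behaviour described above might look like the following sketch. The helper name `chop_report` is hypothetical, and printing the internal UTF8 flag is only a guess at what distinguishes the two strings, not a confirmed cause. On a perl where chop behaves correctly, both lines should show the length dropping by one.

```perl
#!/usr/bin/perl -w
use strict;
use Encode;

# Hypothetical helper: chop a copy of $str and report what happened.
sub chop_report {
    my ($str) = @_;
    my $flag    = utf8::is_utf8($str) ? 1 : 0;   # internal UTF8 flag
    my $before  = length $str;
    my $removed = chop $str;                     # chop returns the removed character
    return ($flag, defined $removed ? 1 : 0, $before, length $str);
}

my $plain   = "goodbye, cruel world!\0";
my $decoded = decode('UTF-16LE', encode('UTF-16LE', $plain));

# Compare the untouched string with the round-tripped one.
for my $case ([plain => $plain], [decoded => $decoded]) {
    my ($label, $str) = @$case;
    my ($flag, $ok, $before, $after) = chop_report($str);
    printf "%-8s utf8=%d chop_defined=%d length %d -> %d\n",
        $label, $flag, $ok, $before, $after;
}
```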

Am I missing something?

Here is sample output from my test code below:

--
@asc (en/de-coded) before chop
$VAR1 = "hello, world!\n";
$VAR2 = "goodbye, cruel world!\0";

@asc2 (untouched) before chop
$VAR1 = "hello, world!\n";
$VAR2 = "goodbye, cruel world!\0";

@asc (en/de-coded) after chop
$VAR1 = "hello, world!";
$VAR2 = "goodbye, cruel world!\0";

@asc2 (untouched) after chop
$VAR1 = "hello, world!";
$VAR2 = "goodbye, cruel world!";
--

And here is my code:

--
#!/usr/bin/perl -w

use strict;
use Encode;
use Data::Dumper;

$Data::Dumper::Useqq = 1;

my @asc = ("hello, world!\n", "goodbye, cruel world!\0");
my @asc2 = @asc;    # copy of untouched strings

my @utf = (encode('UTF-16LE', $asc[0]),
           encode('UTF-16LE', $asc[1]));

@asc = (decode('UTF-16LE', $utf[0]),
        decode('UTF-16LE', $utf[1]));

print "\n\n";
print "\@asc (en/de-coded) before chop\n", Dumper(@asc), "\n";
print "\@asc2 (untouched) before chop\n", Dumper(@asc2), "\n";
chop @asc;
chop @asc2;
print "\@asc (en/de-coded) after chop\n", Dumper(@asc), "\n";
print "\@asc2 (untouched) after chop\n", Dumper(@asc2), "\n";
print "\n\n";
--
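
If chop cannot be relied on for these strings, one possible workaround is to remove the last character with four-argument substr instead. This is only a sketch: `chop_last_char` is a hypothetical helper name, and it has not been checked against perl 5.8.4, so whether it sidesteps the problem there is an assumption.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: remove and return the last character of the
# string referenced by $ref, without calling chop.
sub chop_last_char {
    my ($ref) = @_;
    # Nothing to remove from an undefined or empty string.
    return '' unless defined $$ref && length $$ref;
    # Four-argument substr replaces the last character with '' and
    # returns the character that was removed.
    return substr($$ref, -1, 1, '');
}

my $str     = "goodbye, cruel world!\0";
my $removed = chop_last_char(\$str);
printf "removed ord=%d, remaining length=%d\n", ord($removed), length $str;
```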

-- 
--------------------------------------------------------------------------
Jonathan Hankins        Homewood City Schools

jhankins(_at_)homewood(_dot_)k12(_dot_)al(_dot_)us
--------------------------------------------------------------------------
