perl-unicode

Re: Overlong UTF-8 (Re: Make Encode.pm support the real UTF-8)

2004-12-03 05:30:05
Gisle Aas <gisle(_at_)ActiveState(_dot_)com> writes:

bash-2.05b$ cat xxx.pl
if (@ARGV) {
    print "Hi\n";
    if ($ARGV[0] eq "encoding") {
        binmode(STDIN, ':encoding(utf8)');
    }
    elsif ($ARGV[0] eq "utf8") {
        binmode(STDIN, ':utf8');
    }

    my $data = <STDIN>;

    use Data::Dumper;
    print Dumper($data);
}
else {
    print "foo\xf0\x80\x80\x80bar\n";
}
bash-2.05b$ perl xxx.pl | perl xxx.pl raw
Hi
$VAR1 = 'fooðbar
';
bash-2.05b$ perl xxx.pl | perl xxx.pl encoding
Hi
utf8 "\xF0" does not map to Unicode at xxx.pl line 10.
utf8 "\xF0" does not map to Unicode at xxx.pl line 10.
$VAR1 = 'foo';

What is also interesting is that perl-5.8.5 behaves much more "sanely"
in this case: it warns about each of the four bytes and keeps the rest
of the line.

$ perl xxx.pl | perl xxx.pl encoding
Hi
utf8 "\xF0" does not map to Unicode at xxx.pl line 10.
utf8 "\x80" does not map to Unicode at xxx.pl line 10.
utf8 "\x80" does not map to Unicode at xxx.pl line 10.
utf8 "\x80" does not map to Unicode at xxx.pl line 10.
$VAR1 = 'foo\\xF0\\x80\\x80\\x80bar
';

Was this an intentional change?
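
For anyone following along, here is a rough sketch of why the four bytes
in the test string count as overlong. It is not how Encode or the :utf8
layer does its checking; the helper names naive_decode4 and is_overlong4
are made up for illustration, and the code only handles 4-byte sequences.

```perl
use strict;
use warnings;

# Hypothetical helper: unpack the payload bits of a 4-byte UTF-8
# sequence (11110xxx 10xxxxxx 10xxxxxx 10xxxxxx) with no validity checks.
sub naive_decode4 {
    my ($bytes) = @_;
    my @b = unpack 'C4', $bytes;
    return (($b[0] & 0x07) << 18)
         | (($b[1] & 0x3F) << 12)
         | (($b[2] & 0x3F) << 6)
         |  ($b[3] & 0x3F);
}

# Hypothetical helper: a 4-byte sequence is overlong when the code point
# it carries would have fit in 3 bytes or fewer, i.e. is below U+10000.
sub is_overlong4 {
    my ($bytes) = @_;
    return naive_decode4($bytes) < 0x10000;
}

my $seq = "\xf0\x80\x80\x80";    # the bytes from xxx.pl
printf "naive code point: U+%04X, overlong: %s\n",
    naive_decode4($seq), (is_overlong4($seq) ? "yes" : "no");
# prints: naive code point: U+0000, overlong: yes
```

So the sequence is a 4-byte spelling of U+0000, which is why a strict
decoder has to reject it rather than pass it through.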