Gisle Aas <gisle(_at_)ActiveState(_dot_)com> writes:
bash-2.05b$ cat xxx.pl
if (@ARGV) {
    print "Hi\n";
    if ($ARGV[0] eq "encoding") {
        binmode(STDIN, ':encoding(utf8)');
    }
    elsif ($ARGV[0] eq "utf8") {
        binmode(STDIN, ':utf8');
    }
    my $data = <STDIN>;
    use Data::Dumper;
    print Dumper($data);
}
else {
    print "foo\xf0\x80\x80\x80bar\n";
}
bash-2.05b$ perl xxx.pl | perl xxx.pl raw
Hi
$VAR1 = 'fooðbar
';
bash-2.05b$ perl xxx.pl | perl xxx.pl encoding
Hi
utf8 "\xF0" does not map to Unicode at xxx.pl line 10.
utf8 "\xF0" does not map to Unicode at xxx.pl line 10.
$VAR1 = 'foo';
What is also interesting is that perl-5.8.5 behaves much more sanely
in this case:
$ perl xxx.pl | perl xxx.pl encoding
Hi
utf8 "\xF0" does not map to Unicode at xxx.pl line 10.
utf8 "\x80" does not map to Unicode at xxx.pl line 10.
utf8 "\x80" does not map to Unicode at xxx.pl line 10.
utf8 "\x80" does not map to Unicode at xxx.pl line 10.
$VAR1 = 'foo\\xF0\\x80\\x80\\x80bar
';
Was this an intentional change?