Dan Kogai wrote:
use strict;
use warnings;
$\ = "\n";
use encoding "utf8";
my $e = chr(0xE3).chr(0x81).chr(0x82);
print $e =~ /^\x{3042}$/ ? 'true' : 'false';
print chr(0xE3).chr(0x81).chr(0x82) =~ /^\x{3042}$/ ? 'true' : 'false';
__END__
This prints "false" for the first but "true" for the next one. U+3042
(HIRAGANA LETTER A) in UTF-8 is \xE3\x81\x82 so bytewise they may match
but the UTF8 flag for chr(0xE3).chr(0x81).chr(0x82) is off so it should
not match (regardless of use (utf8|bytes)). So the first one is okay
but the second one is not.
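
For reference, here is a minimal sketch of the flag behaviour described above. It deliberately leaves out the encoding pragma, so it does not reproduce the discrepancy itself. It only shows that the three bytes are not the character U+3042 until they are decoded. The variable names $bytes and $chars are made up for illustration, and Encode's decode() is assumed to be an acceptable way to turn the flag on.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Three separate characters (0xE3, 0x81, 0x82); no encoding pragma in effect.
my $bytes = chr(0xE3).chr(0x81).chr(0x82);

# utf8::is_utf8 reports the internal UTF8 flag of the scalar.
print "UTF8 flag: ", (utf8::is_utf8($bytes) ? "on" : "off"), "\n";

# A three-character string cannot match a single U+3042.
print "raw bytes match: ", ($bytes =~ /^\x{3042}$/ ? "true" : "false"), "\n";

# Decoding the bytes as UTF-8 yields one character, U+3042.
my $chars = decode('UTF-8', $bytes);
print "decoded length: ", length($chars), "\n";
print "decoded match: ", ($chars =~ /^\x{3042}$/ ? "true" : "false"), "\n";
```

Without the pragma, the raw string should not match and the decoded one should, which is the behaviour expected of both print statements in the script above.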
Question:
I don't understand why chr(0xE3).chr(0x81).chr(0x82) should be
treated differently from "\xe3\x81\x82", given that constant folding
happens at compile time when constant strings are concatenated.
--
Untried is not *NIX