Sean M. Burke wrote:
At 02:37 PM 2003-06-06 +0000, Richard Evans wrote:
Having talked to a few folks about it, the answer was actually dead
simple:
BEGIN
{
    return unless $] >= 5.006;
    require utf8; "utf8"->import;
}
I've tested it with perl 5.8, 5.6 and a 5.5 nsperl (can't remember the
exact release offhand).
Hm. But the utf8->import has no effect outside of that block. To wit:
C> \perl628\bin\perl -v
This is perl, v5.6.1 built for MSWin32-x86-multi-thread
[...]
Binary build 628 provided by ActiveState Tool Corp.
Built 15:41:05 Jul 4 2001
[...]
C> type utf8_thingy.pl
my $x;
BEGIN {
    die "Nevermind!" unless $] >= 5.006;
    require utf8; "utf8"->import();
    $x = chr(338);
    $y = $x; $y =~ s/./!/g; print "A: length: ", length $y, "\n"
}
$y = $x; $y =~ s/./!/g; print "B: length: ", length $y, "\n";
C> \perl628\bin\perl utf8_thingy.pl
A: length: 2
B: length: 1
(Whereas if the utf8->import had effect outside of that block, I'd expect
to see "A: length: 2 B: length: 2". Of course, this is noticeable only
under those versions of Perl where use utf8 really did much at all, such
as the old version I use above.)
Interesting - yes, I get the same results as you with 5.6.1.
I was getting bug reports from beta testers about catenated non-utf8 and
utf8 strings giving the wrong results, so I came up with this as a quick
test:
BEGIN
{
    return unless $] >= 5.006;
    require utf8;
    "utf8"->import();
}
my @month_abbreviations =
(
    'janv.',
    'févr.',
    'mars',
    'avr.',
    'mai',
    'juin',
    'juil.',
    'août',
    'sept.',
    'oct.',
    'nov.',
    'déc.'
);
print $month_abbreviations[7] . "\x{999}\n";
If I comment out the utf8 lines, I get the following output:
aoûtঙ
Uncommented, I get:
aoûtঙ
./perl -v
This is perl, v5.6.1 built for i686-linux
This holds true for 5.8 and 5.6.1 on Linux, so I'm not sure exactly what's
happening. Looks like the utf8 pragma has an effect at the package and
block level?
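One rough way to probe where the pragma is in effect is to look at the utf8
hint bit in $^H from BEGIN blocks placed at different points in the file.
This is only a sketch under stated assumptions: record() and @probe are
made-up names for illustration, $utf8::hint_bits is the variable utf8.pm
itself uses, and it assumes $^H behaves as perlvar describes (a change made
while a BEGIN block runs applies to the enclosing scope that is still being
compiled, and is undone when that scope closes). It would need running on
each perl of interest, since older perls such as 5.6.1 may differ.

```perl
use strict;
use warnings;

our @probe;    # [label, 1|0] pairs, filled in at compile time

# Hypothetical helper: notes whether the utf8 hint bit is set in the scope
# being compiled right now. Only meaningful when called from a BEGIN block.
sub record {
    my ($label) = @_;
    push @probe, [ $label, ( $^H & $utf8::hint_bits ) ? 1 : 0 ];
}

# Load utf8.pm without importing it, so $utf8::hint_bits is defined.
BEGIN { require utf8; record('file scope, before import') }

{
    # Import inside a bare block: the hint belongs to this block's scope.
    BEGIN { utf8->import; record('inner block, after import') }
}

BEGIN { record('file scope, after inner block') }

# The construct from this thread: a guarded import at file level.
BEGIN {
    return unless $] >= 5.006;
    utf8->import;
}

BEGIN { record('file scope, after BEGIN import') }

printf "%-32s %s\n", @$_ for @probe;
```

On a recent perl this should report the hint as off before the import, on
inside the inner block, off again once that block closes, and on after the
file-level BEGIN import.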
Cheers,
Rich
--
Richard Evans
scriptyrich(_at_)yahoo(_dot_)co(_dot_)uk