Just replying to my own message as I got no answer at all... Was it a
completely silly question? Is there something I missed?
Help!
Hi list,
I am running Perl 5.6.1 on a Red Hat box, and I have come across this
weird (bug|feature|annoying thing). If this problem has been raised
before, please give me a reference to the "F" manual :-)
TEST SCRIPT:
============
use strict;
use utf8;
main();
sub main
{
    # \x{A9} is the copyright sign
    #
    my $data = "Copyright \x{A9} 2001-2002 MKDoc Ltd";
    my $dlm = '(?:\p{IsSpace}|\p{IsPunct})';
    my $re = 'MKDoc';
    print "BEFORE: $data\n";
    my @split = $data =~ /^(.*?$dlm)($re)($dlm.*?)$/ism;
    $data = join '', @split;
    print "AFTER : $data\n";
}
1;
And here is what I get:
[jhiver@frogette mkdoc]$ perl -w test2.pl
BEFORE: Copyright © 2001-2002 MKDoc Ltd
AFTER : Copyright © 2001-2002 MKDoc Ltd
[jhiver@frogette mkdoc]$
My terminal doesn't support UTF-8, which in this case is good because I
can see all the characters... Surprise: regex captures seem to remove
the string's utf8ness, although the string IS utf8 and 'use utf8' is
there...
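
To take the terminal out of the picture, the strings can be dumped as
code points instead of being printed raw. What follows is only a
minimal sketch of that idea, not something verified on 5.6.1:
codepoints() is a hypothetical helper that is not part of the test
script above, and the expectation that a capture which lost its
utf8ness would show the two bytes C2 A9 in place of the single A9 is an
assumption.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical helper (not in the original script): render a string as
# a space-separated list of hexadecimal code points, so the result does
# not depend on what the terminal can display.
sub codepoints {
    my ($str) = @_;
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $str;
}

my $data = "Copyright \x{A9} 2001-2002 MKDoc Ltd";
my $dlm  = '(?:\p{IsSpace}|\p{IsPunct})';
my $re   = 'MKDoc';

# Same match as in the test script, then glue the captures back together.
my @split  = $data =~ /^(.*?$dlm)($re)($dlm.*?)$/ism;
my $joined = join '', @split;

print "BEFORE: ", codepoints($data),   "\n";
print "AFTER : ", codepoints($joined), "\n";
print "same characters: ", ($data eq $joined ? "yes" : "no"), "\n";
```

If the two dumps differ around the copyright sign, the change happens
in the capture/join step and not in the terminal.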
--
IT'S TIME FOR A DIFFERENT KIND OF WEB
================================================================
Jean-Michel Hiver - Software Director
jhiver(_at_)mkdoc(_dot_)com
+44 (0)114 221 4968
================================================================
VISIT HTTP://WWW.MKDOC.COM