perl-unicode

\W and [\W]

2003-12-31 08:30:09
Hi list,

Can anyone enlighten me as to why \W behaves differently, for certain
characters, depending on whether it's inside or outside of a character
class?

This sample program:

use encoding 'utf8';
$x = 'Großbritannien';
$\ = "\n";
print '1 ', $x =~ /(\W+)/;
print '2 ', $x =~ /([\W]+)/;
print '3 ', $x =~ /(\w+)/;

...prints:

1
2 ß
3 Großbritannien

I do not understand why the Eszett matches [\W] in #2 when \W on its
own matches nothing in #1. The behavior is the same if I replace the
Eszett with another non-ASCII letter, e.g. "é".
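
To narrow this down, here is a minimal sketch that tests a single
character against \w, \W and [\W] separately, without the encoding
pragma. The classify() helper is hypothetical (my own name, not part of
any module), and using utf8::upgrade() to build a second copy of the
string with a UTF-8 internal representation is only an assumption about
where the difference might come from. I am not claiming any particular
output; the point is just to compare the rows on a given perl.

```perl
use strict;
use warnings;

# Hypothetical helper: report how one character is classified by
# \w, \W, and \W inside a bracketed character class.
sub classify {
    my ($ch) = @_;
    return join ' ',
        ($ch =~ /\w/   ? 'w=1'   : 'w=0'),
        ($ch =~ /\W/   ? 'W=1'   : 'W=0'),
        ($ch =~ /[\W]/ ? '[W]=1' : '[W]=0');
}

# U+00DF (Eszett) built from its code point, so the result does not
# depend on how this source file is encoded.
my $native = chr(0xDF);

# Same character, but forced into perl's internal UTF-8 representation.
my $upgraded = chr(0xDF);
utf8::upgrade($upgraded);

# An ASCII letter as a control: \w should match, \W and [\W] should not.
printf "%-9s %s\n", 'ascii',    classify('a');
printf "%-9s %s\n", 'native',   classify($native);
printf "%-9s %s\n", 'upgraded', classify($upgraded);
```

If the W and [W] columns disagree on any row, that row isolates the
inconsistency to one character and one internal representation.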

--
Eric Cholet
