Andreas J Koenig wrote in perl.unicode:
On Wed, 31 Dec 2003 16:21:36 +0100, Eric Cholet
<cholet(_at_)logilune(_dot_)com> said:
> Can anyone enlighten me as to why \W behaves differently depending
> on whether it's inside or outside of a character class, for certain
> characters:
I have reported this as bug 18281
http://guest:guest(_at_)rt(_dot_)perl(_dot_)org/rt3/Ticket/Display.html?id=18281
I don't think it is documented yet, and I cannot spot one obviously
right place to document it. perlre.pod and perlunicode.pod seem the
natural candidates.
And apparently fixing it is not trivial.
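For readers who want to check their own perl, here is a minimal sketch of the kind of comparison being discussed. The sample character (U+3042) and the labels are assumptions chosen for illustration; they are not taken from the original report or from bug 18281. On a perl that has the limitation, the "inside" rows may disagree with the "outside" rows; on a perl without it, they should agree.

```perl
use strict;
use warnings;

# Assumption: U+3042 (HIRAGANA LETTER A) serves as a sample word character
# above U+00FF. It is an illustrative choice, not the character from the report.
my $char = "\x{3042}";

# Each entry pairs a label with the result (1 or 0) of one match attempt.
my @checks = (
    [ '\w outside a class', ($char =~ /\w/   ? 1 : 0) ],
    [ '\w inside  [...]',   ($char =~ /[\w]/ ? 1 : 0) ],
    [ '\W outside a class', ($char =~ /\W/   ? 1 : 0) ],
    [ '\W inside  [...]',   ($char =~ /[\W]/ ? 1 : 0) ],
);

# Print one line per check so the inside/outside rows can be compared by eye.
printf "%-20s %s\n", $_->[0], ($_->[1] ? 'match' : 'no match') for @checks;
```

The script deliberately states no expected output for the two "inside" rows, since that is exactly the part that depends on the perl version.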
Does something like this suit you? It can at least make its way into
5.8.3.
Change 22031 by rgs(_at_)rgs-home on 2004/01/01 16:30:13
Document that /[\W]/ doesn't work, unicode-wise (see bug #18281)
Affected files ...
... //depot/perl/pod/perlunicode.pod#130 edit
Differences ...
==== //depot/perl/pod/perlunicode.pod#130 (text) ====
@@ -166,6 +166,10 @@
Unicode properties database. C<\w> can be used to match a Japanese
ideograph, for instance.
+(However, and as a limitation of the current implementation, using
+C<\w> or C<\W> I<inside> a C<[...]> character class will still match
+with byte semantics.)
+
=item *
Named Unicode properties, scripts, and block ranges may be used like