perl-unicode

Re: \W and [\W]

2004-01-02 13:30:05
Do negated classes work at all?
What does /[^\w]/ do?

(I looked at this stuff ages ago and I thought Unicode classes (including
 negated ones) worked; if that is true then the fix may just be the magical
 \W expander expanding to the wrong thing...)

I think it's the evil characters in the 0x80..0xFF range that can still
bite one in the nether parts, since they can be "legacy" or Unicode,
depending, and one part of the regex machinery gets it wrong. As Hugo is
quoted in the bug report by Andreas, one can't really trivially fix the
problem. In this particular case it's the sharp s (U+00DF) that's the
troublemaker. IIRC the problem lies in how the character classes are
implemented: they have "dual brains", one eight-bit and one Unicode, and
in this case the reptile legacy brain fires its neuron(s?) too early,
before the Unicode brain can engage itself. Or something like that. When
we medicated the legacy brain not to fire, other tests involving
characters in that range started failing.
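
A quick way to see which brain answers is to try \W, [\W] and [^\w]
against the sharp s in both internal representations. This is only a
sketch: the labels and variable names are made up for illustration, it
assumes a perl that has utf8::upgrade (5.8 or later), and the results
depend on the perl version, so no expected output is given here.

```perl
use strict;
use warnings;

# Sharp s (U+00DF) as a one-character string. Written this way it is
# stored as a single byte, i.e. the "legacy" eight-bit form.
my $legacy = "\xDF";

# The same character, but with the internal representation forced to
# UTF-8, so the Unicode side of the regex engine is the one consulted.
my $upgraded = "\xDF";
utf8::upgrade($upgraded);

# The three spellings that ought to agree with each other.
my @patterns = (
    [ '\W',    qr/\W/    ],
    [ '[\W]',  qr/[\W]/  ],
    [ '[^\w]', qr/[^\w]/ ],
);

for my $case ([ 'legacy', $legacy ], [ 'upgraded', $upgraded ]) {
    my ($label, $str) = @$case;
    for my $pat (@patterns) {
        my ($name, $re) = @$pat;
        printf "%-8s %-6s %s\n", $label, $name,
            ($str =~ $re ? 'match' : 'no match');
    }
}
```

If the three rows within one representation disagree, the class
expansion is the suspect; if the two representations disagree with each
other, it is the legacy/Unicode split.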


--
Jarkko Hietaniemi <jhi(_at_)iki(_dot_)fi> http://www.iki.fi/jhi/ "There is this special
biologist word we use for 'stable'.  It is 'dead'." -- Jack Cohen

