perl-unicode

Re: possible regexp feature for 5.6: "ignore diacritics"

1999-10-18 02:11:31
Jarkko Hietaniemi <jhi(_at_)iki(_dot_)fi> writes:
Russ Allbery writes:

Remind me; under POSIX, isn't [=ss=] supposed to work and match ß?

But this is Ilya's \N{}, right? :-)

Honestly, I don't remember.  Darn it, I must get my hands on a copy of
1003.2...

Found where I'd read about it.  Friedl, pp. 80-81.  Looks like we're okay
for character equivalents, although there is a mention of multi-character
names being allowed for character equivalents.  The tricky one is
collating sequences, which include [.span-ll.] and [.eszet.], both of
which can potentially match multiple characters (which I would guess would
cause severe pain to the regex engine).
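
To make the pain concrete, here is a rough sketch (in Python rather than
Perl, purely for illustration) of a negated bracket expression that has to
treat a multi-character collating element as one unit.  The element table
and both function names are made up for this example; they are not taken
from POSIX, Friedl, or any real regex engine.

```python
# Hypothetical locale table of multi-character collating elements.
# A real locale definition would supply these; "ll" and "ch" are assumed here.
MULTI_CHAR_ELEMENTS = ["ll", "ch"]


def split_collating_elements(text, multi=MULTI_CHAR_ELEMENTS):
    """Split text into collating elements, preferring multi-character ones."""
    elements = []
    i = 0
    while i < len(text):
        for m in multi:
            if text.startswith(m, i):
                elements.append(m)
                i += len(m)
                break
        else:
            # No multi-character element starts here: take one character.
            elements.append(text[i])
            i += 1
    return elements


def match_negated_bracket(text, pos, excluded, multi=MULTI_CHAR_ELEMENTS):
    """Return how many characters a negated bracket consumes at pos (0 = no match)."""
    for m in multi:
        if text.startswith(m, pos):
            # The whole element is tested, and consumed, as a single unit.
            return 0 if m in excluded else len(m)
    if pos < len(text) and text[pos] not in excluded:
        return 1
    return 0
```

With this table, a stand-in for [^123] applied at the start of "llama"
consumes two characters, while the same bracket applied to "lama" consumes
one.  An engine that assumes one bracket expression always equals one
character can no longer compute match lengths ahead of time.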

-- 
Russ Allbery (rra(_at_)stanford(_dot_)edu)         
<URL:http://www.eyrie.org/~eagle/>