perl-unicode

Re: Caseless and accentless string comparisons

2003-05-12 01:30:05
Martin Hosken <Martin_Hosken(_at_)sil(_dot_)org> writes:
> Dear Ben,
>
>> What is the equivalent transformation to remove accents?  perluniintro
>> says that you should do that in some cases, but doesn't say how.  I
>> have poked around a bit and nothing springs out at me.  Is there a
>> preferred way to do this?  Should I decompose the string then remove
>> the accent characters?  This seems really kludgy so there must be a
>> better way.
>
> I would use Unicode::Normalize to convert the string to NFC and then delete
> the combining characters:

But I thought NFC had some composed accented chars?

> $str = NFC($input);
> $str =~ s/\pM//og;
>
> Yours,
> Martin
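
If NFC does keep the precomposed characters, then \pM has nothing to match in a string like "résumé", and the decomposed form (NFD) is the one to strip from. A minimal sketch of that variant, assuming Unicode::Normalize is available; strip_accents is a made-up name for illustration, not something from this thread or from the module:

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC NFD);

# Hypothetical helper: decompose, drop combining marks, recompose.
sub strip_accents {
    my ($input) = @_;
    my $str = NFD($input);   # "e with acute" becomes "e" + U+0301
    $str =~ s/\pM//g;        # delete every combining mark
    return NFC($str);        # recompose whatever is left
}

# U+00E9 is LATIN SMALL LETTER E WITH ACUTE (precomposed).
my $word = "r\x{E9}sum\x{E9}";

# NFC keeps U+00E9 as a single letter, so \pM finds nothing to delete.
my $nfc_only = NFC($word);
$nfc_only =~ s/\pM//g;

print "NFC then strip: ", ($nfc_only eq $word ? "unchanged" : "changed"), "\n";
print "NFD then strip: ", strip_accents($word), "\n";
```

Letters that have no canonical decomposition (o with stroke, l with stroke, and so on) come through this untouched, so it is not a complete "accentless" fold.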
-- 
Nick Ing-Simmons
http://www.ni-s.u-net.com/