perl-unicode

Re: Caseless and accentless string comparisons

2003-05-11 23:30:06



Dear Ben,

> What is the equivalent transformation to remove accents?  perluniintro
> says that you should do that in some cases, but doesn't say how.  I
> have poked around a bit and nothing springs out at me.  Is there a
> preferred way to do this?  Should I decompose the string then remove
> the accent characters?  This seems really kludgy so there must be a
> better way.

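I would use Unicode::Normalize to convert the string to NFD (the
decomposed form) and then delete the combining characters.  NFC would
not work here: it recomposes each base letter and accent into a single
precomposed character, which \pM does not match.

use Unicode::Normalize;

$str = NFD($input);   # split accented letters into base + combining marks
$str =~ s/\pM//g;     # delete the combining marks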
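
For the caseless and accentless comparison in the subject line, the same
idea can be wrapped in a small helper.  This is only a sketch: fold_key is
a name I made up for illustration, it assumes the input is already a
decoded character string, and it decomposes to NFD first so that the
accents are separate code points that \pM can match.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD);

# Hypothetical helper (not part of any module): build a comparison key
# with accents removed and case folded.
sub fold_key {
    my ($s) = @_;
    $s = NFD($s);        # decompose: "\x{E9}" becomes "e" followed by U+0301
    $s =~ s/\pM//g;      # drop every combining mark
    return lc $s;        # lowercase for the caseless part
}

my $left  = "R\x{E9}sum\x{E9}";   # precomposed e-acute, twice
my $right = "RESUME";

print fold_key($left) eq fold_key($right) ? "match\n" : "no match\n";
# prints "match"
```

Letters that have no canonical decomposition, such as U+00F8 (o with
stroke), pass through unchanged, so they need separate handling if you
want them folded too.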

Yours,
Martin