On Mon, 12 May 2003 Martin_Hosken(_at_)sil(_dot_)org wrote:
says that you should do that in some cases, but doesn't say how. I
have poked around a bit and nothing springs out at me. Is there a
I would use Unicode::Normalize to convert the string to NFC and then delete
the combining characters:
$str = NFC($input);
$str =~ s/\pM//og;
You meant NFD, didn't you? NFC recomposes base letters and marks into
precomposed characters wherever it can, so \pM would have little left
to match. BTW, the proposed update of UTS #10
(http://www.unicode.org/reports/tr10/tr10-10.html) may be of interest
as well. Note that it is still a draft and needs some refining (for
instance, its Hangul Jamo handling is not satisfactory). Is there any
Perl module that implements Unicode collation as described in UTS #10,
or the collation algorithm specified in ISO 14651-to-be (as it stands)?
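
For what it's worth, the NFD variant of the snippet quoted above could be
sketched roughly as follows. The helper name strip_marks and the sample
string are made up for illustration; only NFD, NFC and the \pM property
come from Unicode::Normalize and Perl itself. It only helps for
characters that have a canonical decomposition; letters such as U+00F8
(o with stroke) do not decompose and pass through unchanged.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD NFC);

# Hypothetical helper (not part of Unicode::Normalize):
# decompose first, then delete every combining mark.
sub strip_marks {
    my ($input) = @_;
    my $str = NFD($input);   # e.g. U+00E9 becomes "e" + U+0301
    $str =~ s/\pM//g;        # drop all characters with General Category M*
    return $str;
}

# Sample input written with escapes so the source stays pure ASCII.
my $input = "r\x{E9}sum\x{E9}";

# With NFD the accents are separated and then removed.
print strip_marks($input), "\n";

# With NFC the precomposed letters survive, so \pM finds nothing here.
(my $nfc = NFC($input)) =~ s/\pM//g;
print "NFC version left the string unchanged\n" if $nfc eq $input;
```

The /o flag from the original one-liner is dropped here because the
pattern contains no interpolated variables, so it has no effect.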
Jungshik