perl-unicode

Re: Caseless and accentless string comparisons

2003-05-12 08:30:05

On Mon, 12 May 2003 07:46:37 -0400 (EDT)
Jungshik Shin <jshin(_at_)mailaps(_dot_)org> wrote:

  You meant NFD, didn't you?  BTW, the proposed update of UTS #10
(http://www.unicode.org/reports/tr10/tr10-10.html) may be of interest
as well. Note that it is still a draft and as such needs some refining
(for instance, Hangul Jamo handling is not satisfactory). Is there any Perl
module that implements Unicode collation as described in UTS #10 or the
collation algorithm specified in ISO 14651-to-be (as it stands)?

Unicode::Collate implements the UCA (Unicode Collation Algorithm)
as specified in UTS #10.
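
For the caseless and accentless comparison in the subject line, a minimal
sketch with Unicode::Collate might look like the following. It assumes that
comparing at level 1 (primary weights only) is acceptable for your purpose;
the sample strings are made up for illustration.

```perl
use strict;
use warnings;
use Unicode::Collate;

# level => 1 compares primary weights only, so differences in
# accents (level 2) and case (level 3) are ignored.
my $collator = Unicode::Collate->new(level => 1);

# "R\x{E9}sum\x{E9}" is "Resume" with two e-acute letters (U+00E9).
print $collator->eq("R\x{E9}sum\x{E9}", "resume") ? "equal\n" : "different\n";

# Strings that differ in a base letter still compare as different.
print $collator->eq("resume", "rescue") ? "equal\n" : "different\n";
```

The same object also provides cmp() and sort() if ordering, rather than
equality, is what you need.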

If I understand it correctly,
Trailing Weights would not make a Hangul syllable
(LVT, LV, LLVT, etc., including those with a mark) a single collation grapheme,
as long as each of L, V, and T is non-ignorable
and the UCA lacks a mechanism that allows a sequence of several
non-ignorable CEs to be regarded as one collation grapheme.
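
To make the L, V, T sequence concrete, here is a small sketch that uses
Unicode::Normalize to decompose a precomposed syllable under NFD. The choice
of U+D55C as the sample syllable is mine; it only shows that one syllable
becomes several conjoining jamo, each of which the collation table then has
to weight separately.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD);

# U+D55C is a precomposed Hangul syllable of the LVT type.
my $syllable = "\x{D55C}";

# NFD splits it into its conjoining jamo: one L, one V, and one T.
my @jamo = map { sprintf "U+%04X", ord($_) } split //, NFD($syllable);

print "@jamo\n";   # U+1112 U+1161 U+11AB
```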

Cf. Collation Grapheme
http://www.unicode.org/reports/tr10/tr10-10.html#Collation_Graphemes

SADAHIRO Tomoyuki