Re: Unicode::Collate question

Sadahiro Tomoyuki wrote:

So I guess I need a Lingua::XX::Sort module for each language I operate
on; in my original posting I was misled to believe that Unicode::Collate
would be the tool to use.

Thanks to all for the useful links provided in this thread.


As far as I could find, CPAN provides at least five modules
for collation localized for a specific natural language:
[package name, language name, encoding]

No::Sort, Norwegian, ISO-8859-1
    http://search.cpan.org/~gaas/Norge-1.07/

Cz::Sort, Czech, ISO-8859-2
    http://search.cpan.org/~janpaz/Cstools-3.42/

Lingua::Klingon::Collate, Klingon, ASCII/EBCDIC (Perl native)
    http://search.cpan.org/~pne/Lingua-Klingon-Collate-1.01/

Lingua::JA::Sort::JIS, Japanese, UTF-8
    http://search.cpan.org/~sadahiro/Lingua-JA-Sort-JIS-0.04/

ShiftJIS::Collate, Japanese, Shift-JIS
    http://search.cpan.org/~sadahiro/ShiftJIS-Collate-1.02/
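
To illustrate what a language-specific sort module has to do, here is
a minimal sketch in Python rather than Perl. It is not the API of any
module listed above. The alphabet string and the names
NORWEGIAN_ALPHABET, norwegian_key and sort_norwegian are assumptions
made up for this example; the only fact relied on is that the modern
Norwegian alphabet places æ, ø, å after z.

```python
# Hypothetical sketch: a per-language sort key built from an explicit alphabet.
# The alphabet below is an assumption for illustration (modern Norwegian
# order, where æ, ø, å follow z); it is not taken from No::Sort.
NORWEGIAN_ALPHABET = "abcdefghijklmnopqrstuvwxyzæøå"
RANK = {ch: i for i, ch in enumerate(NORWEGIAN_ALPHABET)}


def norwegian_key(word):
    """Map each letter to its rank in the alphabet.

    Characters outside the alphabet sort after all known letters,
    ordered among themselves by code point.
    """
    return [(RANK.get(ch, len(RANK)), ch) for ch in word.lower()]


def sort_norwegian(words):
    """Return the words sorted by the hypothetical Norwegian key."""
    return sorted(words, key=norwegian_key)


words = ["åre", "zebra", "ære", "øre", "ape"]

# Plain code point order puts å (U+00E5) before æ (U+00E6) and ø (U+00F8).
by_code_point = sorted(words)

# Alphabet order puts å last, as a Norwegian dictionary would.
by_alphabet = sort_norwegian(words)

print(by_code_point)
print(by_alphabet)
```

A real module also has to deal with case, digraphs, accents and
multi-level comparison, which this sketch leaves out on purpose.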

Regards,
SADAHIRO Tomoyuki


Has anyone had a look at the OpenI18N/ICU locale data?

The locales there are all UTF-8 and have Java rule-based collation data, so
they *might* be useful for creating a more comprehensive (and accurate) set
of sort modules. The downside is that this data is pretty rough at the
moment, but it does seem to be improving slowly.

I guess Perl 6 is going to use ICU as the basis for I18N - I sure hope the
APIs are easier though :)

Cheers
-- 
Rich
scriptyrich(_at_)yahoo(_dot_)co(_dot_)uk