Sadahiro Tomoyuki wrote:
So I guess I need a Lingua::XX::Sort module for each language I operate on.
In my original posting I was misled to believe that Unicode::Collate would
be the tool to use.
Thanks to all for the useful links provided in this thread.
As far as I could find, CPAN provides at least five modules
for collation tailored to a specific natural language:
[package name, language name, encoding]
No::Sort, Norwegian, ISO-8859-1
http://search.cpan.org/~gaas/Norge-1.07/
Cz::Sort, Czech, ISO-8859-2
http://search.cpan.org/~janpaz/Cstools-3.42/
Lingua::Klingon::Collate, Klingon, ASCII/EBCDIC (Perl native)
http://search.cpan.org/~pne/Lingua-Klingon-Collate-1.01/
Lingua::JA::Sort::JIS, Japanese, UTF-8
http://search.cpan.org/~sadahiro/Lingua-JA-Sort-JIS-0.04/
ShiftJIS::Collate, Japanese, Shift-JIS
http://search.cpan.org/~sadahiro/ShiftJIS-Collate-1.02/
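
To illustrate why such language-specific modules exist, here is a minimal
sketch of alphabet-based collation for Norwegian. It is written in Python
only for brevity and is not the API of any module listed above: the names
NORWEGIAN_ALPHABET and norwegian_key are hypothetical, and real Norwegian
collation has further rules (case, accents, digraphs) that are ignored here.

```python
# Hypothetical helper for illustration; not taken from No::Sort or any
# other module in the list above.
NORWEGIAN_ALPHABET = "abcdefghijklmnopqrstuvwxyzæøå"

# Position of each letter in the Norwegian alphabet.
_RANK = {ch: i for i, ch in enumerate(NORWEGIAN_ALPHABET)}


def norwegian_key(word):
    """Map each letter to its alphabet position; unknown characters sort last."""
    return [_RANK.get(ch, len(NORWEGIAN_ALPHABET)) for ch in word.lower()]


words = ["ål", "zebra", "øl", "æra"]

# Plain code point order puts å (U+00E5) before æ (U+00E6) and ø (U+00F8).
by_code_point = sorted(words)

# Alphabet order puts æ, ø, å after z, in that order.
by_alphabet = sorted(words, key=norwegian_key)

print(by_code_point)
print(by_alphabet)
```

The two results differ because the Norwegian alphabet ends in æ, ø, å, while
their code points are ordered å, æ, ø. Each language needs its own table of
this kind, which is roughly what a per-language sort module packages up.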
Regards,
SADAHIRO Tomoyuki
Has anyone had a look at the OpenI18N/ICU locale data?
The locales there are all UTF-8 and include Java rule-based collation data, so
they *might* be useful for creating a more comprehensive (and accurate) set
of sort modules. The downside is that this data is pretty rough at the moment,
though it does seem to be improving slowly.
I guess Perl 6 is going to use ICU as the basis for I18N. I sure hope the APIs
are easier, though :)
Cheers
--
Rich
scriptyrich(_at_)yahoo(_dot_)co(_dot_)uk