Need: list of Unicode characters that have canonical decompositions.

A project I'm working on needs to build a list of all Unicode characters that have canonical decompositions. The most efficient ways I can think of to get such a list are to read unicore/Decomposition.pl or to scan unicore/UnicodeData.txt. However:


Re unicore/Decomposition.pl, the header of this says:

# !!!!!!!   INTERNAL PERL USE ONLY   !!!!!!!
# This file is for internal use by the Perl program only.  The format and even
# the name or existence of this file are subject to change without notice.
# Don't use it directly.

Re unicore/UnicodeData.txt, I've recently posted to CPAN a version of my module that uses unicore/UnicodeData.txt, and from Perl 5.14 testers I've received only failure notices, which indicate that the file cannot be found :-(

Unicode::UCD can tell me if a specific character has a decomposition, but can't give me a list of characters that have decompositions.
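
For reference, the brute-force fallback this implies would be to ask Unicode::UCD about every code point in turn. Below is a rough sketch only: canonical_decomposables is a hypothetical helper name, and it assumes that charinfo() returns undef for unassigned code points and that compatibility mappings in the "decomposition" field carry a <tag> prefix, as they do in UnicodeData.txt. It is far too slow to be my preferred answer when run over all of 0 .. 0x10FFFF.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charinfo);

# Hypothetical helper: return the code points in [$first, $last] whose
# decomposition mapping is canonical, i.e. non-empty and not tagged
# with something like <compat> or <noBreak>.
sub canonical_decomposables {
    my ($first, $last) = @_;
    my @found;
    for my $cp ($first .. $last) {
        my $info = charinfo($cp) or next;   # assumed undef when unassigned
        my $decomp = $info->{decomposition};
        next unless defined $decomp && length $decomp;
        next if $decomp =~ /^</;            # compatibility mapping, skip it
        push @found, $cp;
    }
    return @found;
}

# A small range keeps the demonstration quick; the full scan would be
# canonical_decomposables(0, 0x10FFFF).
printf "U+%04X\n", $_ for canonical_decomposables(0x00C0, 0x00FF);
```

This stays within the documented Unicode::UCD interface, at the cost of one charinfo() call per code point.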


Any suggestions would be appreciated.

Bob
