perl-unicode

Unicode Normalization Forms

2001-08-08 05:19:23

A pre-release module that computes the Unicode Normalization Forms
(UAX #15) is now available.

See http://homepage1.nifty.com/nomenclator/perl/indexE.htm

NAME (a temporary name)

Text::Unicode::Normalize - normalized forms of Unicode text

SYNOPSIS

  use Text::Unicode::Normalize;

  $stringNFD  = NFD($string);  # Normalization Form D
  $stringNFC  = NFC($string);  # Normalization Form C
  $stringNFKD = NFKD($string); # Normalization Form KD
  $stringNFKC = NFKC($string); # Normalization Form KC
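
As a rough illustration of what the four functions return, here is a
minimal sketch. It assumes the module is loaded as Unicode::Normalize,
the name under which it ships with later Perl releases; with the
pre-release, load Text::Unicode::Normalize instead. The helper hexseq
is a hypothetical name used only for display.

```perl
use strict;
use warnings;
use Unicode::Normalize;   # assumption: stands in for Text::Unicode::Normalize

# Hypothetical helper: show a string as space-separated hex code points.
sub hexseq {
    my ($string) = @_;
    return join ' ', map { sprintf('%04X', ord($_)) } split //, $string;
}

my $composed   = "\x{00E9}";    # LATIN SMALL LETTER E WITH ACUTE
my $decomposed = "e\x{0301}";   # "e" followed by COMBINING ACUTE ACCENT
my $ligature   = "\x{FB01}";    # LATIN SMALL LIGATURE FI

print hexseq(NFD($composed)),   "\n";   # 0065 0301 (canonical decomposition)
print hexseq(NFC($decomposed)), "\n";   # 00E9      (canonical composition)
print hexseq(NFKD($ligature)),  "\n";   # 0066 0069 (compatibility decomposition)
print hexseq(NFC($ligature)),   "\n";   # FB01      (NFC leaves the ligature alone)
```

The D and C forms only apply canonical mappings, so the ligature
survives NFC; the KD and KC forms also apply compatibility mappings.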

To test this module, it is better to use NormalizationTest-3.1.1.txt
(in NormalizationTest-3.1.1d1.zip, from
http://www.unicode.org/Public/BETA/Unicode3.1.1/).

NormalizationTest-3.1.0.txt seems to have a few bugs
(on nine lines, to be exact).
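
A check against the test file could look roughly like the sketch below.
It is not the test script from the distribution. It assumes the module
is loaded as Unicode::Normalize (the later release name), and that each
data line has five semicolon-separated fields c1..c5 of hex code
points, for which the test file's header states these invariants:
c2 == NFC(c1..c3), c4 == NFC(c4..c5), c3 == NFD(c1..c3),
c5 == NFD(c4..c5), c4 == NFKC(c1..c5), c5 == NFKD(c1..c5).
The names field_to_string, differs and check_line are hypothetical.

```perl
use strict;
use warnings;
use Unicode::Normalize;   # assumption: stands in for Text::Unicode::Normalize

# Turn one field such as "0044 0307" into a Perl string.
sub field_to_string {
    my ($field) = @_;
    return join '', map { chr(hex($_)) } split ' ', $field;
}

# Number of @inputs for which $form->(input) differs from $expected.
sub differs {
    my ($form, $expected, @inputs) = @_;
    return scalar grep { $form->($_) ne $expected } @inputs;
}

# Returns the names of the forms whose invariants fail on this line.
# Comment lines, blank lines and "@Part" headers yield an empty list.
sub check_line {
    my ($line) = @_;
    $line =~ s/\s*#.*//;                              # strip trailing comment
    return () if $line =~ /^\s*$/ or $line =~ /^@/;
    my ($c1, $c2, $c3, $c4, $c5) = map { field_to_string($_) } split /;/, $line;
    my @bad;
    push @bad, 'NFC'  if differs(\&NFC,  $c2, $c1, $c2, $c3)
                      || differs(\&NFC,  $c4, $c4, $c5);
    push @bad, 'NFD'  if differs(\&NFD,  $c3, $c1, $c2, $c3)
                      || differs(\&NFD,  $c5, $c4, $c5);
    push @bad, 'NFKC' if differs(\&NFKC, $c4, $c1, $c2, $c3, $c4, $c5);
    push @bad, 'NFKD' if differs(\&NFKD, $c5, $c1, $c2, $c3, $c4, $c5);
    return @bad;
}

# Usage: perl check.pl NormalizationTest-3.1.1.txt
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    while (my $line = <$fh>) {
        chomp $line;
        my @bad = check_line($line);
        print "line $.: failed @bad\n" if @bad;
    }
    close $fh;
}
```

For instance, a line in this format such as
"1E0A;1E0A;0044 0307;1E0A;0044 0307; # LATIN CAPITAL LETTER D WITH DOT ABOVE"
should pass all four checks. The sketch skips the additional rule that
characters not listed in Part 1 of the file are unchanged by every
form, and the results depend on the Unicode version of the data files
in use.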

This module requires the following files:
  unicode/CombiningClass.pl
  unicode/CompExcl.txt
  unicode/Decomposition.pl

as well as the module Lingua::KO::Hangul::Util (available via CPAN).

It also runs on Perl 5.6,
although the unicode/*.* files there are for Unicode 3.0.1.

Passing the Unicode 3.1 NormalizationTest, however, requires
the Unicode 3.1.0 versions of those files, which are in the
Perl 5.7.2 distribution.

Regards, SADAHIRO Tomoyuki
