perl-unicode

Re: Unicode Normalization Forms

2001-08-10 19:36:18
Hello, everyone.

Unicode::Normalize 0.03 has been uploaded.

NAME
Unicode::Normalize - normalized forms of Unicode text

SYNOPSIS

  use Unicode::Normalize;

  $string_NFD  = NFD($raw_string);  # Normalization Form D
  $string_NFC  = NFC($raw_string);  # Normalization Form C
  $string_NFKD = NFKD($raw_string); # Normalization Form KD
  $string_NFKC = NFKC($raw_string); # Normalization Form KC

   or

  use Unicode::Normalize 'normalize';

  $string_NFD  = normalize('D',  $raw_string); # Normalization Form D
  $string_NFC  = normalize('C',  $raw_string); # Normalization Form C
  $string_NFKD = normalize('KD', $raw_string); # Normalization Form KD
  $string_NFKC = normalize('KC', $raw_string); # Normalization Form KC
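
As a small illustration of how the D and C forms differ, here is a minimal sketch; the sample character (U+00E9) and the variable names are my own choice, not part of the module's documentation. It assumes NFD, NFC and normalize can be imported by name, as the SYNOPSIS above suggests.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD NFC normalize);

# Illustrative input: U+00E9 LATIN SMALL LETTER E WITH ACUTE (precomposed).
my $composed   = "\x{E9}";

# Form D splits it into "e" followed by U+0301 COMBINING ACUTE ACCENT.
my $decomposed = NFD($composed);

# Form C puts the pair back together into the single precomposed character.
my $recomposed = NFC($decomposed);

printf "NFD length: %d\n", length $decomposed;   # 2
printf "NFC length: %d\n", length $recomposed;   # 1

# The function-style interface should give the same result as the shortcut.
print "normalize('D', ...) agrees with NFD(...)\n"
    if normalize('D', $composed) eq $decomposed;
```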

For example, you can write

    use Encode;
    use Encode::Tcl;
    use Unicode::Normalize qw(normalize);

    print encode 'shiftjis',
          normalize 'KC',
          decode 'shiftjis', $_ while <$SJIS>;

and get normalized *shiftjis* text.

i.e. full-width digits and Latin letters, half-width kana, etc.
are converted to their ordinary compatibility equivalents
(ASCII digits and letters, full-width kana).
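
The effect can also be checked without any Shift_JIS data, working directly on Unicode strings. The following is only a sketch: the sample string and the helper `codepoints` are hypothetical names introduced for illustration, and it prints code points rather than the characters themselves to stay independent of the terminal encoding.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC NFKC);

# Hypothetical helper: render a string as a list of U+XXXX code points.
sub codepoints { join ' ', map { sprintf 'U+%04X', ord } split //, shift }

# Illustrative input: full-width "A" and "1", then half-width katakana KA
# followed by the half-width voiced sound mark.
my $raw = "\x{FF21}\x{FF11}\x{FF76}\x{FF9E}";

# Form C applies only canonical mappings, so the width variants survive.
my $nfc  = NFC($raw);

# Form KC also applies compatibility mappings: the result is ASCII "A1"
# and a single full-width katakana GA (U+30AC).
my $nfkc = NFKC($raw);

print "raw : ", codepoints($raw),  "\n";
print "NFC : ", codepoints($nfc),  "\n";
print "NFKC: ", codepoints($nfkc), "\n";
```

Note that the half-width KA and its voiced sound mark are not just widened but also composed into one character, since Form KC ends with a canonical composition step.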

Regards, SADAHIRO Tomoyuki

