perl-unicode

Re: Unicode Normalization Forms

2001-08-09 07:59:28

 use Text::Unicode::Normalize;

 $stringNFD  = NFD($string);  # Normalization Form D
 $stringNFC  = NFC($string);  # Normalization Form C
 $stringNFKD = NFKD($string); # Normalization Form KD
 $stringNFKC = NFKC($string); # Normalization Form KC

a single normalize function with a form parameter instead of
these 4 functions, e.g.

  my $normalized = normalize( form => 'C' );

How about the following interface?

| $normalized_string = normalize($raw_string)
|
| You can use this function only if the normalization form
| you require is specified in the C<use> statement: 
|
|   use Text::Unicode::Normalize 'C'; # Normalization Form C
|
| or for clarity, say:
|
|   use Text::Unicode::Normalize form => 'C'; # Normalization Form C
|
| and you can use C<normalize> as an alias for C<NFC> or another:
|
|   $normalized_string  = normalize($raw_string);
|
| As the form name, one of the following names will be accepted.
|
|   'C'  or 'NFC'  for Normalization Form C
|   'D'  or 'NFD'  for Normalization Form D
|   'KC' or 'NFKC' for Normalization Form KC
|   'KD' or 'NFKD' for Normalization Form KD
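
A rough sketch of how such an import-time form selection might be
implemented follows. The package name My::Normalize is hypothetical,
and using Unicode::Normalize (bundled with Perl since 5.8) as the
backend is an assumption for illustration only, not the actual code
of the proposed module.

```perl
use strict;
use warnings;

package My::Normalize;    # hypothetical name, for illustration only
use Unicode::Normalize ();    # assumed backend providing NFC/NFD/NFKC/NFKD

sub import {
    my ($class, @args) = @_;
    return unless @args;    # plain "use": install nothing

    # Accept both  use My::Normalize 'C'  and  use My::Normalize form => 'C'
    my $name = (@args == 2 && $args[0] eq 'form') ? $args[1] : $args[0];

    # Treat 'NFC' and 'C' (etc.) as the same form name.
    (my $key = uc $name) =~ s/^NF//;

    my %form = (
        C  => \&Unicode::Normalize::NFC,
        D  => \&Unicode::Normalize::NFD,
        KC => \&Unicode::Normalize::NFKC,
        KD => \&Unicode::Normalize::NFKD,
    );
    my $code = $form{$key}
        or die "Unknown normalization form '$name'\n";

    # Install normalize() into the calling package as an alias.
    my $caller = caller;
    no strict 'refs';
    *{"${caller}::normalize"} = $code;
}

package main;

# Stands in for:  use My::Normalize form => 'C';
BEGIN { My::Normalize->import(form => 'C') }

my $raw_string        = "e\x{301}";    # "e" + COMBINING ACUTE ACCENT
my $normalized_string = normalize($raw_string);
print length($normalized_string), "\n";    # prints 1 (U+00E9 after NFC)
```

Because the alias is installed at compile time, normalize() behaves
exactly like the selected NF* function in the importing package.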

It should accept the short forms (C, D, KC, KD) as aliases
for the NF* names. There is currently no Text::Unicode tree on
CPAN, but there is a Unicode:: tree, and the module would fit
quite well there. The normalize function improves readability
and looks nicer.

Can we expect Perl to normalize everything it outputs by
itself, or will we have to use this module?

Regards, SADAHIRO Tomoyuki
