Re: Unicode Normalization Forms


On Thu, 09 Aug 2001 22:30:16 +0200
Bjoern Hoehrmann <derhoermi(_at_)gmx(_dot_)net> wrote:

* SADAHIRO Tomoyuki wrote:

How about the following interface?

| $normalized_string = normalize($raw_string)
|
| You can use this function only if the normalization form
| you require is specified in the C<use> statement: 
|
|   use Text::Unicode::Normalize 'C'; # Normalization Form C


Also fine, but this would lack the ability to switch from one form to
another. The normalize() function could have a second parameter that
takes a form name.


To implement C<normalize> with two parameters,
we need to decide how to report the error
when an invalid form name is passed in.

1) croak
2) carp and return false
3) only return false
4) use default (what is default?)
5) another... 

I think it should croak, like "Invalid type in pack".

If C<normalize> takes two parameters,
it may be better for the 1st to be a form name
and the 2nd to be the string to be normalized.

cf. printf FORMAT, LIST
    pack TEMPLATE,LIST
    split REGEX, STRING
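
A minimal sketch of option 1 combined with the form-first argument
order might look like the following. It is only an illustration: the
form names other than 'C' (D, KC, KD), the wording of the error
message, and the routine table are assumptions, and the table entries
are placeholders that return the string unchanged instead of
performing real normalization.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Placeholder table (assumption): each entry stands in for a real
# normalization routine and simply returns its argument unchanged.
my %ROUTINE_FOR = (
    C  => sub { $_[0] },
    D  => sub { $_[0] },
    KC => sub { $_[0] },
    KD => sub { $_[0] },
);

# normalize FORM, STRING
# The form name comes first, as with printf FORMAT, LIST.
sub normalize {
    my ($form, $string) = @_;

    # Option 1: croak on an invalid form name, so the error is
    # reported from the caller's point of view.
    croak "Invalid form name: " . (defined $form ? $form : 'undef')
        unless defined $form && exists $ROUTINE_FOR{$form};

    return $ROUTINE_FOR{$form}->($string);
}

# Caller side: a croak is an ordinary die, so eval can trap it.
my $result = eval { normalize('X', "caf\x{e9}") };
print "caught: $@" unless defined $result;
```

With croak, a caller who passes a bad form name by mistake gets an
immediate failure, while a caller who wants to recover can still wrap
the call in eval, which options 2 and 3 would not improve on.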

What's wrong with Unicode::Normalize?


As yet I do not know
whether the Unicode:: category is available at present,
nor what other name would be appropriate.  :-(

Regards, SADAHIRO Tomoyuki
