perl-unicode

Re: [PATCH @11446] UnicodeCD::charinfo

2001-07-25 08:57:41
On Mon, 23 Jul 2001 13:43:30 -0500
Jarkko Hietaniemi <jhi(_at_)iki(_dot_)fi> wrote:
 
Darn.  Got me there, I am the one always warning people about the fact
that Unicode is not 16 bit anymore :-)

I think we should solve this somehow differently; I don't
want to introduce a new huge-ish file (that is just a differently sorted
version of an existing file) just to do the binary search.

I think the searching method doesn't matter :-)
so long as it is appropriate and can also handle
CJK Unified Ideographs and Hangul syllables.
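
To illustrate what "able to handle" the big blocks could mean, here is a
minimal sketch of a binary search over ranges rather than over single code
points. The table, the labels, and the name find_range are all hypothetical
and hand-written for this example; this is not the format of any existing
file under lib/unicode, and not necessarily how charinfo should do it.

```perl
use strict;
use warnings;

# Hypothetical hand-written table, sorted by first code point.
# Each entry is [first, last, label]; a single character has first == last.
my @ranges = (
    [ 0x0041,  0x0041,  'LATIN CAPITAL LETTER A' ],
    [ 0x3400,  0x4DB5,  'CJK Ideograph Extension A' ],
    [ 0x4E00,  0x9FA5,  'CJK Ideograph' ],
    [ 0xAC00,  0xD7A3,  'Hangul Syllable' ],
    [ 0x20000, 0x2A6D6, 'CJK Ideograph Extension B' ],
);

# Binary search over the ranges.
# Returns the label, or undef if no range covers $cp.
sub find_range {
    my ($cp) = @_;
    my ($lo, $hi) = (0, $#ranges);
    while ($lo <= $hi) {
        my $mid = int(($lo + $hi) / 2);
        my ($first, $last, $label) = @{ $ranges[$mid] };
        if    ($cp < $first) { $hi = $mid - 1 }
        elsif ($cp > $last)  { $lo = $mid + 1 }
        else                 { return $label }
    }
    return;
}

for my $cp (0x4E00, 0xD7A3, 0x20000, 0x0590) {
    my $label = find_range($cp);
    printf "U+%04X => %s\n", $cp, defined $label ? $label : '(not in table)';
}
```

Because each block is one entry, code points beyond 16 bits cost nothing
extra, and no differently sorted copy of the data is needed as long as the
entries are kept in code point order.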

BTW, Hangul syllables must be decomposed canonically, mustn't they?

cf. DerivedDecompositionType-3.1.0.txt in Unicode 3.1

  30FE        ; canonical # Lm       KATAKANA VOICED ITERATION MARK
  AC00..D7A3  ; canonical # Lo [11172] HANGUL SYLLABLE GA
                                     ..HANGUL SYLLABLE HIH
  F900..FA0D  ; canonical # Lo [270] CJK COMPATIBILITY IDEOGRAPH-F900
                                   ..CJK COMPATIBILITY IDEOGRAPH-FA0D

but they are not included in lib/unicode/IsDecoCanon.pl.
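
For reference, the Hangul decomposition needs no table at all, since the
Unicode Standard defines it arithmetically. The following is only a sketch
of that arithmetic; the function name decompose_hangul is made up for this
example, and it says nothing about how IsDecoCanon.pl is generated.

```perl
use strict;
use warnings;

# Constants from the Unicode Standard's Hangul syllable decomposition.
my $SBase  = 0xAC00;   # first precomposed syllable
my $LBase  = 0x1100;   # first leading consonant (choseong)
my $VBase  = 0x1161;   # first vowel (jungseong)
my $TBase  = 0x11A7;   # one before the first trailing consonant (jongseong)
my $LCount = 19;
my $VCount = 21;
my $TCount = 28;
my $NCount = $VCount * $TCount;   # 588
my $SCount = $LCount * $NCount;   # 11172, matching the [11172] above

# Returns the full canonical decomposition of a Hangul syllable as a list
# of jamo code points, or the code point itself if it is not a syllable.
sub decompose_hangul {
    my ($cp) = @_;
    my $s_index = $cp - $SBase;
    return ($cp) if $s_index < 0 || $s_index >= $SCount;
    my $l = $LBase + int($s_index / $NCount);
    my $v = $VBase + int(($s_index % $NCount) / $TCount);
    my $t = $TBase + $s_index % $TCount;
    return $t == $TBase ? ($l, $v) : ($l, $v, $t);
}

for my $cp (0xAC00, 0xD7A3) {
    my @jamo = map { sprintf 'U+%04X', $_ } decompose_hangul($cp);
    printf "U+%04X => %s\n", $cp, join ' ', @jamo;
}
# prints:
# U+AC00 => U+1100 U+1161
# U+D7A3 => U+1112 U+1175 U+11C2
```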

And why does lib/unicode/IsCn.pl contain no characters?
(see DerivedGeneralCategory-3.1.0.txt)

For example, should the test be changed like this?
# 0x0590 is in the Hebrew block but unused.
-ok($charinfo->{category},      undef);
+ok($charinfo->{category},      'Cn');
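
As a rough cross-check, independent of charinfo, one can ask the regex
engine whether a code point is in Cn. This is only a sketch: the helper
name is_unassigned is made up here, and the answer depends on the Unicode
tables shipped with the perl that runs it.

```perl
use strict;
use warnings;

# Returns 1 if this perl's Unicode tables put $cp in general category Cn
# (unassigned), 0 otherwise.
sub is_unassigned {
    my ($cp) = @_;
    return chr($cp) =~ /\p{Cn}/ ? 1 : 0;
}

# U+0590 is the unused Hebrew code point from the test above;
# U+05D0 is HEBREW LETTER ALEF, an assigned letter.
printf "U+0590 unassigned? %d\n", is_unassigned(0x0590);
printf "U+05D0 unassigned? %d\n", is_unassigned(0x05D0);
```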

regards,
SADAHIRO Tomoyuki
E-mail: bqw10602(_at_)nifty(_dot_)com
