perl-unicode

Re: Unicode::Collate, useful but useless

2007-04-15 02:51:09
On 12 Apr 2007 15:36:31 -0000, Rafael Garcia-Suarez wrote

Éric Cholet wrote in perl.unicode :

Okay, I know, it wants a Unicode Collation Element Table, it's well  
documented in the pod where to get such a table.
But:
- it wants this file to be in $foo/Unicode/Collate/allkeys.txt where  
$foo is in @INC, which is very inconvenient, and even impossible to  
achieve in some hosted environments.
- It seems the default table (DUCET) isn't bundled with Perl because
of its size (1.2 MB). Maybe "1.2 MB is too big" could be reconsidered
given today's environment. At any rate, does it really make sense that
Unicode::Collate is in core but unusable as is?

Well duh :
    $ du -hs p4blead
    73M     p4blead
So yes. If Sadahiro agrees, I'll add it.

I absolutely agree that it should be added to perl-current,
since a DUCET is already included in the CPAN-ized version.
The documentation of U::C should make sense whether or not
the DUCET is installed with the perl core.

While the latest DUCET(*) is for Unicode 5.0.0,
  (*)http://www.unicode.org/Public/UCA/latest/allkeys.txt
the latest U::C on CPAN has the DUCET for Unicode 4.1.0.

However, I won't upload U::C with the DUCET for Unicode 5.0.0
until perl 5.8.9, with Unicode Character Database 5.0.0,
has been released. The reasons are:

- Thanks to the stability of Unicode normalization,
  perl 5.8.9/5.9.5 with U.C.D. 5.0.0 and the DUCET for 4.1.0
  can perform the collation for 4.1.0 conformantly.
- As the DUCET doesn't keep backward compatibility, the latest
  maint (5.8.8) with U.C.D. 4.1.0 and the DUCET for 5.0.0 is
  conformant neither with 4.1.0 nor with 5.0.0.

Hence my intention is to upload a newer U::C with the DUCET
for 5.0.0 to CPAN after the release of perl 5.8.9 with
U.C.D. 5.0.0 as the newer maint perl.

Technically speaking, perl-current (and perl-5.8.x) can ship
either the DUCET for 4.1.0 or the one for 5.0.0.
If perl 5.8.9/5.9.5 ships the DUCET for 4.1.0, it can do the
collation for 4.1.0, but not for 5.0.0. If perl 5.8.9/5.9.5
ships the DUCET for 5.0.0, it can do the collation for 5.0.0,
taking advantage of its U.C.D. 5.0.0.
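
The combinations discussed in this message can be summed up in a small
sketch. It is only an illustration, not part of U::C: the function names
are hypothetical, and the rule "the U.C.D. must be at least as new as the
DUCET" is an assumed generalization of the three cases stated above.

```python
# Sketch of the U.C.D./DUCET compatibility cases from this message.
# The names and the ">=" rule are illustrative assumptions, not U::C's API.

def parse_version(text):
    """Turn '5.0.0' into (5, 0, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def conformant_collation(ucd_version, ducet_version):
    """Return the Unicode version whose collation can be done conformantly,
    or None if the combination conforms to neither version.

    Assumption: a DUCET is usable only when perl's Unicode Character
    Database is at least as new as that DUCET.
    """
    if parse_version(ucd_version) >= parse_version(ducet_version):
        return ducet_version
    return None


if __name__ == "__main__":
    # (U.C.D. version, DUCET version) pairs mentioned in the message.
    cases = [("5.0.0", "4.1.0"), ("4.1.0", "5.0.0"), ("5.0.0", "5.0.0")]
    for ucd, ducet in cases:
        result = conformant_collation(ucd, ducet)
        print(f"U.C.D. {ucd} + DUCET {ducet} -> {result}")
```

Under that assumption, only the pairing of U.C.D. 4.1.0 with the DUCET
for 5.0.0 yields no conformant collation, which matches the reason given
for delaying the upload.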

Regards,
SADAHIRO Tomoyuki

