perl-unicode

Re: how to sort by stroke (not radical/stroke)

2003-05-13 12:30:06
On Tuesday, May 13, 2003, at 11:48  PM, John Jenkins wrote:
Stroke order, then, is something
different. Seems like we would need order entries in the config data
for every character, which would be totally unmanageable.

I didn't have any luck searching the Unicode web site for information
about sorting by stroke.


There is a kTotalStrokes field in Unihan.txt, although it doesn't cover every character in Unihan. This would definitely be a good place to start.

If you are using Perl 5.6.0 or higher (5.8.0 recommended), you can use the Unicode::Unihan module, available via CPAN. Let me show you a small example.

#!/usr/local/bin/perl
use strict;
use Unicode::Unihan;
my $uh      = Unicode::Unihan->new;
my $str     = "\x{5c0f}\x{98fc}\x{5f3e}"; # my name in Kanji
my @chars   = map {chr($_)} unpack("U*" => $str);
my @strokes = $uh->TotalStrokes($str);
my %c2s;      @c2s{@chars} = @strokes;
binmode STDOUT => ':utf8';
for my $char (sort {$c2s{$a} <=> $c2s{$b} || $a cmp $b} @chars){
    print "$char => $c2s{$char}\n";
}
__END__

And here is what it prints.

小 => 3
弾 => 12
飼 => 14

I am not sure if Unicode::Unihan is robust enough for practical use, but IMHO it is a handy place to start.
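
Since kTotalStrokes doesn't cover every character, the stroke count can come back undefined, and a plain <=> on it would warn and sort unpredictably. Here is a minimal sketch of one way to cope, assuming you want such characters to sort last. The sort_by_strokes helper is hypothetical (it is not part of Unicode::Unihan), and the stroke table is hard-coded from the output above so the snippet runs without the module; \x{3042} stands in for a character with no count.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of Unicode::Unihan: sort characters by
# stroke count, putting characters with no known count at the end.
sub sort_by_strokes {
    my ($c2s, @chars) = @_;
    return sort {
        my ($sa, $sb) = ($c2s->{$a}, $c2s->{$b});
           ((defined($sb) ? 1 : 0) <=> (defined($sa) ? 1 : 0)) # known first
        || (defined($sa) && defined($sb) ? $sa <=> $sb : 0)    # fewer strokes first
        || ($a cmp $b)                                         # tie-break by code point
    } @chars;
}

# Stroke counts hard-coded for illustration (taken from the output above).
my %c2s   = ("\x{5c0f}" => 3, "\x{5f3e}" => 12, "\x{98fc}" => 14);
# \x{3042} has no entry in the table, standing in for an uncovered character.
my @chars = ("\x{98fc}", "\x{3042}", "\x{5c0f}", "\x{5f3e}");

binmode STDOUT => ':utf8';
for my $char (sort_by_strokes(\%c2s, @chars)) {
    my $n = defined $c2s{$char} ? $c2s{$char} : 'unknown';
    print "$char => $n\n";
}
```

In the real script you would pass a reference to the %c2s built from $uh->TotalStrokes instead of the hard-coded table. Whether unknown characters should go last, first, or be dropped is a policy choice; this sketch only shows the "last" variant.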

Dan the Perl5 Porter

  • Re: how to sort by stroke (not radical/stroke), Dan Kogai <=