On Friday, Oct 25, 2002, at 14:10 Asia/Tokyo, Philip Newton wrote:
Well, partially because there's no "good" names for many of the
characters. What do you call "生"? "CJK UNIFIED IDEOGRAPH-751F"? (That's
the current Unicode "name", but it's not particularly useful.) "CJK
shou"? "CJK sei"? "CJK sheng1"? "CJK saeng"? "CJK ikiru"? ikasu, ikeru,
umareru, umu, ou, haeru, hayasu, ki, nama, naru, nasu, musu, .... which
one do you pick?
If we are stuck with the de jure, ex officio names from the Unicode
Consortium, we are out of luck. But this is Perl: if there is more than
one way to do it, why not more than one way to name it? I am toying with
the idea of a charnames extension that goes like
use charnames ":ja"; # Japanese
print "\N{sei-ikiru}";
#
use charnames ":ko";
print "\N{saeng}";
#
use charnames ":zh";
print "\N{sheng1}";
Since the pragma approach is rather inflexible, I would prefer an OO
approach, like
use Char::Name;
my $char = Char::Name->new;
print $char->jp("sei-ikiru");
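To make the OO idea a bit more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: Char::Name is not an existing module, the jp/ko/zh methods and the _lookup helper are hypothetical, and the table covers only U+751F by hand. A real module would presumably pull its readings from the Unihan data instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sketch of the Char::Name idea; not an existing CPAN module.
package Char::Name;

# Hand-written table for U+751F only (assumption). A real module would
# need readings for every character, e.g. derived from the Unihan DB.
my %TABLE = (
    jp => { 'sei-ikiru' => 0x751F },
    ko => { 'saeng'     => 0x751F },
    zh => { 'sheng1'    => 0x751F },
);

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

# Shared lookup: returns the character, or undef if the name is unknown.
sub _lookup {
    my ($self, $lang, $name) = @_;
    my $code = $TABLE{$lang}{$name};
    return defined $code ? chr($code) : undef;
}

# One method per language, each with its own namespace of names.
sub jp { my ($self, $name) = @_; return $self->_lookup(jp => $name) }
sub ko { my ($self, $name) = @_; return $self->_lookup(ko => $name) }
sub zh { my ($self, $name) = @_; return $self->_lookup(zh => $name) }

package main;

binmode STDOUT, ':encoding(UTF-8)';
my $char = Char::Name->new;
print $char->jp('sei-ikiru'), "\n";
```

Keeping one method per language means the same spelling can map to different characters in different languages without clashing, which a single flat namespace of names could not offer.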
I know Japanese is the biggest nightmare for naming characters because in
Japanese we give too many "names" to each character; it's really hard
to disambiguate them....
I may come up with something as I look through the Unihan DB, now accessible
via CPAN (Unicode::Unihan)....
Cheers,
Philip Newton (不衣律不入豚)
\x{5c0f}\x{98fc} \x{5f3e}