perl-unicode

Re: Unicode. Perl does the right thing?

2002-10-24 23:30:04
On Friday, Oct 25, 2002, at 14:10 Asia/Tokyo, Philip Newton wrote:
Well, partially because there's no "good" names for many of the
characters. What do you call "生"? "CJK UNIFIED IDEOGRAPH-751F"? (That's
the current Unicode "name", but it's not particularly useful.) "CJK
shou"? "CJK sei"? "CJK sheng1"? "CJK saeng"? "CJK ikiru"? ikasu, ikeru,
umareru, umu, ou, haeru, hayasu, ki, nama, naru, nasu, musu, .... which
one do you pick?

If we are stuck with the de jure, ex officio names from the Unicode Consortium, we are out of luck. But this is Perl: if there is more than one way to do it, why not more than one way to name it? I am wondering about a charnames extension that goes like

use charnames ":ja"; # Japanese
print "\N{sei-ikiru}";
#
use charnames ":ko";
print "\N{saeng}";
#
use charnames ":zh";
print "\N{sheng1}";
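
For comparison, something close to this can be approximated with the custom alias feature of the charnames pragma in later Perl releases. The following is only a minimal sketch under stated assumptions: the alias names "sei-ikiru", "saeng" and "sheng1" are made up for illustration (neither Unicode nor charnames defines them), numeric alias values are assumed to be available (Perl 5.14 or later), and there is no per-language ":ja" / ":ko" / ":zh" import tag.

```perl
use strict;
use warnings;

# Hypothetical aliases for U+751F. The names are illustrative only;
# they are not defined by Unicode or shipped with charnames.
# Numeric code point values for aliases assume Perl 5.14 or later.
use charnames ":full", ":alias" => {
    "sei-ikiru" => 0x751F,    # Japanese-style name
    "saeng"     => 0x751F,    # Korean-style name
    "sheng1"    => 0x751F,    # Mandarin-style name
};

# All three aliases are resolved at compile time to the same character.
my @chars = ("\N{sei-ikiru}", "\N{saeng}", "\N{sheng1}");

# Print the code point of each character in U+XXXX form.
printf "U+%04X\n", ord $_ for @chars;
```

Since every alias points at the same code point, each printed line should read U+751F. The aliases are lexically scoped, like the pragma itself, so different files or blocks could carry different name tables.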

Since the pragma approach is rather inflexible, I would prefer an OO approach, like

use Char::Name;

my $char = Char::Name->new;

print $char->jp("sei-ikiru");
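
One way such an interface might be shaped is sketched below. Everything here is hypothetical: Char::Name is not an existing module, the method names jp, ko and zh simply follow the snippets in this message, and the three-entry table is hand-written for illustration where a real implementation would presumably be generated from a data source such as the Unihan database.

```perl
use strict;
use warnings;

package Char::Name;    # hypothetical class, named after the proposal

# Tiny hand-written table for illustration only.
my %TABLE = (
    jp => { "sei-ikiru" => 0x751F },
    ko => { "saeng"     => 0x751F },
    zh => { "sheng1"    => 0x751F },
);

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

# Look up a name in one language table.
# Returns the character, or undef if the name is unknown.
sub lookup {
    my ($self, $lang, $name) = @_;
    my $code = $TABLE{$lang}{$name};
    return defined $code ? chr $code : undef;
}

# Per-language convenience methods.
sub jp { my ($self, $name) = @_; return $self->lookup(jp => $name) }
sub ko { my ($self, $name) = @_; return $self->lookup(ko => $name) }
sub zh { my ($self, $name) = @_; return $self->lookup(zh => $name) }

package main;

my $char = Char::Name->new;

# Print the code point of the character found under the Japanese name.
printf "U+%04X\n", ord $char->jp("sei-ikiru");
```

Returning undef for an unknown name leaves it to the caller to decide how to handle ambiguous or missing readings, which is where most of the real difficulty lies.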

I know Japanese is the biggest nightmare for naming characters, because in Japanese we give too many "names" to each character; it's really hard to disambiguate these....

I may come up with something as I look through the Unihan DB, now accessible via CPAN (Unicode::Unihan)....

Cheers,
Philip Newton (不衣律不入豚)

\x{5c0f}\x{98fc} \x{5f3e}
