Re: Inverse of /\p{script}/

On Thursday, Aug 28, 2003, at 23:16 Asia/Tokyo, nick(_at_)ing-simmons(_dot_)net wrote:

Does the existing perl5.8.* Unicode support have a way to efficiently
determine which script(s) or block (in the Unicode sense) a code point
belongs to?

In Unicode-aware Tk I am still doing battle with the mechanism to select
an X11 font to display a particular code point (for now glossing over
glyph vs. character issues).
The present code is still rather dumb.


That's what Encode::InCharset is for.  Available via CPAN.

http://search.cpan.org/author/DANKOGAI/Encode-InCharset-0.03/

It seems to make sense to have a hash which maps script names to
probable (font) encodings

 (Hiragana | Katakana | Han) => 'jisx0208.1990-0'


The module makes it \p{InJIS0208} ...

 (Greek)                     => 'iso8859-7',


And \p{InISO_8859_7}, respectively.
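
For readers unfamiliar with how an "In..." property can be supplied from Perl code, here is a minimal sketch of a user-defined character property as described in perlunicode. The property name InKana and its ranges are purely illustrative; this is not necessarily how Encode::InCharset builds \p{InJIS0208}.

```perl
use strict;
use warnings;

# Hypothetical user-defined property: a sub whose name starts with
# "In" or "Is" and returns lines of "start<TAB>end" in hexadecimal.
sub InKana {
    return join '',
        "3040\t309F\n",    # Hiragana block
        "30A0\t30FF\n";    # Katakana block
}

# The property can then be used like any built-in one.
for my $char ("\x{3042}", "\x{30A2}", "a") {
    printf "U+%04X %s\n", ord $char,
        $char =~ /\p{InKana}/ ? "matches" : "does not match";
}
```

Matching against such a property answers "is this character covered by that set?", which is the question a font selector asks once per candidate encoding.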

So given a (1 character) string, how do I get the Unicode script/block it is in?

One caveat, however. It is slightly out of sync with the latest Encode. You should stay away from vendor encodings, which were thoroughly revised between Encode 1.75 and 1.98 (FYI, Encode::InCharset is still based upon 1.75).
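
As a rough illustration of the script-to-encoding hash proposed above, the following sketch uses charscript() and charblock() from the core Unicode::UCD module to go from a character to its script and block. The hash %font_encoding_for and the helper classify_char are hypothetical names invented for this example, and the mapping contains only the entries mentioned in the thread.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charscript charblock);

# Hypothetical table: Unicode script name => probable X11 font encoding.
my %font_encoding_for = (
    Hiragana => 'jisx0208.1990-0',
    Katakana => 'jisx0208.1990-0',
    Han      => 'jisx0208.1990-0',
    Greek    => 'iso8859-7',
);

# Return (script, block, font encoding) for the first character of a
# string; the font encoding is undef when the script is not in the table.
sub classify_char {
    my ($char) = @_;
    my $code   = ord $char;
    my $script = charscript($code);
    my $block  = charblock($code);
    my $font   = defined $script ? $font_encoding_for{$script} : undef;
    return ($script, $block, $font);
}

for my $char ("\x{3042}", "\x{4E00}", "\x{03B1}") {
    my ($script, $block, $font) = classify_char($char);
    printf "U+%04X script=%s block=%s font=%s\n",
        ord $char,
        defined $script ? $script : '(none)',
        defined $block  ? $block  : '(none)',
        defined $font   ? $font   : '(none)';
}
```

Whether this is efficient enough for per-character font selection is an open question; caching the result per code point in a hash would be one way to avoid repeated lookups.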


Dan the Encode Maintainer
