perl-unicode

Re: iso-2022-jp, adding encodings..

2001-06-15 06:48:47
On Thu, 14 Jun 2001, Edward Peschko wrote:


Ok, I'm a bit confused..

How exactly do you add a new charset map to Unicode::Map? Where do you get
the encodings from? Where are they defined?

I saw your reference to ftp://ftp.unicode.org/MAPPINGS, but that just points
to a file, not a directory of mapping sets.

All I'm trying to do is convert from UTF-8 to iso-2022-jp (the form of shift
jis that is used in email...). Any help on how to do this would be greatly
appreciated...

Install 'Unicode::MapUTF8' - it probably does what you want:

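use Unicode::MapUTF8 qw(from_utf8);
# iso-2022-jp is the 7-bit JIS encoding, distinct from Shift_JIS
my $jis_string = from_utf8({ -string => $utf8_string,
                             -charset => 'iso-2022-jp' });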

Alternatively, install the 'Jcode' module (Unicode::MapUTF8 forms a
'wrapper' around that and other Unicode modules to provide a single
consistent interface for _all_ Unicode charset convertors).
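
For comparison, here is a minimal sketch of the same conversion using the
core Encode module. This rests on assumptions: Encode is not one of the
modules discussed in this thread, it needs a perl that bundles it (5.8 or
later), and the helper name utf8_to_jis is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: takes UTF-8 encoded octets, returns iso-2022-jp octets.
sub utf8_to_jis {
    my ($utf8_bytes) = @_;
    # Decode the UTF-8 octets into a Perl character string first...
    my $chars = decode('UTF-8', $utf8_bytes);
    # ...then encode those characters as 7-bit iso-2022-jp.
    return encode('iso-2022-jp', $chars);
}

# The word "Nihongo" (Japanese) as raw UTF-8 octets.
my $utf8_string = "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E";
my $jis_string  = utf8_to_jis($utf8_string);

# Print the result with ESC made visible, since the raw bytes
# contain escape sequences a terminal might interpret.
(my $visible = $jis_string) =~ s/\e/<ESC>/g;
print "$visible\n";
```

The result should contain only 7-bit bytes, opening with the escape sequence
that switches into the Japanese character set and closing with the one that
switches back to ASCII, which is why this encoding is the usual choice for
mail.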

(ps - the charsets that I'm talking about can be found at:

http://java.sun.com/j2se/1.3/docs/guide/intl/encoding.doc.html

It would be really, really cool if perl had the same charset codes, or at
least an alias to them. That way, one wouldn't have to go through this 'is the
charset there' junk. Unfortunately there seem to be 10 aliases for charsets all
over the place.)

Yah. That problem is being addressed in the I18N::Charset module. I
intend eventually to make Unicode::MapUTF8 aware of that so it can exploit
the known alias information.
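
To make the alias problem concrete, here is a minimal sketch of the kind of
lookup an alias table allows. It is not the I18N::Charset API: the %ALIAS
table and the canonical_charset helper are hypothetical, and the handful of
aliases listed (Java-style names plus a couple of other common spellings) is
only a sample.

```perl
use strict;
use warnings;

# Hypothetical alias table: lowercased alias => preferred MIME name.
my %ALIAS = (
    'iso2022jp'   => 'iso-2022-jp',   # Java-style name
    'csiso2022jp' => 'iso-2022-jp',   # IANA alias
    'sjis'        => 'shift_jis',     # Java-style name
    'x-sjis'      => 'shift_jis',
    'euc_jp'      => 'euc-jp',        # Java-style name
    'utf8'        => 'utf-8',         # Java-style name
);

# Names that are already in preferred form map to themselves.
my %PREFERRED = map { $_ => 1 } values %ALIAS;

# Return the preferred name for a charset, or undef if it is unknown.
sub canonical_charset {
    my ($name) = @_;
    my $key = lc $name;               # charset names are case-insensitive
    return $key if $PREFERRED{$key};
    return $ALIAS{$key};
}

for my $name (qw(ISO2022JP SJIS iso-2022-jp klingon)) {
    my $canon = canonical_charset($name);
    printf "%-12s => %s\n", $name, defined $canon ? $canon : '(unknown)';
}
```

With a table like this, the caller normalizes the name once and then asks the
converter for the preferred name only, instead of probing each spelling in
turn.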
 
-- 
Benjamin Franz

"Premature optimization is the root of all evil in programming."
                                         ---C.A.R. Hoare

