perl-unicode

Re: converting SJIS to UTF-8

2001-01-04 09:42:21
On Wed, 3 Jan 2001, Chris Mealy wrote:



Based on the perl unicode faq, I expected this to work:

sub shift_jis_to_utf8 {
    my ($shift_jis) = @_;

    my $encoding = 'Shift-JIS';
    my $Map = new Unicode::Map({ ID => $encoding });
    my $map_out = $Map->to_unicode($shift_jis);
    my $us = Unicode::String->new($map_out);
    my $us_utf8 = $us->utf8;

    return $us_utf8;
}

but it doesn't.  What's the right way to convert SJIS to UTF-8 with perl?

Go up an abstraction layer. The modules 'Unicode::Map', 'Unicode::Map8'
and 'Jcode' each cover a different part of the charset conversion problem;
none of them covers all of it. For Japanese encodings such as SJIS, 'Jcode'
is the relevant low-level module, not 'Unicode::Map'. 'Unicode::MapUTF8'
hides these disparate implementations from the programmer behind a single,
unified API layer:

use Unicode::MapUTF8 qw(to_utf8);

my $utf8_string = to_utf8({ -string  => $sjis_string,
                            -charset => 'sjis',
                          });
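
For anyone who wants to try this end to end, here is a minimal sketch of a
complete script. The helper name sjis_to_utf8 and the sample bytes are my own
illustration, not part of the module. It assumes Unicode::MapUTF8 and Jcode
are both installed, and it uses the module's utf8_supported_charset and
from_utf8 functions to check for SJIS support and to do a round trip:

```perl
use strict;
use warnings;
use Unicode::MapUTF8 qw(to_utf8 from_utf8 utf8_supported_charset);

# Hypothetical helper (name chosen for illustration only):
# convert a string of Shift-JIS bytes to UTF-8.
sub sjis_to_utf8 {
    my ($sjis_bytes) = @_;

    # Fail early if this installation cannot handle Shift-JIS
    # (Unicode::MapUTF8 relies on Jcode for the Japanese encodings).
    die "charset 'sjis' is not supported here\n"
        unless utf8_supported_charset('sjis');

    return to_utf8({ -string => $sjis_bytes, -charset => 'sjis' });
}

# Sample input (an assumption for the demo): two kanji as Shift-JIS bytes.
my $sjis = "\x93\xFA\x96\x7B";
my $utf8 = sjis_to_utf8($sjis);

# Convert back to Shift-JIS to check that the round trip is lossless.
my $back = from_utf8({ -string => $utf8, -charset => 'sjis' });

print "utf8 bytes (hex): ", unpack('H*', $utf8), "\n";
print "round trip ", ($back eq $sjis ? "ok" : "FAILED"), "\n";
```

Calling utf8_supported_charset first turns a missing Jcode installation into
a clear error message instead of a failure somewhere inside the conversion.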

-- 
Benjamin Franz

... with proper design, the features come cheaply. This 
approach is arduous, but continues to succeed.

                                     ---Dennis Ritchie
