perl-unicode

[Encode] use encoding q(euc-jp) [was: re: Charset-0.01 released]

2002-03-30 00:42:35
On Saturday, March 30, 2002, at 12:30 , Autrijus Tang wrote:
On Sat, Mar 30, 2002 at 10:49:38AM +0900, Tatsuhiko Miyagawa wrote:
I first thought so but the temptation to
   perl -MCharset=your-encoding -e ....
was too insatiable.
In this situation, pragma (I mean charset.pm) would be more
reasonable. Just a thought.

And then you'll have to disambiguate between that and encoding.pm...
Why aren't we extending encoding.pm instead?

I have to confess I overlooked encoding.pm, but hey, look at this! Perl seems to have gotten there ahead of us!

perldoc5.7.3 -m encoding
sub import {
    my ($class, $name) = @_;
    $name = $ENV{PERL_ENCODING} if @_ < 2;
    $name = "latin1" unless defined $name;
    my $enc = find_encoding($name);
    unless (defined $enc) {
        require Carp;
        Carp::croak "Unknown encoding '$name'";
    }
    ${^ENCODING} = $enc;
}

Is this yet another piece of wizardry of yours, NI-S? (But the indentation says otherwise; EBCDIC croaking device detected. Yours, jhi)? All it does is assign an encoding object to ${^ENCODING}! This is not documented in perlvar. If it can handle an Encode object, there is no reason it cannot handle others.
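A minimal sketch of the lookup step that import performs, assuming only the find_encoding function and the name/encode/decode methods documented by Encode. The variable names are illustrative, and the sketch deliberately leaves ${^ENCODING} untouched; it only shows what kind of object gets assigned there.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# find_encoding returns an encoding object, or undef for an unknown name.
# This is the same check encoding.pm's import makes before it croaks.
my $enc = find_encoding('euc-jp');
die "Unknown encoding 'euc-jp'" unless defined $enc;

# A name that Encode does not know yields undef (hypothetical name).
my $missing = find_encoding('no-such-encoding');

# The object converts between characters and bytes in that encoding.
my $char  = "\x{5C0F}";             # one kanji, as a character string
my $bytes = $enc->encode($char);    # EUC-JP byte string
my $back  = $enc->decode($bytes);   # back to a character string

printf "name=%s bytes=%d roundtrip=%s\n",
    $enc->name, length($bytes), ($back eq "\x{5C0F}" ? "ok" : "not ok");
```

Anything that offers the same encode and decode methods could in principle stand in for $enc, which is the point made above about handling other objects.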

Let me see if it works.... Holy smoke! It does! The test script is after the sig. It is in Japanese, so save it as euc-jp if you can. Well, Charset also handles scopes (badly, because of a Filter::Simple limitation), "no Charset", and IO. But it is good to know the same thing is available in the Perl core. 14 years and still new discoveries.... I hope my married life goes that way too (well, it's definitely shorter than Perl's history).

Dan the Encoded Man

# Save me in euc-jp
# Snatched from Charset/t/1_jperl.t

use strict;
use Test::More tests => 4;
my $Debug = shift;
use encoding "euc-jp";
# use Charset "euc-jp", DEBUG => $Debug;

my $Namae = "小飼 弾";   # in Japanese, in euc-jp
my $Name  = "Dan Kogai"; # in English
my $str = $Namae; $str =~ s/小飼 弾/Dan Kogai/o;
is($str, $Name);
is(length($Namae), 4);
{
    use bytes;
    is(length($Namae), 10); # 3*3+1
    my $euc = Encode::encode('euc-jp', $Namae);
    is(length($euc),   7); # 2*3+1
}
__END__
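
The three lengths the test checks (4 characters, 10 bytes internally, 7 bytes in euc-jp) can also be reproduced without the encoding pragma. The following is a sketch under that assumption: it writes the name with \x{...} escapes so the file's own encoding does not matter, and it uses Encode::encode to get the byte strings explicitly instead of relying on "use bytes".

```perl
use strict;
use warnings;
use Encode qw(encode);

# The same name as in the test, spelled with code points:
# U+5C0F, U+98FC, a space, U+5F3E.
my $namae = "\x{5C0F}\x{98FC} \x{5F3E}";

my $chars    = length $namae;                     # characters
my $utf8_len = length encode('UTF-8',  $namae);   # 3 bytes per kanji + 1
my $euc_len  = length encode('euc-jp', $namae);   # 2 bytes per kanji + 1

printf "chars=%d utf8=%d euc-jp=%d\n", $chars, $utf8_len, $euc_len;
# prints "chars=4 utf8=10 euc-jp=7"
```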