perl-unicode

[Encode] decode('alias', $1) goes wild

2003-02-05 12:30:04
Dear Porters,

  SUZUKI Norio has reported me an interesting bug in Encode.  See this.

#!
use Encode;
$_ = "eeeee" ;
while (/(e)/g) {
        my $utf = decode('Shift_JIS', $1);
        print "position:",pos,"\n";
}
__END__

This one screws up but decode('shiftjis', $1) does work. Why? Evil 'local $_' in find_alias() in Encode::Alias. This one has been there before I took over the maintenance of Encode for the sake of this;

perldoc Encode::Alias
As a code reference, e.g.:
define_alias( sub { return /^iso8859-(\d+)$/i ? "iso-8859-$1" : undef } , '');
          In this case, $_ will be set to the name that is being
           looked up and ENCODING is passed to the sub as its
           first argument.  The example is another way to alias
           names as used in X11 fonts to the MIME names for the
           iso-8859-* family.

"my"ing that will breach this example. here is a sample script to test this feature

#!use Encode;
use Encode::Alias;
# $Encode::Alias::DEBUG = 1;
define_alias( sub { return /^isolat-(\d+)$/i ? "iso-8859-$1" : undef } , '');
print find_encoding("isolat-1")->name, "\n"; # iso-8859-1
__END__

> perl t/coderef-alias.pl
iso-8859-1
shiftjis
> perl -Mblib t/coderef-alias.pl
Can't call method "name" on an undefined value at coderef-alias.pl line 8.

However, this alias-via-coderef feature is hardly ever used and decode('alias', $1) ought to be far more common. So I will change this feature if not remove it.

The following patch removes 'local $_' so alias-via-coderef will no longer work AS DOCUMENTED. However, you can still alias-via-coderef as follows, which methinks is much cleaner.

define_alias( sub { shift =~ /^isolat-(\d+)$/i ? "iso-8859-$1" : undef } );

Dan the Encode Maintainer

P.S. I still need a little bit more time for $Encode::VERSION++. My load average is too high. Someone interested in implementing 'use less q(time)' ?

--- lib/Encode/Alias.pm 2002/10/06 03:27:02     1.34
+++ lib/Encode/Alias.pm 2003/02/05 18:32:48
@@ -20,38 +20,38 @@
 sub find_alias
 {
     my $class = shift;
-    local $_ = shift;
-    unless (exists $Alias{$_})
+    my $find = shift;
+    unless (exists $Alias{$find})
     {
-        $Alias{$_} = undef; # Recursion guard
+        $Alias{$find} = undef; # Recursion guard
        for (my $i=0; $i < @Alias; $i += 2)
        {
            my $alias = $Alias[$i];
            my $val   = $Alias[$i+1];
            my $new;
-           if (ref($alias) eq 'Regexp' && $_ =~ $alias)
+           if (ref($alias) eq 'Regexp' && $find =~ $alias)
            {
                $DEBUG and warn "eval $val";
                $new = eval $val;
-               # $@ and warn "$val, $@";
+               $DEBUG and $@ and warn "$val, $@";
            }
            elsif (ref($alias) eq 'CODE')
            {
-               $DEBUG and warn "$alias", "->", "($val)";
-               $new = $alias->($val);
+               $DEBUG and warn "$alias", "->", "($find)";
+               $new = $alias->($find);
            }
-           elsif (lc($_) eq lc($alias))
+           elsif (lc($find) eq lc($alias))
            {
                $new = $val;
            }
            if (defined($new))
            {
-               next if $new eq $_; # avoid (direct) recursion on bugs
+ next if $new eq $find; # avoid (direct) recursion on bugs
                $DEBUG and warn "$alias, $new";
my $enc = (ref($new)) ? $new : Encode::find_encoding($new);
                if ($enc)
                {
-                   $Alias{$_} = $enc;
+                   $Alias{$find} = $enc;
                    last;
                }
            }
@@ -59,14 +59,14 @@
     }
     if ($DEBUG){
        my $name;
-       if (my $e = $Alias{$_}){
+       if (my $e = $Alias{$find}){
            $name = $e->name;
        }else{
            $name = "";
        }
-       warn "find_alias($class, $_)->name = $name";
+       warn "find_alias($class, $find)->name = $name";
     }
-    return $Alias{$_};
+    return $Alias{$find};
 }

 sub define_alias

<Prev in Thread] Current Thread [Next in Thread>