perl-unicode

\p{IsBogus} vs. exception

2006-03-06 09:13:02
Porters,

One of my blog friends came across this. When you feed perl a bogus Unicode property, it raises an exception like 'Can't find Unicode property definition "Bogus" at uniprop.pl line 17.'. But this does not always happen. Here is some sample code.

#!/usr/local/bin/perl
my $count = 0;
sub p { print $count++, " : ", (map {s/\n//g; $_ } @_), "\n" } # for convenience
p eval{ ""    =~ /\p{IsBogus}/      }, $@; # no exception
p eval{ "str" =~ /\p{IsBogus}/      }, $@; # exception
p eval{ "str" =~ /^\p{IsBogus}/     }, $@; # exception
p eval{ "str" =~ /\p{IsBogus}$/     }, $@; # exception
p eval{ "str" =~ /\A\p{IsBogus}/    }, $@; # exception
p eval{ "str" =~ /\p{IsBogus}\Z/    }, $@; # exception
p eval{ "str" =~ /\b\p{IsBogus}/    }, $@; # exception
p eval{ "str" =~ /\w\p{IsBogus}/    }, $@; # exception
p eval{ "str" =~ /A\p{IsBogus}/     }, $@; # no exception
p eval{ "str" =~ /[A-Z]\p{IsBogus}/ }, $@; # no exception
__END__

It seems that any flavor of perl 5.8 and above behaves that way.

I have told my friend to use a small piece of code like this to tell whether a property name exists, but that's an ad-hoc solution.

sub has_unicode_property{
  my $prop = shift;
  my $re = qr/\p{$prop}/;
  eval { 'string' =~ $re }; # must match against non-empty string
  return $@ ? 0 : 1;
}
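
A slightly more defensive variant, as a sketch only. It assumes that some perls may reject the property when the pattern is compiled rather than when it is matched (I have not verified that here), so the qr// is moved inside the eval. It also localizes $@ so the caller's value is not clobbered. The name has_unicode_property_safe is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper; a variant of has_unicode_property above.
sub has_unicode_property_safe {
    my $prop = shift;
    local $@;    # do not clobber the caller's $@
    # The eval block yields 1 only if neither compiling nor matching died.
    my $ok = eval {
        my $re = qr/\p{$prop}/;    # assumption: may die here on some perls
        'string' =~ $re;           # must match against a non-empty string
        1;
    };
    return $ok ? 1 : 0;
}

for my $name (qw(Lu IsBogus)) {
    printf "%-8s %s\n", $name,
        has_unicode_property_safe($name) ? "known" : "unknown";
}
```

Returning the eval's own value instead of testing $@ afterwards avoids depending on what is left in $@ once the eval has finished.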

I have checked the code: the lookup is done in utf8.c (Perl_swash_init()) and lib/utf8_heavy.pl, but that's as far as I could get in a quarter of an hour or so.

Dan the Perl5 Porter
