perl-unicode

Re: \p{IsBogus} vs. exception

2006-03-06 12:20:06
On Tue, Mar 07, 2006 at 01:12:32AM +0900, Dan Kogai wrote:
Porters,

One of my blog friends came across this.  When you feed perl a
bogus Unicode property, it raises an exception like 'Can't find
Unicode property definition "Bogus" at uniprop.pl line 17.'  But this
does not always happen.  Here is some sample code.

#!/usr/local/bin/perl
my $count = 0;
sub p { print $count++, " : ", (map {s/\n//g; $_ } @_), "\n" } # for convenience
p eval{ ""    =~ /\p{IsBogus}/      }, $@; # no exception
p eval{ "str" =~ /\p{IsBogus}/      }, $@; # exception
p eval{ "str" =~ /^\p{IsBogus}/     }, $@; # exception
p eval{ "str" =~ /\p{IsBogus}$/     }, $@; # exception
p eval{ "str" =~ /\A\p{IsBogus}/    }, $@; # exception
p eval{ "str" =~ /\p{IsBogus}\Z/    }, $@; # exception
p eval{ "str" =~ /\b\p{IsBogus}/    }, $@; # exception
p eval{ "str" =~ /\w\p{IsBogus}/    }, $@; # exception
p eval{ "str" =~ /A\p{IsBogus}/     }, $@; # no exception
p eval{ "str" =~ /[A-Z]\p{IsBogus}/ }, $@; # no exception
__END__

It seems that every flavor of perl from 5.8 onward behaves that way.

Looks to me like it doesn't give an exception when the regex
has already failed before reaching the \p{IsBogus}.

So the property is only checked for validity at the point when it is
actually used.  I'm not sure it would even be desirable to check it
before then (that is, at regcomp-time), remembering that Perl is a
dynamic language.
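
If you do want to find out early whether a property name is usable, one
option is to force the lookup yourself.  What follows is only a minimal
sketch, not anything perl provides: property_is_usable is a hypothetical
helper name, and it assumes that matching a one-character string is
enough to make the regex engine consult the property.  The exact point
at which perl complains, and the wording of the message, may differ
between perl versions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of perl or of the original post):
# returns 1 if \p{NAME} can be used in a match, 0 if it raises an exception.
sub property_is_usable {
    my ($name) = @_;

    # Build the pattern as a string so that it is compiled at run time,
    # inside the eval below, rather than when this file is compiled.
    my $pattern = '\p{' . $name . '}';

    # Match against a non-empty string: with "" the match can fail
    # before the property is ever looked at, and nothing is raised.
    my $ok = eval { my $matched = ( "a" =~ /$pattern/ ); 1 };

    return $ok ? 1 : 0;
}

for my $name (qw(L IsBogus)) {
    printf "%-8s %s\n", $name,
        property_is_usable($name) ? "usable" : "not usable";
}
```

The helper does no sanitizing of the name, so it should only be given
trusted input.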
