perl-unicode

Re: Encode, charnames and utf8heavy

2002-05-01 08:45:02
On Wednesday, May 1, 2002, at 11:23 , Jarkko Hietaniemi wrote:
> perlunicode.pod and "User-defined Character Properties" already
> documents it.  I guess accepting \s+ is okay... but as I said,
> people shouldn't be doing that by hand (much).

And here is the patch that fixes this. [ \t]+ is picked instead of \s+ because \s+ is too ambiguous with Unicode (plus it catches \n and \r which it should not).
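To illustrate the \n problem, here is a minimal sketch. It is not the code in utf8_heavy.pl: parse_ranges is a hypothetical helper that only borrows the regex from the patch and lets the separator be swapped, and the sample text is made up for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of utf8_heavy.pl): collect [min, max]
# pairs from range-list text, using $sep as the separator pattern
# between the two hex numbers on a line.
sub parse_ranges {
    my ($text, $sep) = @_;
    my @ranges;
    while ($text =~ /^([0-9a-fA-F]+)(?:${sep}([0-9a-fA-F]+))?/mg) {
        my $min = hex $1;
        my $max = defined $2 ? hex $2 : $min;
        push @ranges, [$min, $max];
    }
    return @ranges;
}

# A single code point on line one, a space-separated range on line two.
my $text = "0041\n0061 007A\n";

my @strict = parse_ranges($text, qr/[ \t]+/);
my @loose  = parse_ranges($text, qr/\s+/);

# prints "[ \t]+ : 0041-0041, 0061-007A"
printf "[ \\t]+ : %s\n", join ", ", map { sprintf "%04X-%04X", @$_ } @strict;
# prints "\s+    : 0041-0061"
printf "\\s+    : %s\n", join ", ", map { sprintf "%04X-%04X", @$_ } @loose;
```

With \s+ the newline after 0041 is taken as the separator, so the first two lines collapse into one bogus range and the real range on line two is lost.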

Since Camel 3 doesn't say anything about what whitespace character(s) (is|are) okay (it merely says "like this" -- cf. p. 173), you should apply this patch for the sake of Camel 3 readers.
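For concreteness, this is the kind of hand-written property a reader might type with a space instead of a tab. It is only a sketch: InKana is the sort of name perlunicode uses in its examples, is_kana is a hypothetical wrapper added for the demonstration, and the code assumes a perl whose user-defined property parser accepts both separators, as this patch intends.

```perl
use strict;
use warnings;

# User-defined property: one range separated by a space,
# the other by a tab. Both are meant to be accepted.
sub InKana {
    return "3040 309F\n30A0\t30FF\n";
}

# Hypothetical wrapper, only here to make the check easy to call.
sub is_kana {
    my ($char) = @_;
    return $char =~ /\p{InKana}/ ? 1 : 0;
}

for my $cp (0x3042, 0x30AB, 0x0041) {
    printf "U+%04X %s\n", $cp, is_kana(chr $cp) ? "matches" : "does not match";
}
```

On such a perl the hiragana and katakana code points should match and U+0041 should not.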

$sig =~ /Dan[ \t]+the[ \t]+Perl5[ \t]+Porter/;

> diff -du lib/utf8_heavy.pl.old lib/utf8_heavy.pl
--- lib/utf8_heavy.pl.old       Mon Apr 22 08:29:37 2002
+++ lib/utf8_heavy.pl   Thu May  2 00:29:18 2002
@@ -271,7 +271,7 @@
        }
        else {
          LINE:
-           while (/^([0-9a-fA-F]+)(?:\t([0-9a-fA-F]+))?/mg) {
+           while (/^([0-9a-fA-F]+)(?:[ \t]+([0-9a-fA-F]+))?/mg) {
                my $min = hex $1;
                my $max = (defined $2 ? hex $2 : $min);
                next if $max < $start;