Two Unicode Support Issues

Greetings All,

I encountered a case where utf8 did not work as expected and thought I
should report it here.  The problem occurred with the 5_62 development
release:


#!/usr/bin/perl

use utf8;

foreach $i (a..b) {
  print "$i\n";
}

__END__


The above worked fine, of course.  It was only when I changed 'a' to
0x1200 and 'b' to 0x137C (in utf8 form) that perl spat out some "bad
character" error.  In other contexts I encountered no problems.
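
For comparison, here is a minimal sketch of a way to walk the same range
by code point number instead of by string endpoints.  It assumes a modern
perl (5.8 or later), not the development release above, and the helper
name chars_in_range is made up for illustration.  As far as I understand
perlop, the string form of ".." relies on magical increment, which only
applies to ASCII letters and digits, so numeric endpoints plus chr() seem
the safer route for a range like this one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Encode output as UTF-8 so characters above 0xFF print cleanly.
binmode(STDOUT, ':encoding(UTF-8)');

# chars_in_range is a hypothetical helper, not part of any module.
# It iterates over the code points numerically and converts each one
# to a one-character string with chr().
sub chars_in_range {
    my ($first, $last) = @_;
    return map { chr($_) } $first .. $last;
}

# Same endpoints as in the report: U+1200 through U+137C.
my @chars = chars_in_range(0x1200, 0x137C);
print "$_\n" for @chars;
```

Note that this prints every position in the range, including any that are
unassigned.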

The next issue I encountered was with \p{InEthiopic}, which gives a
positive response for anything in the range 0x1200 - 0x137F.  While this
is valid for the "Ethiopic Range" in Unicode, not everything in the range
is valid Ethiopic.  There are a number of undefined positions in the
range, around 37 or so, that I had wished to avoid.

I was led to modify the In/Ethiopic.pl script to step around the
undefined characters.  What is the policy here?  What was the original
intention of the "In" property?  I think this problem must come up often
with other scripts.
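
To make the distinction concrete, here is a rough sketch of a test for
"in the Ethiopic block and actually assigned" that does not involve
editing the property table.  It assumes a modern perl in which the
\p{Block=Ethiopic} and \p{Assigned} properties listed in perluniprops are
available; I do not know whether either exists in the development
release above.  The helper name is_assigned_ethiopic is hypothetical, and
the counts it reports depend on the Unicode version bundled with perl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# is_assigned_ethiopic is a hypothetical helper for illustration.
# It returns 1 only when the code point lies in the Ethiopic block AND
# is assigned in the Unicode version this perl was built with.
sub is_assigned_ethiopic {
    my ($cp) = @_;
    # The lookahead checks block membership without consuming the
    # character; \p{Assigned} then rejects the undefined positions.
    return chr($cp) =~ /\A(?=\p{Block=Ethiopic})\p{Assigned}\z/ ? 1 : 0;
}

my @assigned = grep { is_assigned_ethiopic($_) } 0x1200 .. 0x137F;
my $gaps     = (0x137F - 0x1200 + 1) - scalar(@assigned);
printf "%d assigned, %d unassigned in U+1200..U+137F\n",
    scalar(@assigned), $gaps;
```

Keeping the block test and the "assigned" test separate leaves the "In"
property meaning the whole range, which is how I read its current
behaviour.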

thanks,

/Daniel
