perl-unicode

Re: Automagical :text layer (was: My favorite bug to fix for 5.8.0)

2002-03-10 16:33:50
On Mon, Mar 11, 2002 at 01:27:31AM +0200, Jarkko Hietaniemi wrote:
BTW, what is a good regexp to match UTF8 bytes?  Every time I look at RFC 
2279 (or p47 of the Unicode Standard 3.0 book), I feel stupider and 
stupider that it's not clearer to me (or alternately, angrier and angrier 
that the spec-writers didn't make this clearer).  In perlpodspec, I wrote:

cut-and-paste from the latest utf8.h:

 The following table is from Unicode 3.2.

 Code Points            1st Byte  2nd Byte  3rd Byte  4th Byte

   U+0000..U+007F       00..7F   

It seems that I have something funny in utf8.h after the 7F here,
so you may see something funny, too...
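
On the original question of a regexp for UTF-8 bytes, here is a minimal sketch in Python, with one alternative per row of a well-formed byte sequence table. The byte ranges are not copied from the truncated table above; they are the ranges given for well-formed UTF-8 in later versions of the Unicode Standard and in RFC 3629, which also exclude surrogates (ED A0..BF) and anything above U+10FFFF. The names UTF8_CHAR, WELL_FORMED_UTF8 and is_well_formed_utf8 are made up for this illustration.

```python
import re

# One alternative per row of the well-formed UTF-8 table (assumed ranges,
# per RFC 3629 / later Unicode versions; not taken from utf8.h).
UTF8_CHAR = (
    rb"[\x00-\x7F]"                          # U+0000..U+007F
    rb"|[\xC2-\xDF][\x80-\xBF]"              # U+0080..U+07FF
    rb"|\xE0[\xA0-\xBF][\x80-\xBF]"          # U+0800..U+0FFF
    rb"|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}"   # U+1000..U+CFFF, U+E000..U+FFFF
    rb"|\xED[\x80-\x9F][\x80-\xBF]"          # U+D000..U+D7FF (no surrogates)
    rb"|\xF0[\x90-\xBF][\x80-\xBF]{2}"       # U+10000..U+3FFFF
    rb"|[\xF1-\xF3][\x80-\xBF]{3}"           # U+40000..U+FFFFF
    rb"|\xF4[\x80-\x8F][\x80-\xBF]{2}"       # U+100000..U+10FFFF
)

# Zero or more well-formed characters; used with fullmatch() below so the
# whole byte string has to be covered.
WELL_FORMED_UTF8 = re.compile(rb"(?:" + UTF8_CHAR + rb")*")


def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if every byte of data belongs to a well-formed sequence."""
    return WELL_FORMED_UTF8.fullmatch(data) is not None
```

The same alternation should carry over to a Perl regexp applied to a byte string more or less unchanged, since it uses only \xNN escapes, character classes and {n} counts.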

-- 
$jhi++; # http://www.iki.fi/jhi/
        # There is this special biologist word we use for 'stable'.
        # It is 'dead'. -- Jack Cohen