On Mon, Mar 11, 2002 at 01:27:31AM +0200, Jarkko Hietaniemi wrote:
> BTW, what is a good regexp to match UTF8 bytes? Every time I look at RFC
> 2279 (or p47 of the Unicode Standard 3.0 book), I feel stupider and
> stupider that it's not clearer to me (or alternately, angrier and angrier
> that the spec-writers didn't make this clearer). In perlpodspec, I wrote:
cut-and-paste from the latest utf8.h:
The following table is from Unicode 3.2.
Code Points        1st Byte  2nd Byte  3rd Byte  4th Byte
U+0000..U+007F     00..7F
It seems that I have something funny in utf8.h after the 7F here,
so you may see something funny, too...
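
For what it's worth, here is a minimal sketch of such a regexp, written in
Python rather than Perl so that it can be run on its own. Only the first row
of the table survives above, so every alternative after the 00..7F one is an
assumption: it follows the well-formed UTF-8 byte sequence table as it appears
in later versions of the Unicode Standard, not anything quoted from utf8.h.
The names WELL_FORMED_UTF8 and is_well_formed_utf8 are made up for
illustration.

```python
import re

# One alternative per row of the well-formed UTF-8 table.
# ASSUMPTION: rows after the first are taken from later versions of the
# Unicode Standard, not from the (truncated) table in this message.
WELL_FORMED_UTF8 = re.compile(
    rb"""
    (?:
        [\x00-\x7F]                      # U+0000..U+007F
      | [\xC2-\xDF][\x80-\xBF]           # U+0080..U+07FF
      | \xE0[\xA0-\xBF][\x80-\xBF]       # U+0800..U+0FFF  (no overlongs)
      | [\xE1-\xEC][\x80-\xBF]{2}        # U+1000..U+CFFF
      | \xED[\x80-\x9F][\x80-\xBF]       # U+D000..U+D7FF  (no surrogates)
      | [\xEE-\xEF][\x80-\xBF]{2}        # U+E000..U+FFFF
      | \xF0[\x90-\xBF][\x80-\xBF]{2}    # U+10000..U+3FFFF (no overlongs)
      | [\xF1-\xF3][\x80-\xBF]{3}        # U+40000..U+FFFFF
      | \xF4[\x80-\x8F][\x80-\xBF]{2}    # U+100000..U+10FFFF
    )*
    """,
    re.VERBOSE,
)


def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if the whole byte string matches the pattern above."""
    return WELL_FORMED_UTF8.fullmatch(data) is not None
```

Each alternative starts with a distinct lead-byte range, so the match never
has to backtrack between rows. The tightened second-byte ranges after E0, ED,
F0 and F4 are what exclude overlong forms, surrogates and anything above
U+10FFFF.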
--
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen