perl-unicode

Re: Weird interaction of ord, split, and substr with UTF-8?

2000-10-31 13:01:34
On Tue, 31 Oct 2000 08:57:45 -0800, Paul Hoffman 
<phoffman(_at_)proper(_dot_)com> said:

> Thanks. However, I can't find a patch at
> <http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/> that
> seems related to the bug. I searched for "utf-8 ord". Is there a
> patch number you can give me?

7164.

> Also, I'd like to distribute my code to others who probably won't
> have a patched system. Thus, I'd love to find a way, even a kludgy
> way, in 5.6.0 to split up a string into UTF-8 characters that will
> work with ord. If need be, I could even use Unicode::String,
> convert to UCS-4, slice into four-octet chunks, then convert them
> back to UTF-8, but I'd like something less ugly to show the
> public.

I'd highly recommend falling back to Unicode::String; there are too
many bugs in all perls since the model was changed from marking code
to marking strings. You do not need UCS-4 for your example: there is
$u->substr and $u->ord!
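
A minimal sketch of that approach, assuming Unicode::String is
installed from CPAN and that the input is a string of UTF-8 encoded
octets. The helper name code_points and the sample string are made up
for illustration. Unicode::String counts positions in 16-bit units, so
this sketch further assumes that every character is below U+10000.

```perl
use strict;
use warnings;
use Unicode::String qw(utf8);

# Hypothetical helper: return the code point of each character in a
# string of UTF-8 encoded octets, using the object's own substr and ord
# instead of the built-ins.
sub code_points {
    my ($octets) = @_;
    my $u = utf8($octets);          # wrap the octets in a Unicode::String
    my @points;
    for my $i (0 .. $u->length - 1) {
        my $char = $u->substr($i, 1);   # one character, still an object
        push @points, scalar $char->ord;
    }
    return @points;
}

# Sample input (an assumption): 'a', U+00E9 (two octets), U+20AC (three octets).
printf "U+%04X\n", $_ for code_points("a\xC3\xA9\xE2\x82\xAC");
```

Called in list context, $u->ord on the whole string returns all the
code points at once, so the loop is only needed when you also want
each character as its own substring.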

-- 
andreas