perl-unicode

Re: Use case for utf8::upgrade?

2010-04-07 10:44:35
* Michael Ludwig <michael(_dot_)ludwig(_at_)xing(_dot_)com> [2010-04-07 15:00]:
Having read Juerd's list of useful advice, I don't understand
the reason for its last three items:

• utf8::upgrade before doing lc/lcfirst/uc
• utf8::upgrade before doing case insensitive matching
• utf8::upgrade before matching predefined character classes
  like \w and \s

Can anyone enlighten me on the background of using
utf8::upgrade here?

Perl versions up to the upcoming 5.12.0 (I think) are buggy in
that they apply ISO-8859-1 semantics to downgraded strings and
Unicode semantics to upgraded strings, even when they contain the
same data. By upgrading your strings, you make sure that you get
Unicode semantics consistently.
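
A minimal sketch of the inconsistency, assuming a perl run without
"use locale" and without the unicode_strings feature (the variable
names are made up for illustration). It takes the same character,
U+00E9, once downgraded and once upgraded, and checks uc, \w and
case-insensitive matching on each. The downgraded results depend on
the perl version and the pragmas in effect, so the script prints them
rather than assuming them:

```perl
use strict;
use warnings;

# The same character (U+00E9, e with acute) in two internal representations.
my $down = "\xe9";      # stored as a single byte (downgraded)
my $up   = "\xe9";
utf8::upgrade($up);     # same character, now stored internally as UTF-8

# Each entry is 1 if the operation behaved according to Unicode rules.
my %result = (
    same_data   => ($down eq $up)        ? 1 : 0,  # strings compare equal
    uc_down     => (uc($down) eq "\xc9") ? 1 : 0,  # uppercasing
    uc_up       => (uc($up)   eq "\xc9") ? 1 : 0,
    word_down   => ($down =~ /\w/)       ? 1 : 0,  # predefined class
    word_up     => ($up   =~ /\w/)       ? 1 : 0,
    nocase_down => ($down =~ /\xc9/i)    ? 1 : 0,  # case-insensitive match
    nocase_up   => ($up   =~ /\xc9/i)    ? 1 : 0,
);

printf "%-12s %d\n", $_, $result{$_} for sort keys %result;
```

On an affected perl the *_up lines should all show 1 while the *_down
lines show 0, even though same_data is 1.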

Regards,
-- 
Aristotle Pagaltzis // <http://plasmasturm.org/>
