Re: Use case for utf8::upgrade?

On Wed, Apr 7, 2010 at 17:42, Aristotle Pagaltzis <pagaltzis(_at_)gmx(_dot_)de> wrote:

* Michael Ludwig <michael(_dot_)ludwig(_at_)xing(_dot_)com> [2010-04-07 15:00]:

Having read Juerd's list of useful advice, I don't understand
the reason for its last three items:

• utf8::upgrade before doing lc/lcfirst/uc
• utf8::upgrade before doing case insensitive matching
• utf8::upgrade before matching predefined character classes
  like \w and \s

Can anyone enlighten me on the background of using
utf8::upgrade here?


Perl versions up to the upcoming 5.12.0 (I think) are buggy in
that they apply ISO-8859-1 semantics to downgraded strings and
Unicode semantics to upgraded strings


This fix was withdrawn from 5.12.0.  Currently you have to "use
feature 'unicode_strings'" to get the sane behaviour in the current
lexical scope.  Current 'perldoc perlunicode' also says:

       The "use feature 'unicode_strings'" pragma is intended to
       always, regardless of platform, force Unicode semantics in
       a particular lexical scope.  In release 5.12, it is
       partially implemented, applying only to case changes.  See
       "The "Unicode Bug"" below.

This means that the utf8::upgrade() advice also applies to perl-5.12.0.
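
A minimal sketch of the lexical scoping, assuming perl 5.12 or later and
an ASCII-based platform.  The variable names are made up for
illustration.  It only exercises uc(), since case changes are the part
that perlunicode says 5.12 implements:

```perl
use strict;
use warnings;

# "\xE9" is e-acute (U+00E9).  With no "use utf8" and no character above
# 0xFF, perl stores this string downgraded (one byte per character).
my $s = "\xE9";

my ($plain, $scoped);
{
    # Feature not in scope: a downgraded string gets native (non-Unicode)
    # case rules, so uc() should leave the character unchanged.
    $plain = uc $s;
}
{
    use feature 'unicode_strings';    # only affects this block
    # Feature in scope: Unicode case rules apply even though the string
    # is still downgraded, so uc() should yield U+00C9.
    $scoped = uc $s;
}

printf "without feature: U+%04X\n", ord $plain;
printf "with feature:    U+%04X\n", ord $scoped;
```

I would expect U+00E9 for the first line and U+00C9 for the second.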

Regards,
Gisle

... even when they contain the
same data. By upgrading your strings, you make sure that you get
Unicode semantics consistently.
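
To make that concrete, here is a small sketch that runs the three
operations from Juerd's list on the same character, once downgraded and
once upgraded.  It assumes an ASCII-based platform and no "use feature
'unicode_strings'" or "use locale" in scope.  The %result hash and its
key names are only there for illustration:

```perl
use strict;
use warnings;

my $down = "\xE9";       # e-acute, stored downgraded
my $up   = $down;
utf8::upgrade($up);      # same characters, UTF-8 internal representation

# The two strings still compare equal: upgrading changes the storage,
# not the data.
print "same characters: ", ($down eq $up ? "yes" : "no"), "\n";

my %result;
for my $pair ([downgraded => $down], [upgraded => $up]) {
    my ($label, $str) = @$pair;
    $result{$label} = {
        uc_works   => (uc($str) eq "\xC9") ? 1 : 0,   # case change
        ci_match   => ($str =~ /\xC9/i)    ? 1 : 0,   # case-insensitive match
        word_class => ($str =~ /\w/)       ? 1 : 0,   # predefined class
    };
}

for my $label (qw(downgraded upgraded)) {
    printf "%-10s uc=%d /i=%d \\w=%d\n",
        $label, @{ $result{$label} }{qw(uc_works ci_match word_class)};
}
```

If the description above is right, all three checks should come out 0
for the downgraded copy and 1 for the upgraded one, even though the two
strings compare equal.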

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
