Chaim Frenkel <chaimf(_at_)pobox(_dot_)com> writes:
Would this add yet another label to all CPAN modules?
Thread: (Unsafe, Safe, Aware)
UTF8: (Unsafe, Safe, Aware)
We're turning into C again, but it may be unavoidable.
Seems to me like either way of dealing with UTF8 is going to result in
potential bugs. If we allow a global pragma, some modules will continue
to work fine with it set and others will be doing byte-level things in
regexes and other places and die badly. If we require each module to
declare its UTF8-awareness, then the ones that do will probably work fine
but most never will and your UTF8 data may have odd things happen to it
when it comes anywhere near those modules.
I find myself wanting some clear idea of how much stuff can potentially
break if a routine not written with use utf8 in mind suddenly finds itself
operating in that environment. Putting myself forward as a "typical Perl
programmer and module writer who's never had to deal with Unicode before,"
I actually have no idea what exactly utf8 will do to me. Sure, there's
documentation, but there's an additional conceptual leap needed too.
Maybe more specific perltrap-like examples.... (It's possible that I'm
just out of the loop and someone already has plenty of those.)
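
A perltrap-style sketch of the kind asked for above might look like the
following. It is only an illustration of the underlying byte-versus-character
distinction, not of what the proposed pragma would actually do: it assumes a
Perl 5 with the core Encode module, and the sample string and variable names
are made up for the example.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Assumption for illustration: the UTF-8 encoding of "cafe" with an
# e-acute, held as raw bytes. The e-acute occupies two bytes (0xC3 0xA9).
my $bytes = "caf\xC3\xA9";

# The same text decoded into a character string.
my $chars = decode('UTF-8', $bytes);

# Trap 1: length() counts bytes in one case and characters in the other.
my $byte_len = length $bytes;
my $char_len = length $chars;

# Trap 2: "grab the last unit" cuts a multi-byte character in half when
# the regex is run against the byte string.
my ($last_byte) = $bytes =~ /(.)\z/s;   # a lone continuation byte
my ($last_char) = $chars =~ /(.)\z/s;   # the whole e-acute

printf "bytes: length %d, last unit 0x%02X\n", $byte_len, ord $last_byte;
printf "chars: length %d, last unit 0x%02X\n", $char_len, ord $last_char;
```

Code that truncates, pads, or splits on a fixed number of "characters" is
where this kind of mismatch tends to show up, since the same expression gives
different answers depending on which kind of string it is handed.
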
Hopefully for threads we'll be able to come up with a threading model that
*doesn't* require us to go through all existing Perl code and classify its
MT safeness level like you have to do in C....
Russ Allbery (rra(_at_)stanford(_dot_)edu)