perl-unicode

Re: Silence “Wide character” warning globally one time

2010-07-29 18:17:29
Hi Dan,

Dan Muey wrote on 29.07.2010 at 16:59 (-0500):

I've a situation where a large code base will be outputting "byte
strings" and "unicode strings" from a number of sources.

All lumped together? This will likely mean encoding issues.

I essentially need to do 
     no warnings "utf8";

That's hiding the problem.

[ -- Problem: Unicode string gives warning -- ]

Sorry, but no, the warning is not the problem: it's an indication to
the user making him aware of the actual problem, which is printing wide
characters to a single-byte (narrow) output handle.

    perl -le 'print "Think before you code™ (bytes string)";print "Hello \x{201C}World\x{201D} (Unicode String)";'

  perl -CO -le 'print "Hello \x{201C}World\x{201D}";'

See: perldoc perlrun
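
To see the difference between the two commands without touching the terminal, here is a minimal sketch that prints the same string to two in-memory handles, one without an encoding layer and one with it, and collects any warnings. The handle and variable names are made up for illustration; they are not from Dan's code.

```perl
use strict;
use warnings;

# Collect warnings instead of letting them go to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# A character string with code points above 255.
my $text = "Hello \x{201C}World\x{201D}";

# Narrow handle: no encoding layer, so Perl has to complain.
open my $narrow, '>', \my $narrow_buf or die "open failed: $!";
print {$narrow} $text;
close $narrow;

# Handle with an encoding layer: characters are encoded on output.
open my $wide, '>:encoding(UTF-8)', \my $wide_buf or die "open failed: $!";
print {$wide} $text;
close $wide;

printf "warnings collected: %d\n", scalar @warnings;
print  "first warning: $warnings[0]" if @warnings;
```

Only the print to the handle without a layer should produce the "Wide character" warning; the handle is the thing to fix, not the warning.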

You need the equivalent of -CO in your script:

  binmode STDOUT, ':utf8';
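
As a rough sketch of what that could look like once at the top of a script (assuming your terminal or consumer expects UTF-8; treating STDERR the same way is my addition, not something -CO does):

```perl
use strict;
use warnings;

# Set the output encoding once, at program start, for the standard
# output handles. ':encoding(UTF-8)' is the stricter sibling of ':utf8'.
binmode STDOUT, ':encoding(UTF-8)';
binmode STDERR, ':encoding(UTF-8)';

# A character string with code points above 255.
my $greeting = "Hello \x{201C}World\x{201D}";

# Printed as UTF-8 bytes, without a "Wide character" warning.
print $greeting, "\n";
```

This only helps if everything you print is a character string; raw UTF-8 bytes sent through the same handle get encoded a second time.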

    perl -le 'binmode STDOUT, "utf8";print "Think before you code™ (bytes string)";print "Hello \x{201C}World\x{201D} (Unicode String)";'

You're mixing bytes and Unicode characters together. The result is, of
course, wrong: the byte string gets encoded a second time on output. Pick
either bytes or Unicode characters as your internal standard, and convert
input accordingly.
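
A minimal sketch of "convert input accordingly", assuming the byte strings in your code base are UTF-8 encoded (that is an assumption; check each source). The variable names are invented for the example:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical input: UTF-8 encoded bytes, as they might arrive from a
# file, socket or database. The last three bytes are the trademark sign.
my $bytes = "Think before you code\xE2\x84\xA2";

# Decode at the input boundary: bytes in, characters out.
my $chars = decode('UTF-8', $bytes);

# Inside the program everything is a character string, so concatenating
# with other character strings is safe.
my $line = $chars . " \x{201C}quoted\x{201D}";

# Encode at the output boundary: characters in, bytes out.
my $out = encode('UTF-8', $line);

# $out holds plain bytes, so a handle without layers prints it unchanged.
print $out, "\n";
```

Decode once on the way in and encode once on the way out; whether the encoding happens through encode() or through a layer on the handle is a matter of taste, as long as it is not both.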

Is there any super voo doo that can be done?

The best voodoo is understanding Perl's Unicode handling by reading
Juerd's pages [1], which have also been included in the docs of current
Perl versions. Read those as well, but skip the old docs: they contain
some documentation bugs.

You might also want to read the archives of this list to see how I
managed to make some progress in understanding thanks to the good
answers I got here.

[1] http://juerd.nl/perl
-- 
Michael Ludwig