E R wrote 2007-10-19 17:14 (-0500):
The problem I need to understand now is the following:
# using mod_perl 1.28
# note: binmode(STDOUT, ":utf8") has no effect
$r->print($x); # emits 1 octet
$r->print($y); # emits 2 octets
I get similar behavior when storing $y into an Oracle DB - a string of
length 2 is stored. Storing $x, however, results in a length 1 string.
These don't use a filehandle, so :utf8 or :encoding layers don't work.
That leaves two options: either use the encoding functionality provided
by the module (if any), or encode manually.
AFAIK, mod_perl does not provide transparent encoding for output.
DBD::Oracle does, but you need to enable it. (Don't ask me how; I bailed
out when I saw the complexity of Oracle's charset/encoding support.)
When doing the encoding manually, I strongly suggest that you subclass
the module in question, so that the encoding logic is not spread all
over the place. (And please release your subclass to CPAN :))
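A minimal sketch of the subclassing idea, using made-up names: My::Sink is
a hypothetical stand-in for any module whose print method writes whatever
it is given without encoding, and My::Sink::UTF8 is the subclass that
encodes first. Neither is a real mod_perl or DBI class; with a real module
you would inherit from that module instead.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a module that writes its arguments as-is.
package My::Sink;
sub new   { my ($class) = @_; return bless { out => '' }, $class }
sub print { my ($self, @data) = @_; $self->{out} .= join '', @data; return 1 }

# Subclass that encodes every argument as UTF-8 before delegating,
# so callers keep working with character strings.
package My::Sink::UTF8;
use Encode ();
our @ISA = ('My::Sink');
sub print {
    my ($self, @data) = @_;
    return $self->SUPER::print(map { Encode::encode('UTF-8', $_) } @data);
}

package main;
my $sink = My::Sink::UTF8->new;
$sink->print("\x{263A}");                    # one character, U+263A
printf "%d octets\n", length $sink->{out};   # prints "3 octets"
```

The encoding decision then lives in one place, and the rest of the code
only ever passes character strings.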
So it seems that in light of this one should always use Encode::encode with
these modules to ensure the data is represented the way you want it.
Encode::encode, Encode::encode_utf8, or utf8::encode.
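For illustration, here are the three calls side by side on a
one-character string. The variable names are made up for this example.
Note that utf8::encode changes its argument in place, while the two
Encode functions return a new string.

```perl
use strict;
use warnings;
use Encode ();

my $text = "\x{e9}";    # one character: U+00E9

my $via_encode      = Encode::encode('UTF-8', $text);
my $via_encode_utf8 = Encode::encode_utf8($text);

my $via_utf8_encode = $text;         # work on a copy ...
utf8::encode($via_utf8_encode);      # ... because this encodes in place

# Each result is the two-octet sequence C3 A9.
printf "%d %d %d\n",
    length $via_encode, length $via_encode_utf8, length $via_utf8_encode;
```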
Stated another way: if you use a module which converts a Perl string to
an octet sequence, and there is no provision for specifying an encoding,
that should be a red flag that you need to encode the string before you
send it to the module.
Well stated. I have collected a summary at http://juerd.nl/perluniadvice
that is neither complete nor accurate, but it provides more information
than most documentation does. Unfortunately I lack tuits to send bug
reports and make patches.
--
Met vriendelijke groet, Kind regards, Korajn salutojn,
Juerd Waalboer: Perl hacker <#####(_at_)juerd(_dot_)nl>
<http://juerd.nl/sig>
Convolution: ICT solutions and consultancy
<sales(_at_)convolution(_dot_)nl>