I had the same problem, and worked around it by using _utf8_on() from
Encode on the MySQL query results. In my version, it was not exported
by default, so I added '_utf8_on' to @EXPORT.
However, the Encode documentation states that _utf8_on() is an internal
function, and "Do not use unless you know that the STRING is well-formed
UTF-8."
Is there a better way to do this?
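
For reference, a minimal sketch of the workaround next to a checked
alternative. It assumes the query result is a plain byte string; the
sample bytes and variable names are illustrative only. Calling
Encode::_utf8_on() by its fully qualified name should avoid having to
edit @EXPORT, and decode() with FB_CROAK validates the bytes instead of
trusting them.

```perl
use strict;
use warnings;
use Encode ();

# Illustrative input: the UTF-8 encoding of U+03B1 (greek alpha).
my $bytes = "\xCE\xB1";

# Workaround: turn the UTF-8 flag on in place. Nothing checks the bytes,
# so this is only safe when they are known to be well-formed UTF-8.
my $flagged = $bytes;
Encode::_utf8_on($flagged);

# Checked alternative: decode() validates the input and dies on
# malformed data. LEAVE_SRC stops decode() from consuming $bytes.
my $decoded = Encode::decode('UTF-8', $bytes,
                             Encode::FB_CROAK | Encode::LEAVE_SRC);

print "same string\n" if $flagged eq $decoded;
```

For well-formed input both routes give the same character string; they
differ only in what happens when the input is malformed.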
Also, as a suggestion to the authors/documenters of Encode: it would
be helpful to have more explanation of the UTF-8 flag: how and why it
works, the functions that manipulate it, and warnings about common
problems such as the current one.
Mark
-----Original Message-----
From: Brigitte Jellinek [mailto:bjelli(_at_)horus(_dot_)at]
Sent: 2003 06 16 8:37
To: perl-unicode(_at_)perl(_dot_)org
Subject: i know it's utf-8, how can i force perl to see it that way
hi!
i'm trying to use perl + dbi + dbd::mysql + mysql with unicode.
as far as i can tell i can write a utf8 string into the database,
and get back the same sequence of bits, only now it's a 'classical'
perl-string, not flagged as utf-8.
the string i write into the db is 6 characters long:
"ABc\N{greek:alpha}\x{00df}\N{cyrillic:e}"
character          unicode (hex)  utf8 (binary)
A                  0041           01000001
B                  0042           01000010
c                  0063           01100011
greek alpha        03B1           11001110 10110001
german scharfes s  00DF           11000011 10011111
cyrillic e         044D           11010001 10001101
what i get back from the db is
character  binary
A          01000001
B          01000010
c          01100011
?          11001110
?          10110001
?          11000011
?          00111111
?          11010001
?          00111111
I have tried to convert this using
$new = decode_utf8( $fromdb );
but all i get is an empty string. is there
some way to find out *why* this won't decode?
or is my debugging stuff that shows me the bits in the
string just wrong:
use Encode qw(is_utf8 encode_utf8);

# returns one line per character: char=HEX=binary
sub showbits
{
    my ($str) = @_;
    my $utf = is_utf8($str);
    # "U*" yields code points of a flagged string, "C*" yields raw bytes
    my $template = $utf ? "U*" : "C*";
    my $result = "";
    my $i = 0;
    foreach my $code ( unpack($template, $str) )
    {
        my $char = substr( $str, $i, 1 );
        my $bits;
        if ( $utf and $code > 127 ) {
            # multi-byte character: show the bits of its UTF-8 encoding
            $bits = unpack("B*", encode_utf8($char));
        }
        else {
            # single byte: show all eight bits
            $bits = sprintf("%08b", $code);
        }
        $result .= "\n" . $char . "=" . sprintf("%04X", $code) . "=" . $bits;
        $i++;
    }
    return $result;
}
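
On the question of why the decode fails: in the listing of what came
back from the db, the second byte of two sequences is 00111111 rather
than a continuation byte (10xxxxxx), so the data is no longer
well-formed UTF-8. The sketch below is one way to let Encode report
this. It assumes the returned bytes are exactly those listed;
first_bad_offset is a hypothetical helper name, not part of Encode.

```perl
use strict;
use warnings;
use Encode ();

# The bytes as listed above for the string read back from the db.
my $fromdb = "ABc\xCE\xB1\xC3\x3F\xD1\x3F";

# Hypothetical helper: returns undef if $bytes is well-formed UTF-8,
# otherwise the byte offset of the first sequence that fails to decode.
sub first_bad_offset {
    my ($bytes) = @_;
    # With FB_QUIET, decode() stops at the first error and leaves the
    # unprocessed remainder in its argument, so work on a copy.
    my $rest = $bytes;
    Encode::decode('UTF-8', $rest, Encode::FB_QUIET);
    return length($rest) ? length($bytes) - length($rest) : undef;
}

my $offset = first_bad_offset($fromdb);
if (defined $offset) {
    printf "malformed at byte %d: 0x%02X\n",
        $offset, ord(substr($fromdb, $offset, 1));
}

# With FB_CROAK, decode() dies and the message names the offending byte.
my $copy = $fromdb;
my $ok = eval { Encode::decode('UTF-8', $copy, Encode::FB_CROAK); 1 };
print "decode failed: $@" unless $ok;
```

A check like this only shows where the data stops being valid UTF-8; it
does not recover the bytes that were replaced on the way through the
database.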
--
Brigitte 'I never met a chocolate I didnt like' Jellinek
bjelli(_at_)horus(_dot_)com
http://www.horus.com/~bjelli/
http://perlwelt.horus.at http://www.perlmonks.org/index.pl?node=bjelli