perl-unicode

perl, unicode and databases (mysql)

2002-08-13 05:30:05
Hi all,

I have a Perl application (Perl 5.8.0) which stores UTF-8 data in a MySQL
database. This seems to work pretty well, and retrieving the data with Perl
also works, using something like this:

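my $sth = $db_handle->prepare("SELECT some query");
$sth->execute;
my @row = $sth->fetchrow_array;
print $row[0]."\n"; #### print before
if ($] > 5.007) {
  # Encode is not available on older perls, so load it only when it can be there
  require Encode;
  Encode::_utf8_on($row[0]);  # set the UTF-8 flag on the fetched value
}
print $row[0]."\n"; #### print after
$sth->finish;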

Encode::_utf8_on gives me back good data. As far as I understand, the
_utf8_on function doesn't do any real conversion, but only switches on the
UTF-8 flag of a Perl string?
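
A minimal sketch of that behaviour, assuming the driver hands back the raw
UTF-8 bytes of a single "é" (the sample bytes are an assumption, not data from
my database). The bytes stay the same; only the way Perl counts and reads them
changes:

```perl
use strict;
use warnings;
use Encode ();

# Assumed sample: the two UTF-8 bytes of "é" (U+00E9), unflagged.
my $bytes      = "\xC3\xA9";
my $before_len = length $bytes;      # counted as individual bytes

my $flagged = $bytes;                # work on a copy
Encode::_utf8_on($flagged);          # flips the flag only, rewrites nothing
my $after_len  = length $flagged;    # same buffer, now read as characters
my $code_point = ord $flagged;

printf "before: %d, after: %d, code point: U+%04X\n",
    $before_len, $after_len, $code_point;
```

So the "before" and "after" prints can differ even though no conversion took
place.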

If you compare the two prints in the above example, it seems that after the
UTF-8 flag is set the string is UTF-8 decoded. This results in the correct
string, so it seems the original string is UTF-8 encoded (double encoded,
since it already was UTF-8).

When I select the same string manually (mysql prompt) or with PHP, I get
back the double encoded string. So it seems to me that the double encoded
format is how Perl stores it internally (and also in the database)? But this
doesn't sound right to me... it would mean that every time a UTF-8 flagged
string is used it would need to be UTF-8 decoded. That doesn't sound very
efficient to me, so I doubt it's done that way. But meanwhile I have no idea
how it is done... and how it's stored in the database.
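
To tell single encoding from double encoding, it may help to look at the bytes
in hex. The sketch below is only an illustration with an assumed sample
character ("é"); it shows what one and two rounds of UTF-8 encoding produce,
which could then be compared against a hex dump of the stored value:

```perl
use strict;
use warnings;
use Encode qw(encode);

my $char  = "\x{E9}";                 # assumed sample: the character "é"
my $once  = encode('UTF-8', $char);   # one round of encoding
my $twice = encode('UTF-8', $once);   # second round: each byte is treated
                                      # as a Latin-1 character and encoded again

my $once_hex  = unpack 'H*', $once;
my $twice_hex = unpack 'H*', $twice;

print "encoded once:  $once_hex\n";   # c3a9
print "encoded twice: $twice_hex\n";  # c383c2a9
```

If the database holds c3a9 for an "é", the data is plain UTF-8; if it holds
c383c2a9, it was encoded twice somewhere on the way in.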

As you might have guessed, I want to access the data I put in the database
with PHP, but I get back double UTF-8 encoded data there. The problem could
be in a lot of different places: for example my fetching in PHP, my storing
in Perl, or somewhere else where I have a faulty conversion. To check whether
the data in the database is correct, I tried to figure out how Perl works
with the data.

Maybe someone could put me on the right track, because this got me mighty
confused ;-)

Oh yeah, one other thing: since Encode::_utf8_on is an internal function,
wouldn't it be better to use Encode::decode("utf8",$row[0]) instead? As far
as I can see, it should do exactly the same, but if I am mistaken, let me
know :)
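
A small comparison of the two, as a sketch with made-up input rather than real
query results. For well-formed UTF-8 bytes both give the same string; the
difference I can see is that decode actually inspects the bytes, so a malformed
byte comes back as the replacement character U+FFFD instead of being flagged
blindly:

```perl
use strict;
use warnings;
use Encode ();

# Well-formed input (assumed sample: UTF-8 bytes of "é").
my $good    = "\xC3\xA9";
my $decoded = Encode::decode("utf8", $good);

my $flagged = $good;                 # copy, then flag without checking
Encode::_utf8_on($flagged);

my $same = ($decoded eq $flagged) ? 1 : 0;
print "same result for valid input: ", ($same ? "yes" : "no"), "\n";

# Malformed input: a lone 0xE9 byte is not valid UTF-8.
my $bad     = "\xE9";
my $checked = Encode::decode("utf8", $bad);
my $subst   = ord $checked;          # decode substitutes U+FFFD by default
printf "malformed byte decodes to U+%04X\n", $subst;
```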

Thank you,

Merijn van den Kroonenberg

e-factory bv
Tel.: +31 (0)475 - 340 975
Fax: +31 (0)475 - 320 351
Web: www.e-factory.nl

