Honorable Perl Hackers,
I'm really desperate and I must apologize for posting without lurking.
My problem is as follows:
I'm retrieving data from a PostgreSQL database that is encoded in what
PostgreSQL calls UNICODE, which, I assume, means UTF8. At least, if I do
nothing, I see two weird characters in place of each Norwegian letter I
expect... :-)
I'm using the really great module Postscript::MailLabels to generate mail
labels on the basis of this data, but Postscript::MailLabels needs Latin1
input.
So, I need to translate from UTF8 to Latin1.
I've found many modules on CPAN to do this, but I can't get any of them to
do what I want... I think I've missed something conceptually important
(and I wish I hadn't gotten into this messy project where I get too little
time to sit down and learn things).
I've tried Unicode::MapUTF8, Unicode::Map8 and Unicode::Map, passing the
UTF8-encoded string to the method that I thought would convert it. Most
of the time, the Norwegian characters just disappear; sometimes the whole
string disappears. But perhaps I should somehow declare that the string
_is_ UTF8 before trying to convert it...?
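For concreteness, here is a minimal sketch of the conversion I am after,
assuming the Encode module bundled with Perl 5.8 and assuming the database
hands back raw UTF8 bytes (not strings Perl already treats as characters).
The sample string is a made-up stand-in for my data, and I have not wired
this into the real script:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical stand-in for a value fetched from the database:
# the UTF-8 bytes for "Bjørn" (ø is the two bytes 0xC3 0xB8).
my $utf8_bytes = "Bj\xC3\xB8rn";

# Step 1: declare that the bytes are UTF-8, giving a character string.
my $chars = decode('UTF-8', $utf8_bytes);

# Step 2: turn the characters into Latin1 bytes (ø becomes 0xF8).
my $latin1_bytes = encode('iso-8859-1', $chars);

printf "%d bytes in, %d bytes out\n",
    length($utf8_bytes), length($latin1_bytes);
```

The decode step is the "declare that it _is_ UTF8" part; the encode step
is the actual translation to Latin1.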
My latest attempt is to use Unicode::String, something like this:
    use Unicode::String qw(utf8 utf16 latin1);
    Unicode::String->stringify_as("utf8");
    [snip lots of other stuff]
    my $us = Unicode::String->new();
    my $tmp = $us->utf8(${$data}{$kid}{'navn'});
    $navn = $us->latin1(${$data}{$kid}{'navn'});
${$data}{$kid}{'navn'} is the string which contains the UTF8-coded data.
This apparently only removes the Norwegian characters. At least I can't
see them in any of my output.
What am I doing wrong?
I have also been experimenting with setting the PostgreSQL client to use
an encoding, e.g.:
    $rv = $dbh->do("SET CLIENT_ENCODING TO 'LATIN1';");
This seems to result in 7-bit text, as e.g. ø (0xF8 in Latin1) is
converted to x (0x78).
Funnily, I get the same result with
    $rv = $dbh->do("SET CLIENT_ENCODING TO 'UNICODE';");
Even stranger, if I do \encoding on the psql command line (the output
from this client in the terminal shows Norwegian letters correctly), it
says SQL_ASCII or something... Huh?
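To illustrate why I call the ø-to-x result "7-bit text": clearing the
eighth bit of the Latin1 byte for ø gives exactly the byte for x. The
snippet below only demonstrates that arithmetic; that something in my
setup really strips the high bit is my assumption, not something I have
confirmed:

```perl
use strict;
use warnings;

# ø is the single byte 0xF8 in Latin1.
my $latin1_oslash = "\xF8";

# Clear the eighth (high) bit: 0xF8 & 0x7F == 0x78.
my $stripped = chr(ord($latin1_oslash) & 0x7F);

print "$stripped\n";    # prints "x"
```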
My box is a simple laptop running RH 8.0 with a 2.4.20 Linux kernel, so
my Perl installation is v5.8.0.
O, hackers of great wisdom, how do I do this correctly?
Yours Confusedly,
Kjetil
--
Kjetil Kjernsmo
Recent astrophysics graduate, University of Oslo, Norway
E-mail: kjetikj(_at_)astro(_dot_)uio(_dot_)no
Homepage <URL:http://folk.uio.no/kjetikj/>
Webmaster(_at_)skepsis(_dot_)no
OpenPGP KeyID: 6A6A0BBC

Problems worthy of attack
Prove their worth by hitting back
  - Piet Hein