At 5:14 pm +0200 2/2/04, Alexander N. Treyner wrote:
Hello All,
I'm using a UTF-8 Postgres database, where I save strings in many languages.
I have to match the database against strings encoded in MIME base64 or
quoted-printable format. For example:
=?utf-8?B?15TXoNeUINee16nXlNeZINeR16LXkdeo15nXqi4=?=
or
=?KOI8-R?Q?=F0=D2=C9=D7=C5=D4=2C_=ED=C9=D2!!!?=
I think that I first need to convert these strings to UTF-8, but I cannot
find out how to do it.
The script further down handles the two cases you mention, though I
think you will need to elaborate the regular expression; I've
taken it only to the point where it copes with your examples. In
this case Encode accepts both 'utf-8' and 'KOI8-R', not just its
own names 'utf8' and 'koi8-r', so I think a reading of perldoc
Encode will show that dashes and case are handled properly in most
cases.
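As an aside, the alias handling can be checked directly. This is a minimal sketch, assuming only that Encode is installed; the list of names is just the ones from this thread, and the canonical name reported for 'utf-8' may differ between Encode versions.

```perl
use strict;
use warnings;
use Encode;

# find_encoding() returns an encoding object, or undef for an unknown name
for my $name ( 'utf-8', 'KOI8-R', 'koi8-r' ) {
    my $enc = Encode::find_encoding($name);
    print "$name => ", ( $enc ? $enc->name : 'not recognised' ), "\n";
}
```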
use Encode;
use MIME::Base64;
use MIME::QuotedPrint;
my $string;
$_ = <<_;
=?utf-8?B?15TXoNeUINee16nXlNeZINeR16LXkdeo15nXqi4=?=
_
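# An encoded word looks like =?charset?encoding?text?= and no part contains "?"
/=\?([^?]+)\?([BbQq])\?([^?]*)\?=/ or die "No encoded word found\n";
my ( $charset, $encoding, $_7bit ) = ( $1, uc($2), $3 );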
if ( $encoding eq 'B' ) { $string = decode_base64 $_7bit }
if ( $encoding eq 'Q' ) {
    $_7bit =~ tr/_/ /;    # in the Q encoding "_" stands for a space
    $string = decode_qp $_7bit;
}
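# from_to converts $string in place and returns undef on failure; it does not set $!
defined( Encode::from_to( $string, $charset, "utf8" ) )
  or die "Cannot convert from $charset to utf8\n";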
print $string;
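
If a line can hold more than one encoded word, the same idea can be wrapped in a substitution. The following is only a sketch under a few assumptions: decode_words is a hypothetical helper name of my own, the text outside the encoded words is taken to be plain ASCII, and an unknown charset simply makes Encode::decode die.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use MIME::Base64 qw(decode_base64);
use MIME::QuotedPrint qw(decode_qp);

# decode_words() is a hypothetical helper, not part of any module.
# It replaces every =?charset?B-or-Q?text?= word in a line with its
# decoded text and returns the whole line as UTF-8 bytes.
sub decode_words {
    my ($line) = @_;

    # Whitespace between two adjacent encoded words is not part of the text
    $line =~ s/(\?=)\s+(?==\?)/$1/g;

    $line =~ s{=\?([^?]+)\?([BbQq])\?([^?]*)\?=}{
        my ( $charset, $encoding, $text ) = ( $1, uc($2), $3 );
        my $bytes;
        if ( $encoding eq 'B' ) {
            $bytes = decode_base64($text);
        }
        else {
            $text =~ tr/_/ /;    # "_" stands for a space in the Q encoding
            $bytes = decode_qp($text);
        }
        # Go through Perl's internal characters, then out again as UTF-8
        encode( 'UTF-8', decode( $charset, $bytes ) );
    }ge;

    return $line;
}

print decode_words('=?KOI8-R?Q?=F0=D2=C9=D7=C5=D4=2C_=ED=C9=D2!!!?='), "\n";
```

The result is a string of UTF-8 bytes, so it can be compared with what comes out of the database as long as both sides are treated as bytes.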