perl-unicode

Re: How to convert base64 string to utf-8

2004-02-02 13:30:04

At 5:14 pm +0200 2/2/04, ALexander N. Treyner wrote:

Hello All,
I'm using a utf-8 Postgres database, where I store strings in many languages.
I have to match the database against strings encoded in MIME base64 or quoted-printable format, such as:
=?utf-8?B?15TXoNeUINee16nXlNeZINeR16LXkdeo15nXqi4=?=
or
=?KOI8-R?Q?=F0=D2=C9=D7=C5=D4=2C_=ED=C9=D2!!!?=

I think I first need to convert these strings to utf-8, but I cannot find out how to do it.


The script below handles the two cases you mention, though I think you would need to elaborate the regular expression; I've taken it only to the point where it copes with your examples. Encode accepts both 'utf-8' and 'KOI8-R' as written, rather than only its own names 'utf8' and 'koi8-r', so I think a reading of perldoc Encode will show that dashes and case are interpreted properly in most cases.
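
As a quick check of the point about charset names, separate from the script itself, here is a minimal sketch that asks Encode which encoding each label resolves to. The helper name canonical_name is made up for illustration; find_encoding and its name method are part of Encode. The exact name reported for 'utf-8' may differ between Encode versions, so the sketch prints whatever it finds and assumes nothing about it.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# canonical_name is a hypothetical helper, not part of Encode.
# It returns the name Encode uses internally for a charset label,
# or undef when Encode does not recognise the label.
sub canonical_name {
    my ($label) = @_;
    my $enc = find_encoding($label);
    return defined $enc ? $enc->name : undef;
}

for my $label ( 'utf-8', 'UTF-8', 'KOI8-R', 'koi8-r', 'no-such-charset' ) {
    my $name = canonical_name($label);
    printf "%-16s => %s\n", $label, defined $name ? $name : '(unknown)';
}
```

If a label from a MIME header comes back as unknown, the conversion step will fail for it, so it is worth checking the labels you expect to meet before relying on them.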



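use strict;
use warnings;
use Encode;
use MIME::Base64;
use MIME::QuotedPrint;
my $string;
$_ = <<_;
=?utf-8?B?15TXoNeUINee16nXlNeZINeR16LXkdeo15nXqi4=?=
_
# Split the encoded word into charset, encoding (B or Q) and payload.
/=\?([^?]+)\?([BQ])\?([^?]+)\?=/i or die "No MIME encoded word found\n";
my ( $charset, $encoding, $_7bit ) = ( $1, uc($2), $3 );

if ( $encoding eq 'B' ) { $string = decode_base64 $_7bit }
if ( $encoding eq 'Q' ) {
    $_7bit =~ s/_/=20/g;    # in Q encoding an underscore stands for a space
    $string = decode_qp $_7bit;
}

# from_to converts $string in place; it returns the new length in octets,
# or undef on error, so test for definedness rather than truth.
defined Encode::from_to( $string, $charset, "utf8" )
    or die "Cannot convert from $charset\n";
print $string;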
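
One way the regular expression might be elaborated, as a sketch only: wrap the same steps in a substitution so that every encoded word in a string is replaced, whichever of the two encodings it uses. The sub name decode_words is made up for illustration. The sketch assumes the text outside the encoded words is plain ASCII, and it does not implement the RFC 2047 rule that whitespace between two adjacent encoded words is dropped.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use MIME::Base64 qw(decode_base64);
use MIME::QuotedPrint qw(decode_qp);

# decode_words is a hypothetical helper name. It replaces every
# =?charset?B?...?= or =?charset?Q?...?= word in $text with the
# UTF-8 encoded bytes of the decoded text and returns the result.
sub decode_words {
    my ($text) = @_;
    $text =~ s{=\?([^?]+)\?([BbQq])\?([^?]*)\?=}{
        my ( $charset, $type, $payload ) = ( $1, uc($2), $3 );
        my $bytes;
        if ( $type eq 'B' ) {
            $bytes = decode_base64($payload);
        }
        else {
            $payload =~ s/_/=20/g;    # underscore means space in Q encoding
            $bytes = decode_qp($payload);
        }
        # Go through Perl's internal character form, then out to UTF-8.
        encode( 'UTF-8', decode( $charset, $bytes ) );
    }ge;
    return $text;
}

print decode_words('=?utf-8?B?15TXoNeUINee16nXlNeZINeR16LXkdeo15nXqi4=?='), "\n";
print decode_words('=?KOI8-R?Q?=F0=D2=C9=D7=C5=D4=2C_=ED=C9=D2!!!?='),       "\n";
```

Encode also ships a 'MIME-Header' encoding in more recent versions, so decode( 'MIME-Header', $text ) may cover the same ground; perldoc Encode::MIME::Header will say what the installed version supports.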