perl-unicode

Re: Problem processing UTF-8 strings from email

2008-01-12 17:42:18
On 1/12/08, Neil Gunton <neil(_at_)nilspace(_dot_)com> wrote:

I am somewhat experienced with Perl in general, but have absolutely no
experience dealing with UTF-8. I have a community journals website which
allows updates from users via email. I'm having trouble with emails that
contain Chinese characters encoded (I think) as UTF-8. The strings look
like this:

=?UTF-8?B?5qGQ5LmhLCBUb25neGlhbmc6IEJlaW5nIGEgJ2hhbg==?= =?UTF-8?B?dHUn?=

When I read this text from a file, using my perl script, and then save
it into MySQL, it comes out on the website looking literally like the
above. I can't seem to get perl to "do" anything with it in terms of
conversions to a format that looks like Chinese characters when
displayed on the Web.

  use Encode;
  use Encode::MIME::Header;
  my $string = decode("MIME-Header", $bytes);

This returns the decoded Unicode string for these MIME-encoded words.
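
As a rough sketch of the whole round trip, here is the sample header from this thread being decoded and then re-encoded as UTF-8 bytes for output. The variable names ($header, $subject, $utf8) are just for illustration, and whether your database wants UTF-8 bytes or something else depends on your connection settings, which I am only assuming here.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use Encode::MIME::Header;

# Raw header bytes exactly as they appear in the message (sample from this thread).
my $header = q{=?UTF-8?B?5qGQ5LmhLCBUb25neGlhbmc6IEJlaW5nIGEgJ2hhbg==?= =?UTF-8?B?dHUn?=};

# decode() returns a Perl character string; the two Chinese characters
# at the start come out as U+6850 and U+4E61.
my $subject = decode("MIME-Header", $header);

# Re-encode to UTF-8 bytes before printing or storing the value
# (assumption: the destination expects UTF-8).
my $utf8 = encode("UTF-8", $subject);
print $utf8, "\n";
```

The point is to keep the decoded character string inside the program and only turn it back into bytes at the output boundary.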

Does anybody have any clues as to how to convert strings like this into
something more usable - e.g. HTML character entities?

If you want to turn them into HTML entities, you can say:

  encode("ascii", decode("MIME-Header", $bytes), Encode::FB_HTMLCREF);
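
A minimal sketch of that, using a shortened header that holds only the first two characters of the sample above (the shortened header and the variable names are mine, not from the original message):

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use Encode::MIME::Header;

# Shortened sample: only the two Chinese characters from the subject.
my $short = q{=?UTF-8?B?5qGQ5Lmh?=};

# Decode to characters, then encode to ASCII; anything outside ASCII
# is replaced by a decimal character reference (&#NNNN;).
my $chars = decode("MIME-Header", $short);
my $html  = encode("ascii", $chars, Encode::FB_HTMLCREF);

print "$html\n";   # prints &#26704;&#20065;
```

Note that this only converts characters that do not fit in ASCII. Characters such as < and & pass through unchanged, so you still need to escape those separately if the text goes into an HTML page.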

HTH

-- 
Tatsuhiko Miyagawa
