perl-unicode

Re: Does LWP know anything (or need to know anything) about Unicode?

2004-10-11 02:30:06
Rick Measham <rickm(_at_)3d3(_dot_)com> writes:
G'day Unicode Gurus and other assorted members of the perl Unicode
community.

I have a script that attempts to collect translations from Babelfish.
I've posted it below.

It uses LWP::UserAgent to turn an English phrase into Japanese (or any
other language supported by Babelfish).*

However, once I get the translation out of the page it appears to be
full of null bytes. I've tried various things like Unicode::String or
Encode, but to no avail. 

LWP, I believe, just ships octets about.

But it should have a mechanism to tell you the meta-data that
HTTP marked those octets with. In this case there should
be something like a charset parameter on the Content-Type header that
tells _you_ what name to feed to Encode to turn the bytes into Unicode.
You then have to decide how you are going to present the resulting 
characters in the HTML you are generating. You probably want 
to re-encode as UTF-8 if presenting mixed languages.
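
In HTTP the label for the octets normally arrives as the charset
parameter of the Content-Type header. Here is a rough sketch of the
decode-then-re-encode step using only Encode. The helper names
charset_from_content_type and octets_to_characters are made up for
illustration, the ISO-8859-1 fallback is an assumption, and the
header and body are hard-coded stand-ins for what LWP would return.
A Japanese page would more likely be labelled Shift_JIS, EUC-JP or
UTF-8, but the steps are the same.

```perl
use strict;
use warnings;
use Encode qw(encode find_encoding);

# Hypothetical helper: pull the charset parameter out of a
# Content-Type header value, or return the supplied default.
sub charset_from_content_type {
    my ($content_type, $default) = @_;
    if (defined $content_type
        && $content_type =~ /;\s*charset\s*=\s*"?([^\s";]+)"?/i) {
        return $1;
    }
    return $default;
}

# Hypothetical helper: turn raw octets into Perl characters, using
# the charset named in the header. Falling back to ISO-8859-1 is an
# assumption, not something the server promised.
sub octets_to_characters {
    my ($content_type, $octets) = @_;
    my $charset  = charset_from_content_type($content_type, 'ISO-8859-1');
    my $encoding = find_encoding($charset)
        or die "Unknown charset '$charset'\n";
    return $encoding->decode($octets);
}

# Stand-ins for $response->header('Content-Type') and
# $response->content from an LWP::UserAgent request.
my $content_type = 'text/html; charset=ISO-8859-1';
my $octets       = "fran\xE7ais";    # one byte per character

my $characters  = octets_to_characters($content_type, $octets);
my $utf8_octets = encode('UTF-8', $characters);

# Character count stays the same; the byte count grows because the
# c-cedilla takes two bytes in UTF-8.
print length($characters), " characters, ",
      length($utf8_octets), " UTF-8 bytes\n";
```

If the page you generate is served as UTF-8, say so in its own
Content-Type header or meta tag so the browser does not have to guess.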

