Re: Variation In Decoding Between Encode and XML::LibXML

perl-unicode

2010-06-18 15:41:17

from [David E. Wheeler]

On Jun 18, 2010, at 12:05 AM, John Delacour wrote:

> In this case all talk of iso-8859-1 and cp1252 is a red herring.  I read
> several Italian websites where this same problem is manifest in external
> material such as ads.  The news page proper is encoded properly and declared
> as utf-8 but I imagine the web designers have reckoned that the stuff they
> receive from the advertisers is most likely to be received as windows-1252
> and convert accordingly rather than bother to verify the encoding.  As a
> result material that is received as utf-8 will undergo a superfluous encoding.
>
> Here's a way to get the file in question properly encoded:


Yep, that works for me, too. I guess XML::LibXML isn't using Encode in the same
way to decode content, as it returns the string with the characters as
\x{c4}\x{8d}, that is, the two UTF-8 bytes as separate characters rather than
one decoded character.
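
For anyone following along, here is a rough sketch of the "superfluous encoding" described above and of one way to undo it. It is written in Python purely for illustration and is not the code from this thread; in Perl, Encode's decode and encode would play the same roles. The function names are made up, and it assumes the stray step treated the UTF-8 bytes as Latin-1 (Python's strict cp1252 codec rejects byte 0x8D, so Latin-1 stands in here).

```python
def double_encode(text: str) -> bytes:
    """Simulate the damage: UTF-8 bytes are misread as Latin-1 and
    then encoded as UTF-8 a second time. (Hypothetical helper.)"""
    utf8_bytes = text.encode("utf-8")        # correct UTF-8 bytes
    misread = utf8_bytes.decode("latin-1")   # each byte becomes one character
    return misread.encode("utf-8")           # the superfluous second encoding


def repair(data: bytes) -> str:
    """Undo one superfluous layer of encoding. (Hypothetical helper.)"""
    once = data.decode("utf-8")              # yields the characters U+00C4 U+008D
    return once.encode("latin-1").decode("utf-8")


original = "\u010d"                          # the character whose UTF-8 form is C4 8D
damaged = double_encode(original)

# Decoding only once leaves two characters, \x{c4}\x{8d}, as seen above.
half_decoded = damaged.decode("utf-8")
print([hex(ord(c)) for c in half_decoded])   # ['0xc4', '0x8d']

print(repair(damaged) == original)           # True
```

The repair only makes sense when the data really has been encoded twice; applied to correctly encoded text it would either raise an error or corrupt it, so it is worth checking the result before trusting it.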

Thanks for the help, everyone. I've got my code parsing all my feeds and 
emitting a valid UTF-8 feed of its own now.

Best,

David
