perl-unicode

Re: Website encoding

2004-11-19 02:30:08
Rick Measham <rickm(_at_)3d3(_dot_)com> writes:
That being the case, I grab the charset and use Encode's decode function
to turn it into 'perl's internal format' .. which in 5.8.5 is utf8
right? 

As it happens the answer is "maybe", but since it is the _internal_ form
it is none of your business ;-) - so pretend you know nothing about how
it all works and convert the internal form to UTF-8 explicitly.
(But this will be efficient if the string is internally in that form :-))
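
A minimal sketch of that round trip, assuming the page bytes really are in
the charset it declares. The sample string and the variable names are made
up for illustration; only decode() and encode() come from Encode.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical input: raw bytes as fetched from a page that declares
# ISO-8859-1 ("München", with u-umlaut as the single byte 0xFC).
my $raw_bytes = "M\xFCnchen";

# Bytes -> characters, using the charset the page declared.
my $chars = decode('ISO-8859-1', $raw_bytes);

# Characters -> UTF-8 bytes, done explicitly before storing, without
# relying on whatever the internal representation happens to be.
my $utf8_bytes = encode('UTF-8', $chars);

printf "%d characters, %d UTF-8 bytes\n",
    length($chars), length($utf8_bytes);
# prints "7 characters, 8 UTF-8 bytes"
```

It is $utf8_bytes, not $chars, that would go into the db.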

I then store that in the db.

When you get it back from the db you need to convert it from UTF-8
to perl's internal form. Again this is trivial.
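
Roughly like this, assuming the db hands back the same UTF-8 bytes that were
stored (the value below is a made-up stand-in for a fetched column):

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical value as read back from the database: UTF-8 encoded bytes
# for "Grüße" (u-umlaut = C3 BC, sharp s = C3 9F).
my $from_db = "Gr\xC3\xBC\xC3\x9Fe";

# UTF-8 bytes -> characters in perl's internal form.
my $text = decode('UTF-8', $from_db);

# Let the output layer encode the characters again on the way out.
binmode(STDOUT, ':encoding(UTF-8)');
print length($text), " characters: $text\n";
```

After the decode, length() and regexes work on characters rather than bytes.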


However it's not working.

Does that mean that the encoding of the actual characters on the page is
not in the charset in the meta tag? 

Quite possibly - do you mean the chars in the headers or the body?

Or am I missing some piece of the
puzzle?

A random example page would be 
http://www.reitsport-schill.de/index1053542873.html

This page is in German and *says* the charset is ISO-8859-1. However the
characters with the umlauts are displaying as unknown chars in a page
tagged as utf8.
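
One way to check whether the bytes match the declared charset is to test
whether they decode as strict UTF-8. This is only a sketch: looks_like_utf8
is a hypothetical helper, not part of Encode, and the sample strings are
made up. Pure ASCII always passes, and Latin-1 text can occasionally be
valid UTF-8 by coincidence, so treat the result as a hint.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper: true if the byte string is well-formed UTF-8.
sub looks_like_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;   # decode() with a CHECK value may modify its argument
    return eval { decode('UTF-8', $copy, FB_CROAK); 1 } ? 1 : 0;
}

my $latin1_bytes = "M\xFCnchen";       # u-umlaut as ISO-8859-1
my $utf8_bytes   = "M\xC3\xBCnchen";   # u-umlaut as UTF-8

print looks_like_utf8($latin1_bytes) ? "utf8" : "not utf8", "\n";
print looks_like_utf8($utf8_bytes)   ? "utf8" : "not utf8", "\n";
```

If ISO-8859-1 bytes reach a page tagged as utf8 without being decoded and
re-encoded, the umlaut bytes are malformed UTF-8, which would match the
unknown chars described above.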

