perl-unicode

CGI and UTF

2002-11-20 09:30:05
I'm having some problems with XML/UTF-8 and CGI variables in Perl 5.6.1.
 
I have attached an example of the problem; you will need to have
XML::Simple installed to run it. An example input string is Descripción.
 
The example takes an input string and prints it twice - once
concatenated with an XML string, and once on its own. The mangling
occurs only when you concatenate an XML string with a CGI string.
 
I'm not sure why this happens, but here is a first attempt at a possible
theory. All XML parsing is done in UTF-8, but Perl has no idea of the
encoding of incoming CGI streams and assumes them to be ISO-8859-1
(Latin-1) - I read this somewhere and don't know if it's correct. String
operations upgrade non-UTF-8 strings to UTF-8, so Perl tries to convert
the CGI string from ISO-8859-1 to UTF-8, thus mangling it, as it is
already UTF-8.
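
The theory above can be checked with a small sketch. It assumes Perl 5.8
or later, because it relies on the Encode module, which is not part of
5.6.1. The sample strings and variable names are made up for
illustration, and the attached testUTF8.pl is not reproduced here.

```perl
use strict;
use warnings;
use Encode qw(decode encode);   # assumption: Perl 5.8+, Encode is not in 5.6.1

# Raw bytes as a UTF-8 form submission would deliver them ("Descripción").
# Perl sees a plain byte string here, with no knowledge of the encoding.
my $cgi_bytes = "Descripci\xC3\xB3n";

# A decoded character string, standing in for text returned by an XML parser.
my $xml_text = decode('UTF-8', "T\xC3\xADtulo: ");

# Concatenating upgrades $cgi_bytes as if each byte were a Latin-1 character,
# so the two bytes C3 B3 become two separate characters and are encoded twice.
my $mangled = encode('UTF-8', $xml_text . $cgi_bytes);

# Decoding the CGI bytes first turns C3 B3 into the single character U+00F3.
my $fixed = encode('UTF-8', $xml_text . decode('UTF-8', $cgi_bytes));

# Print both results as hex so the difference in bytes is visible.
print "mangled: ", unpack('H*', $mangled), "\n";
print "fixed:   ", unpack('H*', $fixed),   "\n";
```

In the mangled result the "ó" appears as the four bytes C3 83 C2 B3
instead of the two bytes C3 B3, which is consistent with the CGI value
being treated as Latin-1 and converted to UTF-8 a second time.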
 
Can anyone point me in the right direction, explain where I'm going
wrong, and maybe provide some useful links? There seems to be very
little information on building internationalised web pages with UTF-8
and Perl 5.6.1.
 
Thanks
 
Mark
 

Attachment: testUTF8.pl
Description: Binary data
