perl-unicode

Unicode characters

2009-05-22 12:56:38
Hi Perl Gurus,

I am using the functions decode_entities() and decode_utf8() to decode HTML
entities and UTF-8 (Latin characters) respectively (decode_utf8() comes from
the Encode module, decode_entities() from HTML::Entities).
These functions work as expected for character codes up to decimal 255, but
behave differently above that.
This is the URL I used for the list of HTML codes and Latin
characters: [http://www.ascii.cl/htmlcodes.htm].
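
For reference, a minimal sketch of the decode_utf8() step described above, using only the core Encode module (the decode_entities() step is left out). The sample bytes are hypothetical and are not taken from the attached script. It assumes the input is a UTF-8 byte string, and that the output handle needs an explicit encoding layer; without one, Perl writes strings limited to 0-255 as single bytes and warns about "Wide character" for anything above 255, which may or may not be what happens in the script.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

# Hypothetical sample: the UTF-8 bytes for U+00C0 (below 256)
# and U+0152 (above 255).
my $bytes = "\xC3\x80\xC5\x92";

# decode_utf8() turns the byte string into a string of Perl characters.
my $text = decode_utf8($bytes);

# Declare the output encoding so that every character, including those
# above 255, is written as UTF-8.
binmode(STDOUT, ':encoding(UTF-8)');

# Print each character together with its code point.
printf "%s U+%04X\n", $_, ord($_) for split //, $text;
```

Here the four input bytes become two characters, one on each side of the 255 boundary, and both go through the same output layer.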

I have attached a sample script.
Its input values come from an XML SOAP response that I need to decode
(the SOAP response does not give the HTML numbers or HTML codes shown
in the list at the above URL).

The script gives me what I expect for the array values arr_val[0] to
arr_val[4] (i.e. character codes in the decimal range 0-255),
but for arr_val[5] (which has character codes greater than 255) the decoded
values are different.

Below are the array values and their expected decoded values. The
decoding fails for arr_val[5].
I also need to do the reverse, i.e. encode.

$arr_val[0] = '!"#$%&'()*+,-./   0123456789:;<=>?' ;
             expected decoded values -- !"#$%&'()*+,-./ 0123456789:;<=>?

$arr_val[1] =
'@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~' ;
             expected decoded values --
@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~

$arr_val[2] =
'Ã~@Ã~AÃ~BÃ~CÃ~DÃ~EÃ~FÃ~GÃ~HÃ~IÃ~JÃ~KÃ~LÃ~MÃ~NÃ~OÃ~PÃ~QÃ~RÃ~SÃ~TÃ~UÃ~VÃ~WÃ~XÃ~YÃ~ZÃ~[Ã~\Ã~]Ã~^Ã~_'
;
             expected decoded values -- ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞß

$arr_val[3] = 'Ã|
áâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ' ;
             expected decoded values -- àáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ

$arr_val[4] =
'¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿' ;
             expected decoded values -- ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿

$arr_val[5] = 'others Å~RÅ~SÅ|
šŸÆ~Râ~@~Sâ~@~Tâ~@~Xâ~@~Yâ~@~Zâ~@~\â~@~]â~@~^â~@| 
â~@¡â~@¢â~@¦' ;
             expected decoded values -- others ŒœŠšŸƒ–—‘’‚“”„†‡•…‰€™
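
For the encoding direction mentioned above, a minimal round-trip sketch with the core Encode module. The sample bytes are hypothetical (three characters from the expected list, all above 255) and the variable names are illustrative only; it assumes the data is UTF-8 in both directions.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8);

# Hypothetical sample: UTF-8 bytes for U+0152, U+2013 and U+20AC.
my $bytes = "\xC5\x92\xE2\x80\x93\xE2\x82\xAC";

my $chars     = decode_utf8($bytes);   # bytes -> 3 Perl characters
my $roundtrip = encode_utf8($chars);   # characters -> the same 8 bytes

printf "characters: %d, bytes: %d, round trip %s\n",
    length($chars), length($roundtrip),
    ($roundtrip eq $bytes ? 'ok' : 'differs');
```

If the round trip reports "differs" for real data, the input was probably not plain UTF-8 to begin with (for example, it may have been encoded twice).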

Could you please help me understand what I am missing or doing wrong?
I would greatly appreciate the help.

Thanks
Saravanan Balaji.

Attachment: spl_char.txt
Description: Text document
