perl-unicode

Converting string to UTF-16LE

2004-02-25 11:30:06
Hello,

I use a Perl script to search several files. The search value comes from
an HTML page, and the results are displayed on the same page. The files are
saved as UTF-16LE, so I open them with the following open call:

    open(F, "<:raw:encoding(UTF-16LE)", $file) || die "Cannot read $file: $!\n";

This works fine, and the data is read correctly after opening.

The search value is entered on the HTML page; the resulting URL looks like
this:

    http://10.0.5.62/search.pl?value=73,98,97,241,101,122

The numbers are the character codes of the search value; the Perl script
turns them back into a string:

    # Rebuild a string from a comma-separated list of character codes.
    sub decodeString {
        my $sInput = shift;
        my $sOutput = "";
        my @arrChars = split(/,/, $sInput);
        foreach ( @arrChars )
        {
            my $iCharCode = ($_)*1;    # force numeric context
            $sOutput .= chr($iCharCode);
        }
        return $sOutput;
    }
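
As a quick check of the helper above, the following sketch feeds it the value
from the example URL and prints the resulting code points. The compact
map/join body, the strict/warnings lines and the %vd inspection are additions
for illustration only, not part of the original script.

```perl
use strict;
use warnings;

# Same logic as decodeString above, repeated so the snippet runs on its own.
sub decodeString {
    my $sInput = shift;
    return join '', map { chr($_ + 0) } split /,/, $sInput;
}

my $sValue = decodeString('73,98,97,241,101,122');

# %vd prints the ordinal of every character, joined by dots.
printf "%vd\n", $sValue;        # 73.98.97.241.101.122
print length($sValue), "\n";    # 6
```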

For this example the search value is "Iba\xF1ez". Because the search is
case-insensitive, all letters are uppercased with the uc function. But uc
returns different strings for the search value and for the line read from
the UTF-16LE file:

    $sValue = uc($sValue);        # $sValue is IBA\xF1EZ after uc
    $sLine = uc($sLine);            # $sLine is IBA\xD1EZ after uc
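
The difference can be reproduced without the file or the web page. The sketch
below rests on an assumption that is not established anywhere in this post:
that the cause is Perl's internal string representation. A string built with
chr() from values below 256 is stored as plain bytes, while a line read
through an :encoding layer is stored internally as UTF-8, and uc may treat the
two differently unless a locale or the unicode_strings feature is in effect.
utf8::upgrade is used here only to imitate the string coming from the file,
and as one possible workaround.

```perl
use strict;
use warnings;

# Search value as built from the URL character codes: stored as plain bytes.
my $sValue = join '', map { chr($_) } split /,/, '73,98,97,241,101,122';

# Stand-in for the line read from the UTF-16LE file: same characters,
# but upgraded to Perl's internal UTF-8 representation.
my $sLine = $sValue;
utf8::upgrade($sLine);

# %vX prints each character's code point in hex, joined by dots.
printf "value: %vX\n", uc($sValue);
printf "line:  %vX\n", uc($sLine);

# Possible workaround (an assumption, see above): upgrade the search
# value as well, so both strings get the same case-mapping rules.
utf8::upgrade($sValue);
print uc($sValue) eq uc($sLine) ? "match\n" : "no match\n";
```

With both strings upgraded, uc maps \xF1 to \xD1 in each, so the comparison
succeeds.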

So the search does not find the value, although it should!

I experimented a lot with the decode and encode functions, but with no
success. Either the returned string isn't valid or the result of uc is
unchanged.

Can anybody tell me how to work with UTF-8 and UTF-16 in the same script? Any
help would be greatly appreciated.

Thanks in advance,

Sebastian

