perl-unicode

Re: UTF8 behavior under -T (Taint) mode

2004-01-01 06:30:04
Thanks for replying, Dan-san.

At 18:09 04/01/01 +0900, Dan wrote:
> > It seems that utf8::decode() does not work for
> > any tainted variables under the -T (Taint) mode.
>
> What drove you to such a conclusion?  It does work.  Try something like
>
>   perl -T -le 'utf8::decode($ARGV[0])' something
>
> and see it for yourself.  Did perl die with "Insecure ..." message?

Sorry, no. The case I would like to report is not fatal:
Perl does not die, but it treats the tainted value as a
non-UTF8 string.

My sample code (test.pl) is below:
-------------------------------------------------
utf8::decode(my $text0 = "\x{3042}"  ); # clean
utf8::decode(my $arg   = $ARGV[0]    ); # tainted
utf8::decode(my $text1 = "$arg$text0"); # tainted
utf8::decode(my $text2 = "$text0$arg"); # tainted

print length($text1), "\n";
print length($text2), "\n";
-------------------------------------------------

When I run this code with 'perl -T test.pl a', the result is:

4
2

and when I run this code with 'perl test.pl a', the result is:

2
2

So I guess $text1 was not treated as a UTF8 string under
taint mode.
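
A minimal sketch for checking that guess, assuming Scalar::Util (bundled with perl 5.8) is available. It prints the taint status, the internal UTF8 flag, and the length of each variable, so the two runs (with and without -T) can be compared side by side. The helper name describe() is hypothetical and not part of the original test.pl; the fallback value 'a' is only there so the script also runs without an argument.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper: summarize one value as a single report line.
sub describe {
    my ($label, $value) = @_;
    return sprintf "%s: tainted=%d utf8=%d length=%d",
        $label,
        tainted($value)       ? 1 : 0,   # true only under -T for tainted data
        utf8::is_utf8($value) ? 1 : 0,   # internal UTF8 flag of the scalar
        length $value;
}

# Same construction as test.pl, with a default when no argument is given.
utf8::decode(my $text0 = "\x{3042}");
utf8::decode(my $arg   = defined $ARGV[0] ? $ARGV[0] : 'a');
utf8::decode(my $text1 = "$arg$text0");
utf8::decode(my $text2 = "$text0$arg");

print describe('text0', $text0), "\n";
print describe('arg',   $arg),   "\n";
print describe('text1', $text1), "\n";
print describe('text2', $text2), "\n";
```

Running it as 'perl -T check.pl a' and as 'perl check.pl a' (the file name is arbitrary) should show whether the utf8 column differs for text1 between the two modes.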

(My system is perl5.8.1 MSWin32-X86-multi-thread)

I would like to know the reason for this behavior.

Attachment: test.pl
Description: Binary data

-- 
Masanori HATA
<lovewing(_at_)dream(_dot_)big(_dot_)or(_dot_)jp>
He's always with us!