perl-unicode

Warnings on illegal UTF8

1998-10-07 06:25:52
With 5.005_52 plus Sarathy's must-apply patch, I get

  % ./perl -Ilib -wle '
  use utf8;
  while (my $s = shift @ARGV){
   print "s[$s]";
   print length $s;
  }
  ' L\xFCbeckerstra\xDFe
  s[L\xFCbeckerstra\xDFe]
  8

(In case some mail-handling software strips the 8th bit: my @ARGV is a
single Latin-1 word, namely "Luebeckerstrasse" spelt properly in German,
i.e. with u-umlaut and sharp s.)

This looks like two bugs to me: no warning about bad UTF-8 and a wrong
computation of the length of the string.
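
One possible explanation for the 8, offered only as a guess: if length
counts characters by trusting each lead byte and hopping over the number
of bytes it announces, without checking the continuation bytes, then
0xFC claims a 6-byte sequence and 0xDF a 2-byte one. The sketch below
reproduces that arithmetic on the 14-byte argument. skip_for and
naive_length are hypothetical helper names, not perl internals, and
treating 0xFE/0xFF as single bytes is a simplification of mine.

```perl
use strict;
use warnings;

# Sequence length announced by a UTF-8 lead byte (original 1-to-6-byte scheme).
sub skip_for {
    my ($byte) = @_;
    return 1 if $byte < 0xC0;   # ASCII, or a stray continuation byte
    return 2 if $byte < 0xE0;
    return 3 if $byte < 0xF0;
    return 4 if $byte < 0xF8;
    return 5 if $byte < 0xFC;
    return 6 if $byte < 0xFE;
    return 1;                   # 0xFE/0xFF: assumed to count as one byte here
}

# Count "characters" by hopping from lead byte to lead byte, with no validation.
sub naive_length {
    my ($bytes) = @_;
    my ($pos, $count) = (0, 0);
    while ($pos < length $bytes) {
        $pos += skip_for(ord substr($bytes, $pos, 1));
        $count++;
    }
    return $count;
}

print naive_length("L\xFCbeckerstra\xDFe"), "\n";   # prints 8
```

The hops are: "L" (1 byte), 0xFC plus "becke" (6 bytes), "r", "s", "t",
"r", "a" (1 byte each), 0xDF plus "e" (2 bytes), which makes 8.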

In general I'd like to ask: what's considered the politically correct
way to check whether a string contains legal UTF-8?
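
For illustration, a minimal sketch of such a check. It assumes a perl
that ships the Encode module, which 5.005_52 does not, so this is not an
answer for the version above. is_legal_utf8 is a hypothetical helper
name; Encode::decode with the strict 'UTF-8' encoding and FB_CROAK dies
on malformed input, and the eval turns that into a boolean.

```perl
use strict;
use warnings;
use Encode ();

# Returns 1 if the byte string is well-formed UTF-8, 0 otherwise.
sub is_legal_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;   # decode may modify its argument when CHECK is set
    my $ok = eval {
        Encode::decode('UTF-8', $copy, Encode::FB_CROAK);
        1;
    };
    return $ok ? 1 : 0;
}

# The Latin-1 bytes from the example, then the same word encoded as UTF-8.
for my $s ("L\xFCbeckerstra\xDFe", "L\xC3\xBCbeckerstra\xC3\x9Fe") {
    print is_legal_utf8($s) ? "legal" : "illegal", "\n";
}
```

The first string is rejected and the second accepted.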

Thanks,
andreas
