perl-unicode

Re: PS (Malformed UTF-8 character)

2003-11-03 15:30:05
Thanks to all those who replied to my query about finding out the encoding of a text that *should* have been in 7-bit ASCII.

In the end, I dealt with the problem as follows: I captured the STDERR output of the script that produced the "Malformed UTF-8 character" warnings and ran a regexp over it to identify the lines that triggered the warnings. A colleague and I then fixed all of those lines with a mix of hexdumping, text editing, choosing the right symbol from context, and simple scripts.
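
For anyone facing the same clean-up, here is a rough sketch of a related but different approach: instead of parsing the warnings, it scans the raw bytes directly and reports every byte above 0x7F, which by definition cannot occur in 7-bit ASCII. It is written in Python purely for illustration; the function name `find_non_ascii` and the sample data are made up for this example and are not the scripts we actually used.

```python
def find_non_ascii(data: bytes):
    """Yield (line_number, column, byte_value) for every byte above 0x7F.

    Line numbers and columns are 1-based; columns count bytes, not characters.
    """
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte > 0x7F:  # outside the 7-bit ASCII range
                yield line_no, col, byte


if __name__ == "__main__":
    # Hypothetical sample: line 2 has a Latin-1 e-acute, line 3 a UTF-8 i-diaeresis.
    sample = b"plain line\ncaf\xe9 au lait\nna\xc3\xafve\n"
    for line_no, col, byte in find_non_ascii(sample):
        print(f"line {line_no}, column {col}: 0x{byte:02X}")
```

To run it on a real file, read the file in binary mode (`open(path, "rb").read()`) so that nothing is decoded before the scan. The reported hex values are only a starting point: deciding which symbol was intended still has to be done from context, as described above.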

Regards,

Marco


---
Marco Baroni
University of Bologna
http://sslmit.unibo.it/~baroni
