perl-unicode

Help with a regex

2002-10-07 08:30:04
Hello,

I have a file that was saved in UTF-16, which then got converted to
a non-Unicode encoding and lost several Unicode characters in the process
(en space, thin space, etc.). I am now working with a previous version of
this file which is still in UTF-16, and I need to search it for all of
the characters which would have been mangled by saving in the
non-Unicode format.

I'm pretty sure the regex should look something like:

 if ($line =~ /[\x{00FF}-\x{FFFF}]/) {
   # do stuff
 }

But I don't know enough about the hex representation of Unicode to know
what exactly the regex should be.
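
For reference, Perl writes a range of code points as a bracketed character class, with bare hex digits inside \x{...} (no 0x prefix). The following is only a minimal sketch: it assumes the lossy format was Latin-1, so that everything above U+00FF is at risk, and the helper name at_risk_chars is made up for illustration. The range stops at U+FFFF, so it covers the Basic Multilingual Plane only.

```perl
use strict;
use warnings;

# Assumption: the non-Unicode format was Latin-1 (ISO-8859-1), so any
# code point above U+00FF could not survive. Adjust the lower bound
# for a different target encoding.
my $lossy = qr/[\x{0100}-\x{FFFF}]/;

# Hypothetical helper: returns the distinct at-risk characters found in
# an already-decoded string, as sorted "U+XXXX" labels.
sub at_risk_chars {
    my ($text) = @_;
    my %seen;
    $seen{ sprintf "U+%04X", ord $1 } = 1 while $text =~ /($lossy)/g;
    return sort keys %seen;
}

# En space (U+2002) and thin space (U+2009) are reported; e-acute
# (U+00E9) fits in Latin-1 and is not.
print join(", ", at_risk_chars("a\x{2002}b\x{2009}c\x{E9}")), "\n";
```

The regex only sees characters if the file is decoded on input. With Perl 5.8 or later, one way to do that is to open the file with an encoding layer, for example open(my $fh, '<:encoding(UTF-16)', $path), and then pass each line to the helper.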

Thanks in advance,

 -dave g

  • Help with a regex, David Gray <=