perl-unicode

Unicode filenames on Windows with Perl >= 5.8.2

2004-06-18 01:30:08
Hi,

I'm trying to figure out if I can handle Unicode filenames on Windows 
using Perl 5.8.4, and if so, how.

I've read a couple of other threads on this list about this issue, but 
as far as I can see no firm conclusion has yet been reached on what 
might be done with regard to putting back (in a saner way) the 
functionality that was provided by the -C option prior to 5.8.1.

I am, of course, interested to hear the latest thinking, if any has 
been done about this issue recently, but what I really want to know is 
what, if anything, I can do /now/, using Perl 5.8.4.

I'm running on Windows XP (English language setup), and I have a 
directory full of files with all sorts of non-cp1252 characters in their 
names.  Windows Explorer displays them all very nicely (see input.png 
attached).

But when I use readdir() to list them, I find that each of the 
non-cp1252 characters is replaced with a "?", so then, of course, I 
can't do anything with the filenames returned (such as open them).  The 
attached output.png shows the output of the following program (run in a 
Command Prompt with the code page changed from cp850 to cp1252):

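use strict;
use warnings;

# List the entries in the cwd whose names start with two digits and a
# space, then check whether each returned name refers to a real file.
opendir my $dh, '.' or die "Can't read cwd: $!";
my @filenames = grep { /^\d\d / } readdir $dh;
closedir $dh;
foreach my $filename (@filenames) {
  printf "[%6s] %s\n", (-f $filename ? 'OK' : 'NOT OK'), $filename;
}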
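
Since "?" is not a legal character in a Windows filename, any "?" in a 
name returned by readdir() must be a substitution rather than part of 
the real name.  A minimal sketch of a check built on that assumption 
(looks_mangled() is just an illustrative helper name, not something from 
any module):

```perl
use strict;
use warnings;

# Hypothetical helper: true if the name contains a "?", which cannot
# occur in a genuine Windows filename and so marks a lossy conversion.
sub looks_mangled {
    my ($filename) = @_;
    return index($filename, '?') >= 0 ? 1 : 0;
}

opendir my $dh, '.' or die "Can't read cwd: $!";
my @filenames = readdir $dh;
closedir $dh;

foreach my $filename (@filenames) {
    next unless looks_mangled($filename);
    # Show the bytes actually returned, so a literal 0x3F is visible.
    my $hex = join ' ', map { sprintf '%02X', ord $_ } split //, $filename;
    print "mangled: $filename ($hex)\n";
}
```

This only detects the affected names; it does not recover the original 
characters.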

So my question is: How can I deal with these files?

I've tried using Perl scalars containing UTF-8, UTF-16LE and UTF-16BE 
encodings of the filenames, but none of them work either.  Indeed, if I 
try to write a new file with a name constructed in those ways, then the 
name of the file actually created is simply the sequence of bytes that 
make up those encodings.
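
For reference, the encoded names can be built and inspected along these 
lines.  This is only a sketch using the Encode module: the sample name 
(with U+0436 in it) and the hex_bytes() helper are made up for 
illustration, and it prints the bytes rather than creating any files:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord $_ } split //, $bytes;
}

# Illustrative name containing a non-cp1252 character (Cyrillic U+0436).
my $name = "01 \x{0436}.txt";

foreach my $encoding (qw(UTF-8 UTF-16LE UTF-16BE)) {
    # encode() returns the octets for $name in the given encoding.
    my $bytes = encode($encoding, $name);
    printf "%-8s %s\n", $encoding, hex_bytes($bytes);
}
```

Passing any of these byte strings to open() gives the behaviour described 
above: the bytes themselves are used as the name.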

- Steve


------------------------------------------------
Radan Computational Ltd.


[Attachments: input.png, output.png]