perl-unicode

RE: Perl and unicode file names

2005-02-24 09:15:45
On Thu, 24 Feb 2005, Ed Batutis wrote:
> > So the problem I have is how to proceed. Should I give up with
> > Perl and use Java or C? Any suggestions gratefully received.
>
> I started a really 'fun' flame war on this topic several months ago,
> so I hesitate to say anything more. But, yes, you should give up on
> Perl - or run your script on Linux with a utf-8 locale. On Win32, Perl
> internals are converting the filename characters to the system default
> code page. So, you are SOL for what you are trying to do.

Actually, you *can* work around the problems on Windows by using the
Win32API::File and Encode modules.  Here is a sample program
Gisle came up with:

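#!perl -w

use strict;
use Fcntl qw(O_RDONLY);

use Win32API::File qw(CreateFileW OsFHandleOpenFd :FILE_ OPEN_EXISTING);
use Encode qw(encode);

binmode(STDOUT, ":utf8");

# The wide-character API expects a NUL-terminated UTF-16LE file name.
my $h = CreateFileW(encode("UTF-16LE", "\x{2030}.txt\0"), FILE_READ_DATA,
                    0, [], OPEN_EXISTING, 0, [])
    or die "CreateFileW failed: $^E";

# Wrap the native Win32 handle in a C runtime file descriptor ...
my $fd = OsFHandleOpenFd($h, O_RDONLY);
die "OsFHandleOpenFd failed" if $fd < 0;

# ... and attach a Perl filehandle to that descriptor.
open(my $fh, "<&=$fd") or die "Cannot open descriptor $fd: $!";

# This example assumes the file's contents are UTF-16LE as well.
binmode($fh, ":encoding(UTF-16LE)");
while (<$fh>) {
    print $_;
}
close($fh) or die "close failed: $!";
__END__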
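
To make the file name argument above a little more concrete, here is a
small sketch that runs on any platform and only shows what the encode()
call hands to CreateFileW.  The helper name wide_name() is made up for
this illustration; it is not part of Win32API::File or Encode.

```perl
#!perl -w
use strict;
use Encode qw(encode);

# Hypothetical helper (not part of any module): build the
# NUL-terminated UTF-16LE byte string that a wide-character ("W")
# Win32 call expects for a file name.
sub wide_name {
    my ($name) = @_;
    return encode("UTF-16LE", $name . "\0");
}

my $bytes = wide_name("\x{2030}.txt");

# Show the result as 16-bit little-endian code units in hex.
print join(" ", map { sprintf "%04x", $_ } unpack("v*", $bytes)), "\n";
# prints: 2030 002e 0074 0078 0074 0000
```

The trailing 0000 is the terminator the "\0" in the sample program
supplies; the wide-character API reads the name up to that point.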

It may be possible to do similar readdir() emulation as well.

Win32API::File is part of libwin32 and already included in ActivePerl.

Cheers,
-Jan


