perl-unicode

detecting and opening unicode files?

1999-08-04 13:53:46
Hello,
I'm not sure this is the right forum, but I haven't found a better
place to ask this question, so here goes:

I have a Perl script running on a Windows NT 4.0 system that calculates
file checksums across a directory tree. Some of the files and
directories in the tree have Unicode names (e.g. Japanese), and when my
script encounters them it can neither stat nor open them. I've been
searching the web, but I can't work out how to open a file with a
Unicode name. (I don't care about the contents, since I'm just doing a
byte checksum.) My script is shown below.

Any help would be greatly appreciated!

Thank you,

Tom Shou

--------------------------------------------------------------------------
$level = 0;
$opt_d = '.';
$globalsum = 0;
$openchar = "<";    # read-only is enough for computing a checksum
$writechar = "+>";
$count = 0;

&dodir($opt_d, $level);

sub dodir {
        local($dir, $level) = @_;

        chdir $dir;
        
        opendir(DIR, '.') || die "Can't open $dir";
        local(@filenames) = readdir(DIR);
        closedir(DIR);

        foreach $filename (@filenames) {
                next if $filename eq '.';
                next if $filename eq '..';
                        
                stat($filename);
#
# When I encounter a unicode file, stat returns nothing so the code drops
# down to the open call and can't open the file. The file name is unicode
# so it displays as $$$$$$... on my NT system.
#
                if (-d _) {
                        # "or" binds loosely, so die fires when chdir fails
                        chdir $filename or die "Can't cd to $filename";
                        $level++;
                        &dodir($filename, $level);
                        chdir '..';
                        $level--;
                }
                else {
#
# Would something like this work?:
#
                  if (is_unicode()) { # what should is_unicode() look like?
                        # check if unicode file or directory
                        #
                        # 
                        #
                        # If it's a unicode directory or file then do the
                        # checksum calculation...
                        #
                   }
                   else { # is a normal file

                        if (! open(MYFILE, "$openchar$filename") ) {
                                print "Can't open $dir\\$filename, level=$level\n";
                                next;
                        }
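                        binmode(MYFILE);  # raw bytes: no CRLF translation on NT
                        undef $/;         # slurp the whole file in one read
                        $csum = unpack ("%32C*", <MYFILE>) & 32767;
                        close(MYFILE);
                        $globalsum += $csum;
                        $count++;         # count each file that was checksummed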
                   }

                }
        }
}

done:
print "num files = $count, sum of checksums = $globalsum\n";

-----------------------------------------------------------------------------
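
For reference, the per-file byte checksum used in the script above can be
sketched as a standalone routine. This is only a sketch: it assumes Perl 5.6
or later (lexical filehandles and three-argument open), and checksum_bytes
and checksum_file are hypothetical helper names that do not appear in the
original script. It does not address the Unicode file name problem itself.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): sum the byte values
# of a string and keep the low 15 bits, using the same unpack/mask
# expression as the script.
sub checksum_bytes {
    my ($data) = @_;
    return unpack("%32C*", $data) & 32767;
}

# Hypothetical helper: read a file as raw bytes and checksum it.
# Returns undef if the file cannot be opened.
sub checksum_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or return;
    binmode($fh);    # no CRLF translation or Ctrl-Z end-of-file on Windows
    local $/;        # slurp mode, restored automatically on scope exit
    my $data = <$fh>;
    close($fh);
    return checksum_bytes(defined $data ? $data : '');
}

print checksum_bytes("abc"), "\n";    # prints 294 (97 + 98 + 99)
```

Using "local $/" instead of "undef $/" keeps the slurp setting from leaking
into the rest of the program, and returning undef on a failed open lets the
caller decide whether to skip the file or report it.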


-- 
____________________________________________________________________
Tom Shou | shou(_at_)engr(_dot_)sgi(_dot_)com | SGI | 650.933.5362 | 
650.932.0687 fax

http://reality.sgi.com/shou_engr
