perl-unicode

[Encode] 8.3 rules sucks! check83.pl is obsolete!

2002-03-25 00:06:42
Encode Hackers,

I am now in the middle of linting filenames. I have found check83.pl terribly obsolete.

> perl ../perl/Porting/check83.pl

Yields nothing, even though a filename like "7bit-greek.enc" is an obvious violation. So I resorted to writing my own version of check83.pl, chkmani.pl. I have included the script right after my sig. Here is the outcome.

> perl t/chkmani.pl
Encode/7bit-greek.enc : "7bit-greek.enc" is non-8.3-compliant.
Encode/7bit-kana.enc : "7bit-kana.enc" is non-8.3-compliant.
Encode/7bit-latin1.enc : "7bit-latin1.enc" is non-8.3-compliant.
Encode/big5-hkscs.enc : "big5-hkscs.enc" is non-8.3-compliant.
Encode/iso-ir-165.enc : "iso-ir-165.enc" is non-8.3-compliant.
Encode/macCentEuro.enc : "macCentEuro.enc" is non-8.3-compliant.
Encode/macCroatian.enc : "macCroatian.enc" is non-8.3-compliant.
Encode/macCyrillic.enc : "macCyrillic.enc" is non-8.3-compliant.
Encode/macDingbats.enc : "macDingbats.enc" is non-8.3-compliant.
Encode/macIceland.enc : "macIceland.enc" is non-8.3-compliant.
Encode/macRumanian.enc : "macRumanian.enc" is non-8.3-compliant.
Encode/macTurkish.enc : "macTurkish.enc" is non-8.3-compliant.
Encode/macUkraine.enc : "macUkraine.enc" is non-8.3-compliant.
encengine.c : "encengine.c" is non-8.3-compliant.
lib/Encode/Supported.pod : "Supported.pod" is non-8.3-compliant.
lib/Encode/iso10646_1.pm : "iso10646_1.pm" is non-8.3-compliant.
lib/Encode/EncFormat.pod : "EncFormat.pod" is non-8.3-compliant.
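
To make the rule concrete, the 8.3 test behind the messages above comes down to a single pattern match. What follows is only a minimal sketch, not the script itself: is_83 is a hypothetical helper name, and 7bitgrk.enc is a made-up shortened name used purely for illustration.

```perl
#!/usr/bin/perl
use strict;

# is_83 is a hypothetical helper; it is not part of check83.pl or chkmani.pl.
# It returns 1 when the name is 1-8 portable characters, optionally followed
# by a dot and a 1-3 character extension, and 0 otherwise.
sub is_83 {
    my $file = shift;
    return $file =~ /^[A-Za-z0-9_-]{1,8}(?:\.[A-Za-z0-9_-]{1,3})?\z/ ? 1 : 0;
}

# The first two names come from the listing above; 7bitgrk.enc is invented.
for my $name (qw(7bit-greek.enc encengine.c Encode.pm 7bitgrk.enc)) {
    printf "%-16s %s\n", $name, is_83($name) ? "ok" : "non-8.3-compliant";
}
```

The base name is what trips most of the files listed: "7bit-greek" is ten characters and "encengine" is nine, so both fail regardless of the extension.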

*.pm and *.pod are easy to fix, but *.enc is tough because Encode::Tcl and compile faithfully generate the canonical encoding name from the filename. For the time being I will fix *.pod and *.pm, but I am stuck on *.enc. Tell me what you guys think.

Dan the Man too Big to Fit in 8.3

#!/usr/bin/perl
# usage: perl chkmani.pl
require 5.00503;
use strict;
my %Manifest;
open MANIFEST, "MANIFEST" or die "MANIFEST:$!";
while(<MANIFEST>){
    # keep only the path; drop anything after the first whitespace
    chomp; s/\s+.*//o;
    # existence
    -f $_ or warn qq($_ : nonexistent\n);
    # case insensitivity
    if (exists $Manifest{my $lc = lc($_)}){
        warn qq($_ : conflicts with "$Manifest{$lc}"\n);
    }else{
        $Manifest{$lc} = $_;
    }
    my @dir  = split(m{/}, $_);
    my $file = pop @dir;
    # directory;
    for my $dir (@dir){
        $dir =~ /\./o and warn qq($_ : "$dir" contains a dot.\n);
    }
    # file
    # collect every non-portable character, wherever it occurs in the name
    my $badchar = '';
    while ($file =~ /([^A-Za-z0-9_.-])/g){ $badchar .= $1 }
    length($badchar) and
        warn qq($_ : "$file" contains non-portable chars "$badchar"\n);
    # 1-8 chars, optionally followed by a dot and 1-3 chars;
    # test the match itself rather than a possibly stale $1
    $file =~ /^[A-Za-z0-9_-]{1,8}(\.[A-Za-z0-9_-]{1,3})?$/o or
        warn qq($_ : "$file" is non-8.3-compliant.\n);
}
__END__