hi
Tadamasa Teranishi wrote:
I have made a program, nmzchkw.pl, that easily checks the NMZ.w and
NMZ.wi files.
Please move to the directory with the index and execute it.
$ cd indexdir
$ perl nmzchkw.pl
thanks for the checker.
If it prints "All check passed.", there are no problems
in NMZ.w and NMZ.wi.
unfortunately the check fails for the said list.
it is rather like this:
<snip>
==============================
check 1
==============================
nul : 0
control : 1
cr : 0
0x80 - 0xff : 2573
ok
==============================
check 2
==============================
lf : 374172
NMZ.w: words : 374172
NMZ.wi: words : 374172
ok
==============================
check 3
==============================
152027: 1506819 1506800
152028: 1506871 1506852
[...]
374169: 4514255 4514197
374170: 4514308 4514250
374171: 4514361 4514303
fail !!
==============================
1 check failed.
</snip>
(i have omitted 210000 lines...)
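to get a feeling for the check 3 lines, here is a minimal sketch (python, not part of nmzchkw.pl) that computes the difference between the two numbers on each reported line. what the two columns mean is an assumption on my side (presumably an offset stored in NMZ.wi versus one measured in NMZ.w); the sketch only subtracts them. the helper name offset_drift is made up for illustration.

```python
import re


def offset_drift(report_lines):
    """Return (entry, first, second, first - second) for each 'N: A B' line."""
    rows = []
    for line in report_lines:
        m = re.fullmatch(r"\s*(\d+):\s+(\d+)\s+(\d+)\s*", line)
        if m is None:
            continue  # skip separators, "[...]", "fail !!" and similar lines
        entry, first, second = (int(g) for g in m.groups())
        rows.append((entry, first, second, first - second))
    return rows


# the lines quoted from check 3 above
sample = [
    "152027: 1506819 1506800",
    "152028: 1506871 1506852",
    "[...]",
    "374169: 4514255 4514197",
    "374170: 4514308 4514250",
    "374171: 4514361 4514303",
]

for entry, first, second, drift in offset_drift(sample):
    print(f"{entry}: drift {drift}")
```

for the quoted lines the difference is 19 at the first reported entry and 58 at the last, so the mismatch seems to grow along the file rather than stay constant, which would fit more than one bad spot.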
an inspection of line 152027 in NMZ.w reveals the following:
@œÞ'Ã~HwÃ~O±ïÃ~Zu€iéÃ~QŒş¹Ã~QmÃ| Ã~]sÚ+Ãœ5e")iÃ~P
this line is repeated 4 times.
there are several other occurrences of this phenomenon: keywords which i
cannot read are repeated 4 times; i am pretty sure that these are the
cause of my troubles.
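a rough way to list such repeats, as a sketch only: it assumes NMZ.w holds one keyword per line separated by lf (check 2 suggests this) and that a keyword should not appear on several consecutive lines. the function names repeated_runs and scan_word_file are hypothetical, this is not how nmzchkw.pl works.

```python
from itertools import groupby


def repeated_runs(lines, min_run=2):
    """Yield (first_line_number, run_length, line) for consecutive duplicates."""
    lineno = 1  # 1-based, like the line numbers an editor shows
    for line, group in groupby(lines):
        count = sum(1 for _ in group)
        if count >= min_run:
            yield lineno, count, line
        lineno += count


def scan_word_file(path="NMZ.w"):
    """Read the word file as raw bytes and return its repeated runs."""
    with open(path, "rb") as fh:
        data = fh.read()
    lines = data.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # drop the empty piece after the final lf
    return list(repeated_runs(lines))


# small in-memory demo, no index needed
demo = [b"alpha", b"\x9c\xde'", b"\x9c\xde'", b"\x9c\xde'", b"\x9c\xde'", b"beta"]
for lineno, count, line in repeated_runs(demo):
    print(f"line {lineno}: repeated {count} times: {line!r}")
```

running scan_word_file() in the index directory would then print nothing for a clean file and one entry per repeated keyword otherwise; the file is read as bytes on purpose so unreadable keywords do not cause decoding errors.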
naturally i have no idea where these weird keywords come from, so i
cannot eliminate them in the archive itself; and that would only delay the
problem until a similar word re-appeared on the list anyway.
i'll try again with the patch from your previous posting applied.
mfg.asdr.
IOhannes