On June 11, 2005 at 13:11, Tadamasa Teranishi wrote:
> Perhaps, NMZ.field.subject.i is broken.
> What is the version of Namazu used?
2.0.14.
Also, the "Malformed UTF-8 ..." warnings are popping up, regardless
of what LANG or LC_ALL are set to. I had to add a 'use bytes' pragma
to mailnews.pl at line 212 to get rid of the warnings.
> Please try.
> $ env LC_ALL=C mknmz ...
I have, in a myriad of ways. I just recreated things on one of my
local systems to make analysis easier.
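Trying several locale settings by hand can be put in a small loop. The following is only a sketch: `count_malformed` is a hypothetical helper (not part of Namazu), and the list of locales is an assumption, not the set actually tried here. It runs the given command once per locale and counts the "Malformed UTF-8" lines written to stderr.

```shell
#!/bin/sh
# count_malformed: hypothetical helper, not part of Namazu.
# Runs the given command once per locale and prints
# "<locale> <number of 'Malformed UTF-8' lines on stderr>".
count_malformed() {
    # The locale list is an assumption; adjust to what is installed.
    for loc in C POSIX en_US.UTF-8; do
        # 2>&1 >/dev/null sends stderr into the pipe and drops stdout.
        n=$(env LC_ALL="$loc" LANG="$loc" "$@" 2>&1 >/dev/null \
            | grep -ci 'malformed utf-8' || true)
        printf '%s %s\n' "$loc" "$n"
    done
}

# Example call, with the arguments elided as in the message above:
# count_malformed mknmz ...
```

If the counts are the same for every locale, the warnings probably do not depend on LANG or LC_ALL.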
I've made the command used and the output of a stock namazu 2.0.14
installation available for your examination at
<http://www.mhonarc.org/tmp/mknmz-out.txt.gz>. I.e., no modifications
were made to the namazu code, so the many "malformed utf-8 ..." messages
are included. Perl also complains about wide characters in print.
I've also made available the input files and NMZ.* files at
the following locations:
<http://www.mhonarc.org/tmp/namazu-users-en_NMZ_files.tar.gz>
<http://www.mhonarc.org/tmp/namazu-users-en_input_files.tar.gz>.
The following is version information from mknmz:
mknmz -C
Namazu: 2.0.14
Perl: 5.008006
File-MMagic: 1.20
NKF: /usr/bin/nkf
KAKASI: no
ChaSen: no
Lang_Msg: C
Lang: C
Coding System: euc
CONFDIR: /usr/local/etc/namazu
LIBDIR: /usr/local/share/namazu/pl
FILTERDIR: /usr/local/share/namazu/filter
TEMPLATEDIR: /usr/local/share/namazu/template
Supported media types: (23)
Unsupported media types: (10) marked with minus (-) probably missing application
in your $path.
- application/excel: excel.pl
application/ichitaro5: taro56.pl
application/ichitaro6: taro56.pl
- application/ichitaro7: taro7_10.pl
application/macbinary: macbinary.pl
application/msword: msword.pl
- application/pdf: pdf.pl
application/postscript: postscript.pl
- application/powerpoint: powerpoint.pl
- application/rtf: rtf.pl
application/vnd.sun.xml.calc: ooo.pl
application/vnd.sun.xml.draw: ooo.pl
application/vnd.sun.xml.impress: ooo.pl
application/vnd.sun.xml.writer: ooo.pl
application/x-apache-cache: apachecache.pl
application/x-bzip2: bzip2.pl
application/x-compress: compress.pl
- application/x-deb: deb.pl
- application/x-dvi: dvi.pl
application/x-gzip: gzip.pl
- application/x-js-taro: taro7_10.pl
application/x-rpm: rpm.pl
- application/x-tex: tex.pl
- audio/mpeg: mp3.pl
message/news: mailnews.pl
message/rfc822: mailnews.pl
text/hnf: hnf.pl
text/html: html.pl
text/html; x-type=mhonarc: mhonarc.pl
text/plain
text/plain; x-type=rfc: rfc.pl
text/x-hdml: hdml.pl
text/x-roff: man.pl
The following is the output of doing a search via `namazu' from the
command-line:
namazu -s -n 3 -f cgi-bin/.namazurc '+from:earl' \
~/archive/html/namazu-users-en
Results:
References: [ +from:earl: 49 ]
Total 49 documents matching your query.
1. er things I want hidden (score: 1)
/~listsarc/archive/html/namazu-users-en/2004-09/msg00005.html (8,178 bytes)
2. g indexing (score: 1)
/~listsarc/archive/html/namazu-users-en/2004-05/msg00011.html (7,732 bytes)
3. med UTF-8 character ... (score: 1)
/~listsarc/archive/html/namazu-users-en/2004-05/msg00004.html (8,738 bytes)
Current List: 1 - 3
Notice how the first part of each subject string is clipped. A search
for "PHP" returns no hits, though it should.
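One way to narrow this down might be to look up a clipped fragment in the subject field file itself. The sketch below assumes NMZ.field.subject is a plain-text file with one subject per line; `find_subject` is a hypothetical helper, not part of Namazu. If the stored line is longer than the fragment shown in the results, the text file is likely intact and the offsets in NMZ.field.subject.i are the more likely suspect.

```shell
#!/bin/sh
# find_subject: hypothetical helper, not part of Namazu.
# Assumes NMZ.field.subject holds one subject per line as plain text.
# Prints "<line number>:<full stored subject>" for each line that
# contains the given fragment.
find_subject() {
    index_dir=$1
    fragment=$2
    # -F treats the fragment as a fixed string, not a regex.
    grep -n -F -- "$fragment" "$index_dir/NMZ.field.subject"
}

# Example call, using a fragment from the results above:
# find_subject ~/archive/html/namazu-users-en 'er things I want hidden'
```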
If you require any other information, I will provide it.
Thanks for your help,
--ewh
--
Earl Hood, <earl(_at_)earlhood(_dot_)com>
Web: <http://www.earlhood.com/>
PGP Public Key: <http://www.earlhood.com/gpgpubkey.txt>
_______________________________________________
Namazu-users-en mailing list
Namazu-users-en(_at_)namazu(_dot_)org
http://www.namazu.org/cgi-bin/mailman/listinfo/namazu-users-en