namazu-users-en

RE: huh ... another question

2001-07-03 02:17:29
pptHtml is installed; I installed xlHtml properly. When I run pptHtml from
the command line it works without problems and returns HTML output. But
look at this:

<snip>
[root(_at_)xxx /root]# /usr/bin/mknmz --exclude='/\.' -O /var/lib/namazu/index/ /path/to/files2index/
Looking for indexing files...
No files to index.
[root(_at_)xxx /root]# cp /path/to/files2index/somefile.ppt /path/to/files2index/test.ppt
[root(_at_)xxx /root]# /usr/bin/mknmz --exclude='/\.' -O /var/lib/namazu/index/ /path/to/files2index/
Looking for indexing files...
No files to index.
[root(_at_)xxx /root]# cp /path/to/files2index/somefile.ppt /path/to/files2index/test.doc
[root(_at_)xxx /root]# /usr/bin/mknmz --exclude='/\.' -O /var/lib/namazu/index/ /path/to/files2index/
Looking for indexing files...
1 files are found to be indexed.
wvError: (./wvparse.c:55) Not a word document
 wvError: (./wvWare.c:348) startup error
 1/1 - /path/to/files2index/test.doc [application/msword]
Writing index files...
[Append]
Date:                Tue Jul  3 10:57:40 2001
Added Documents:     1
Size (bytes):        1,826,816
Total Documents:     183
Added Keywords:      1
Total Keywords:      115,707
Wakati:              module_kakasi -ieuc -oeuc -w
Time (sec):          9
File/Sec:            0.11
System:              linux
Perl:                5.006
Namazu:              2.0.5

[root(_at_)bpodoc /root]# pptHtml 
pptHtml - Outputs Power Point files as Html.
Usage: pptHtml <FILE>
[root(_at_)bpodoc /root]# pptHtml /path/to/files2index/test.ppt | tail -n 3
<HR>&nbsp;<br>
<hr><FONT SIZE=-1>Created with <a href="http://www.xlhtml.org/">pptHtml 0.2.8</a></FONT><br>
</BODY></HTML>
[root(_at_)bpodoc /root]# 
</snip>

As you can see:
- pptHtml works fine
- mknmz doesn't try to index ppt files.

Here is the 'mknmz -C' output:

<snip>
[root(_at_)xxx /root]# mknmz -C
Loaded rcfile: /etc/namazu/mknmzrc
System: linux
Namazu: 2.0.5
Perl: 5.006
NKF: module_nkf
KAKASI: module_kakasi -ieuc -oeuc -w
ChaSen: no -j -F '%m '
Wakati: module_kakasi -ieuc -oeuc -w
Lang: en_US
Coding System: euc
CONFDIR: /etc/namazu
LIBDIR: /usr/share/namazu/pl
FILTERDIR: /usr/share/namazu/filter
TEMPLATEDIR: /usr/share/namazu/template
Supported media types: 
  application/excel
  application/msword
  application/pdf
  application/x-bzip2
  application/x-compress
  application/x-gzip
  message/news
  message/rfc822
  text/hnf
  text/html
  text/html; x-type=mhonarc
  text/plain
  text/plain; x-type=rfc
  text/x-roff
[root(_at_)xxx /root]# 
</snip>
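
A quick way to check that listing for one particular type, as a minimal
sketch only: the helper name has_media_type is made up for illustration and
is not part of Namazu, and it assumes nothing beyond the output layout shown
above (a "Supported media types:" header followed by indented entries). It
reads the 'mknmz -C' text on standard input.

```shell
#!/bin/sh
# has_media_type TYPE: hypothetical helper, not part of Namazu.
# Reads `mknmz -C` output on stdin and succeeds (exit 0) only if TYPE
# appears as an indented entry under the "Supported media types:" header.
has_media_type() {
    awk -v want="$1" '
        # The header line opens the list.
        /^Supported media types:/ { in_list = 1; next }
        # Any non-indented, non-empty line (e.g. the next prompt) closes it.
        in_list && /^[^ \t]/      { in_list = 0 }
        in_list {
            line = $0
            sub(/^[ \t]+/, "", line)   # strip leading indentation
            sub(/[ \t]+$/, "", line)   # strip trailing whitespace
            if (line == want) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}
```

For example, 'mknmz -C | has_media_type application/msword && echo listed'
should print "listed" for the output above. For PowerPoint, substitute
whatever type name the powerpoint.pl filter uses; that name does not appear
anywhere in this thread, so it is not guessed here.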

As you can see, ppt is not supported. Why? The libs and filters seem to be
OK:

<snip>
[root(_at_)xxx /root]# cd /usr/share/namazu/pl/
[root(_at_)xxx pl]# ll
total 76
-rw-r--r--    1 apache   apache       3460 May 31 11:28 codeconv.pl
-rw-r--r--    1 apache   apache       4980 May 31 11:28 conf.pl
-rw-r--r--    1 apache   apache       4032 May 31 11:28 gettext.pl
-rw-r--r--    1 apache   apache       4756 May 31 11:28 htmlsplit.pl
-rw-r--r--    1 apache   apache      15876 May 31 11:28 nmzidx.pl
-rw-r--r--    1 apache   apache       8648 May 31 11:28 seed.pl
-rw-r--r--    1 apache   apache       4560 May 31 11:28 usage.pl
-rw-r--r--    1 apache   apache       5437 May 31 11:28 util.pl
-rw-r--r--    1 apache   apache       3636 May 31 11:28 var.pl
-rw-r--r--    1 apache   apache       2992 May 31 11:28 wakati.pl
[root(_at_)xxx pl]# cd ../filter/
[root(_at_)xxx filter]# ll
total 96
-rw-r--r--    1 apache   apache       1875 May 31 11:28 bzip2.pl
-rw-r--r--    1 apache   apache       1897 May 31 11:28 compress.pl
-rw-r--r--    1 apache   apache       5034 May 31 11:28 excel.pl
-rw-r--r--    1 apache   apache       3161 May 31 11:28 gfilter.pl
-rw-r--r--    1 apache   apache       2968 May 31 11:28 gzip.pl
-rw-r--r--    1 apache   apache       8252 May 31 11:28 hnf.pl
-rw-r--r--    1 apache   apache       8927 May 31 11:28 html.pl
-rw-r--r--    1 apache   apache       9302 May 31 11:28 mailnews.pl
-rw-r--r--    1 apache   apache       4019 May 31 11:28 man.pl
-rw-r--r--    1 apache   apache       3318 May 31 11:28 mhonarc.pl
-rw-r--r--    1 apache   apache       5037 May 31 11:28 msword.pl
-rw-r--r--    1 apache   apache       2609 May 31 11:28 pdf.pl
-rw-r--r--    1 apache   apache       2365 May 31 11:28 powerpoint.pl
-rw-r--r--    1 apache   apache       3154 May 31 11:28 rfc.pl
-rw-r--r--    1 apache   apache       2348 May 31 11:28 taro.pl
-rw-r--r--    1 apache   apache       2711 May 31 11:28 tex.pl
[root(_at_)xxx filter]# 
</snip>

The fact that these files are owned by apache is, I think, not the problem:
if it were, mknmz wouldn't index anything at all, not just PowerPoint files.

Please help,

Thanks,

Bastien.

-----Original Message-----
From: knok(_at_)daionet(_dot_)gr(_dot_)jp [SMTP:knok(_at_)daionet(_dot_)gr(_dot_)jp]
Sent: Monday, July 02, 2001 8:37 AM
To:   namazu-users-en(_at_)namazu(_dot_)org
Subject:      Re: huh ... another question

In article 
<4DB8126073BCD411B96700508BEF2E78221C8C(_at_)ISHATD0910(_dot_)gbank(_dot_)be>
bastien(_dot_)devos(_at_)bpo(_dot_)be writes:

As you can see, with 'mknmz -C', PowerPoint is not listed. Is that normal?
What can I do to solve that?

You need to install the pptHtml command if you want to index
PowerPoint documents.

pptHtml is included in xlHtml <http://www.xlhtml.org/>.
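
To see which pptHtml, if any, a process would pick up, here is a minimal
sketch. It assumes (this thread does not confirm it) that the command is
looked up by its exact name in the directories on PATH. find_in_path is a
hypothetical helper written for this message, not a Namazu function.

```shell
#!/bin/sh
# find_in_path NAME [PATHLIST]: hypothetical helper, not part of Namazu.
# Prints the first executable regular file called NAME found in the
# colon-separated PATHLIST (defaults to $PATH) and returns 0; returns 1
# if none is found. Directory names containing glob characters are not
# handled.
find_in_path() {
    name=$1
    pathlist=${2-$PATH}
    old_ifs=$IFS
    IFS=:
    for dir in $pathlist; do
        # An empty PATH component means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}
```

Run as the same user and in the same environment as mknmz, something like
'find_in_path pptHtml || echo "pptHtml not on PATH"' shows whether the
command is reachable under that exact spelling.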
-- 
NOKUBI Takatsugu
E-mail: knok(_at_)daionet(_dot_)gr(_dot_)jp
      knok(_at_)namazu(_dot_)org / knok(_at_)debian(_dot_)org

