Hi peoples:
Summary of problem
------------------
I have a Fedora Core 3 box running namazu 2.0.14. For some reason, namazu
apparently ignores the MaxHit directive in the .namazurc file. I do not know
what the max setting is, but I know searches that return a small number of
entries Namaz
Example
-------
Here is an example case. Please note that hostnames and directory paths have
been changed, though not to the detriment of this report.
On the web interface, I run a search for the word "test". I get a result of:
Web interface results
---------------------
References: [ (Too many documents hit. Ignored) ]
No document matching your query.
At the top of the page, I have this:
This index contains 5,072 documents and 122,292 keywords.
That would imply to me that there are at most 5,072 possible results. My
.namazurc file is included in full below. Here are the relevant lines:
MaxHit 5000
MaxMatch 200000
So by that, I should have got some results.
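As a sanity check on the numbers above, here is a minimal Python sketch. It assumes MaxHit is compared against the number of documents matching the query, which is how the comment in the sample .namazurc describes it. The function name and the strict `>` comparison are assumptions for illustration, not Namazu's actual code.

```python
def exceeds_max_hit(matching_docs: int, max_hit: int) -> bool:
    """Hypothetical model of the limit: True when the hits would be ignored."""
    return matching_docs > max_hit


TOTAL_DOCS = 5072  # "This index contains 5,072 documents"
MAX_HIT = 5000     # MaxHit value from .namazurc

# Worst case: the query word occurs in every indexed document.
print(exceeds_max_hit(TOTAL_DOCS, MAX_HIT))  # True
# How far the document count sits above the configured limit.
print(TOTAL_DOCS - MAX_HIT)                  # 72
```

Under that assumption the index holds 72 more documents than MaxHit allows, so a word that appears in nearly every document could still go over the limit.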
Command line results
--------------------
This is what happens when I run namazu from the command line:
#=====================================================================#
[root(_at_)server cgi-bin]# namazu -n 2147483647 -f .namazurc test
Results:
References: [ (Too many documents hit. Ignored) ]
No document matching your query.
[root(_at_)server cgi-bin]#
#=====================================================================#
Notice how I have specified an insane value for -n. The .namazurc file I
specified is the one below. Surely there cannot be more than 2147483647 matches
in a database that contains 5,072 documents and 122,292 keywords?
Here is the debug output:
#=======================================================================================================#
[root(_at_)server cgi-bin]# namazu -n 2147483647 -f .namazurc test --debug
namazu(debug): NAMAZUNORC: ''
namazu(debug): 15: Directive: [Index]
namazu(debug): Argument 1: [/home/wwwroot/sites/lists.company.com/lists]
namazu(debug): 23: Directive: [Template]
namazu(debug): Argument 1: [/home/wwwroot/sites/lists.company.com/lists]
namazu(debug): 51: Directive: [Replace]
namazu(debug): Argument 1: [/home/wwwroot/sites/lists.company.com/]
namazu(debug): Argument 2: [http://lists.company.com/]
namazu(debug): 58: Directive: [Logging]
namazu(debug): Argument 1: [off]
namazu(debug): Scoring: tfidf: 0, dl: 0, freshness: 0, uri: 0
namazu(debug): 78: Directive: [Scoring]
namazu(debug): Argument 1: [simple]
namazu(debug): 85: Directive: [EmphasisTags]
namazu(debug): Argument 1: [<strong class="keyword">]
namazu(debug): Argument 2: [</strong>]
namazu(debug): 92: Directive: [MaxHit]
namazu(debug): Argument 1: [5000]
namazu(debug): 99: Directive: [MaxMatch]
namazu(debug): Argument 1: [200000]
namazu(debug): load_rcfile: .namazurc loaded
namazu(debug): -n: 2147483647
namazu(debug): -w: 0
namazu(debug): query: [test]
namazu(debug): Index name [0]: /home/wwwroot/sites/lists.company.com/lists
namazu(debug): set_phrase_trick: test
namazu(debug): set_regex_trick: test
namazu(debug): query.tokennum: 1
namazu(debug): query.tab[0]: test
namazu(debug): size of /home/wwwroot/sites/lists.company.com/lists/NMZ.t: 8234976
namazu(debug): before nmz_strlower: [test]
namazu(debug): after nmz_strlower: [test]
namazu(debug): do WORD search
namazu(debug): size of /home/wwwroot/sites/lists.company.com/lists/NMZ.ii: 489168
namazu(debug): l:0: »¾
namazu(debug): r:122291: ãããããchmail2000(_at_)vip(_dot_)sina(_dot_)com
namazu(debug): searching: bodipy?
namazu(debug): searching: onwards,
namazu(debug): searching: technician's
namazu(debug): searching: unconfigured
namazu(debug): searching: ticket274.html
namazu(debug): searching: ticket1377
namazu(debug): searching: threadm
namazu(debug): searching: testing2
namazu(debug): searching: tender--
namazu(debug): searching: test3
namazu(debug): searching: terrible)
namazu(debug): searching: test()
namazu(debug): searching: tesoriero"
namazu(debug): searching: tesoriero</st1:personname>';
namazu(debug): searching: test"
namazu(debug): searching: test
Results:
References: [ (Too many documents hit. Ignored) ]
No document matching your query.
[root(_at_)server cgi-bin]#
#=======================================================================================================#
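To double-check which directive values were actually loaded, the debug lines can be reduced to a short list. This is a minimal Python sketch that assumes the `N: Directive: [Name]` and `Argument k: [value]` layout seen in the output above. The function name is hypothetical and the layout may differ in other Namazu versions.

```python
import re

# Patterns for the two debug line shapes seen in the output above.
DIRECTIVE_RE = re.compile(r"namazu\(debug\): \d+: Directive: \[(.*)\]$")
ARGUMENT_RE = re.compile(r"namazu\(debug\): Argument \d+: \[(.*)\]$")


def loaded_directives(debug_text):
    """Return (directive, [arguments]) pairs in the order they were logged."""
    directives = []
    for line in debug_text.splitlines():
        line = line.strip()
        match = DIRECTIVE_RE.match(line)
        if match:
            directives.append((match.group(1), []))
            continue
        match = ARGUMENT_RE.match(line)
        if match and directives:
            # Attach the argument to the most recently seen directive.
            directives[-1][1].append(match.group(1))
    return directives


sample = """\
namazu(debug): 92: Directive: [MaxHit]
namazu(debug): Argument 1: [5000]
namazu(debug): 99: Directive: [MaxMatch]
namazu(debug): Argument 1: [200000]
"""
print(loaded_directives(sample))
# [('MaxHit', ['5000']), ('MaxMatch', ['200000'])]
```

For the run above this gives MaxHit 5000 and MaxMatch 200000, the same values as in the file, so the file itself is being read.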
If you would like, I could attach straces of the searches above. I have looked
through them and have not seen anything in there that gives me any hints.
#=======================================================================================================#
# This is a Namazu configuration file for namazu or namazu.cgi.
#
# Originally, this file is named 'namazurc-sample'. so you should
# copy this to 'namazurc' to make the file effective.
#
# Each item must be separated by one or more SPACE or TAB characters.
# You can use a double-quoted string for representing a string which
# contains SPACE or TAB characters like "foo bar baz".
##
## Index: Specify the default directory.
##
# Index /usr/local/var/namazu/index
Index /home/wwwroot/sites/lists.company.com/lists
##
## Template: Set the template directory containing
## NMZ.{head,foot,body,tips,result} files.
##
# Template /home/www/corbett/www/lists
Template /home/wwwroot/sites/lists.company.com/lists
##
## Replace: Replace TARGET with REPLACEMENT in URIs in search
## results.
##
## TARGET is specified by Ruby's perl-like regular expressions.
## You can capture sub-strings in TARGET by surrounding them
## with `(' and `)' and use them later as backreferences by
## \1, \2, \3,... \9.
##
## To use meta characters literally such as `*', `+', `?', `|',
## `[', `]', `{', `}', `(', `)', escape them with `\'.
##
## e.g.,
##
## Replace /home/foo/public_html/ http://www.foobar.jp/~foo/
## Replace /home/(.*)/public_html/ http://www.foobar.jp/\1/
## Replace /C\|/foo/ http://www.foobar.jp/
##
## If you do not want to do the processing on command line use,
## run namazu with -U option.
##
## You can specify more than one Replace rule, but only the
## first-matched rule is applied.
##
#Replace /home/foo/public_html/ http://www.foo.bar.jp/~foo/
Replace /home/wwwroot/sites/lists.company.com/ http://lists.company.com/
##
## Logging: Set OFF to turn off keyword logging to NMZ.slog.
## Default is ON.
##
Logging off
##
## Lang: Set the locale code such as `ja_JP.eucJP', `ja_JP.SJIS',
## `de', etc. This directive works only if the environment
## variable LANG is not set because the directive is mainly
## intended for CGI use. On the shell, you can set the
## environment variable LANG instead of using the directive.
##
## If you set `de' to it, namazu.cgi use
## NMZ.(head|foot|body|tips|results).de for displaying results
## and use a proper message catalog for `de'.
##
#Lang ja
##
## Scoring: Set the scoring method "tfidf" or "simple".
##
Scoring simple
##
## EmphasisTags: Set the pair of html elements which is used in
## keyword emphasizing for search results.
##
EmphasisTags "<strong class=\"keyword\">" "</strong>"
##
## MaxHit: Set the maximum number of documents which can be
## handled in query operation. If documents matching a
## query exceed the value, they will be ignored.
##
MaxHit 5000
##
## MaxMatch: Set the maximum number of words which can be
## handled in regex/prefix/inside/suffix query. If documents
## matching a query exceed the value, they will be ignored.
##
MaxMatch 200000
##
## ContentType: Set "Content-Type" header output. If you want to
## use non-HTML template files, set it suitably.
#ContentType "text/x-hdml"
#=======================================================================================================#
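The quoting rule described in the header comments of the file (items separated by spaces or tabs, double quotes for values containing whitespace) can be tried out with a small Python sketch. It assumes shell-like tokenizing via `shlex` is a close enough stand-in. Namazu's own parser may treat backslashes outside quotes differently (for example in Replace patterns), and the function name is hypothetical.

```python
import shlex


def parse_namazurc(text):
    """Return one [directive, arg, ...] list per active (non-comment) line."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines starting with '#'.
        if not line or line.startswith("#"):
            continue
        entries.append(shlex.split(line))
    return entries


sample_rc = r'''
## MaxHit: Set the maximum number of documents ...
MaxHit 5000
MaxMatch 200000
EmphasisTags "<strong class=\"keyword\">" "</strong>"
'''
for entry in parse_namazurc(sample_rc):
    print(entry)
# ['MaxHit', '5000']
# ['MaxMatch', '200000']
# ['EmphasisTags', '<strong class="keyword">', '</strong>']
```

The EmphasisTags arguments come out the same as the ones shown in the debug output earlier.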
Anthony Sadler
Far Edge Technology
w: (02) 8425 1410
_______________________________________________
Namazu-users-en mailing list
Namazu-users-en(_at_)namazu(_dot_)org
http://www.namazu.org/cgi-bin/mailman/listinfo/namazu-users-en