[Namazu-users-en] MaxHit directive apparently being ignored

2006-07-28 00:52:24
Hi peoples:

Summary of problem
I have a Fedora Core 3 box running namazu 2.0.14. For some reason, namazu 
apparently ignores the MaxHit directive in the .namazurc file. I do not know 
what the max setting is, but I know searches that return a small amount of 
entries Namaz

Here is an example case. Please note that hostnames and directory paths have 
been changed, though not in any way that affects this report.

On the web interface, I run a search for the word "test". I get a result of:

Web interface results
References: [ (Too many documents hit. Ignored) ]

No document matching your query.

At the top of the page, I have this:
This index contains  5,072  documents and  122,292  keywords.

That would imply to me that there are at most 5,072 possible results. My 
.namazurc file is included in full below. Here are the relevant settings:

MaxHit 5000
MaxMatch 200000

So by that, I should have got some results. 
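
To rule out a typo or a quoting problem in the file itself, the values can be read back with a few lines of Python. This is only a sketch: `parse_namazurc` is a hypothetical helper, not part of namazu, and it assumes the format described in the sample file's header comments (items separated by spaces or tabs, double-quoted strings allowed, lines starting with `#` are comments). It says nothing about how namazu itself applies the values.

```python
import shlex


def parse_namazurc(text):
    """Return {directive: [arguments]} for every non-comment line.

    Hypothetical helper for checking a .namazurc by eye; later lines
    overwrite earlier ones, which is fine for single-valued directives
    such as MaxHit but would drop repeated Replace rules.
    """
    directives = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        # shlex.split keeps double-quoted strings (with spaces) together.
        name, *args = shlex.split(stripped)
        directives[name] = args
    return directives


# A few lines copied from the .namazurc in this report.
sample = r"""
## MaxHit: Set the maximum number of documents ...
MaxHit 5000
MaxMatch        200000
EmphasisTags  "<strong class=\"keyword\">"   "</strong>"
#Lang          ja
"""

settings = parse_namazurc(sample)
print(settings["MaxHit"], settings["MaxMatch"])
```

If the printed values match what was intended, the file contents are not the problem and the question becomes how namazu uses them.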

Command line results
This is what happens when I run namazu from the command line:

[root(_at_)server cgi-bin]# namazu -n 2147483647 -f .namazurc test

References:  [  (Too many documents hit. Ignored)  ]

No document matching your query.
[root(_at_)server cgi-bin]#
Notice how I have specified an insane value for -n. The .namazurc file I 
specified is the one below. Surely there cannot be more than 2147483647 matches 
in a database that contains 5,072 documents and 122,292 keywords?
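
As a sanity check on the numbers, the sketch below compares each configured value with the size of the index. It treats both `-n` and `MaxHit` as plain upper bounds on the number of matching documents, which is an assumption about how namazu applies them rather than something confirmed from its source. `can_exceed` is a hypothetical helper name.

```python
def can_exceed(limit, total_documents):
    """True if some query could match more documents than `limit`."""
    return total_documents > limit


total_documents = 5072  # from the "This index contains" banner
limits = {"-n": 2147483647, "MaxHit": 5000}

for name, limit in limits.items():
    if can_exceed(limit, total_documents):
        print(f"{name}={limit}: exceeded by any query matching "
              f"{limit + 1}..{total_documents} documents")
    else:
        print(f"{name}={limit}: cannot be exceeded with "
              f"{total_documents} documents")
```

Under that assumption, the `-n` value can never be the limit that triggers, while a `MaxHit` of 5000 could still be exceeded by a word that appears in nearly every document.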

Here is the debug output:
[root(_at_)server cgi-bin]# namazu -n 2147483647 -f .namazurc test --debug
namazu(debug): NAMAZUNORC: ''
namazu(debug):   15: Directive:  [Index]
namazu(debug):       Argument 1: [/home/wwwroot/sites/]
namazu(debug):   23: Directive:  [Template]
namazu(debug):       Argument 1: [/home/wwwroot/sites/]
namazu(debug):   51: Directive:  [Replace]
namazu(debug):       Argument 1: [/home/wwwroot/sites/]
namazu(debug):       Argument 2: []
namazu(debug):   58: Directive:  [Logging]
namazu(debug):       Argument 1: [off]
namazu(debug): Scoring: tfidf: 0, dl: 0, freshness: 0, uri: 0
namazu(debug):   78: Directive:  [Scoring]
namazu(debug):       Argument 1: [simple]
namazu(debug):   85: Directive:  [EmphasisTags]
namazu(debug):       Argument 1: [<strong class="keyword">]
namazu(debug):       Argument 2: [</strong>]
namazu(debug):   92: Directive:  [MaxHit]
namazu(debug):       Argument 1: [5000]
namazu(debug):   99: Directive:  [MaxMatch]
namazu(debug):       Argument 1: [200000]
namazu(debug): load_rcfile: .namazurc loaded
namazu(debug):  -n: 2147483647
namazu(debug):  -w: 0
namazu(debug): query: [test]
namazu(debug): Index name [0]: /home/wwwroot/sites/
namazu(debug): set_phrase_trick: test
namazu(debug): set_regex_trick: test
namazu(debug): query.tokennum: 1
namazu(debug):[0]: test
namazu(debug): size of /home/wwwroot/sites/ 
namazu(debug): before nmz_strlower: [test]
namazu(debug): after nmz_strlower:  [test]
namazu(debug): do WORD search
namazu(debug): size of /home/wwwroot/sites/ 
namazu(debug): l:0: »¾
namazu(debug): r:122291: ãããããchmail2000(_at_)vip(_dot_)sina(_dot_)com
namazu(debug): searching: bodipy?
namazu(debug): searching: onwards,
namazu(debug): searching: technician's
namazu(debug): searching: unconfigured
namazu(debug): searching: ticket274.html
namazu(debug): searching: ticket1377
namazu(debug): searching: threadm
namazu(debug): searching: testing2
namazu(debug): searching: tender--
namazu(debug): searching: test3
namazu(debug): searching: terrible)
namazu(debug): searching: test()
namazu(debug): searching: tesoriero"
namazu(debug): searching: tesoriero</st1:personname>';
namazu(debug): searching: test"
namazu(debug): searching: test

References:  [  (Too many documents hit. Ignored)  ]

No document matching your query.
[root(_at_)server cgi-bin]#
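
The debug output above can also be checked mechanically to confirm which directives namazu reports having read. The following is a rough sketch that relies only on the line layout visible in this transcript (`Directive:  [Name]` followed by `Argument N: [value]` lines); `directives_from_debug` is a hypothetical helper, and other namazu versions may format their debug output differently.

```python
import re

# Patterns based on the debug lines shown in this report.
DIRECTIVE = re.compile(r"^namazu\(debug\):\s+\d+: Directive:\s+\[(.*)\]$")
ARGUMENT = re.compile(r"^namazu\(debug\):\s+Argument \d+: \[(.*)\]$")


def directives_from_debug(log):
    """Collect {directive: [arguments]} from `namazu --debug` output."""
    loaded = {}
    current = None
    for raw in log.splitlines():
        line = raw.rstrip()
        match = DIRECTIVE.match(line)
        if match:
            current = match.group(1)
            loaded[current] = []
            continue
        match = ARGUMENT.match(line)
        if match and current is not None:
            loaded[current].append(match.group(1))
    return loaded


# Excerpt copied from the debug output in this report.
excerpt = """\
namazu(debug):   92: Directive:  [MaxHit]
namazu(debug):       Argument 1: [5000]
namazu(debug):   99: Directive:  [MaxMatch]
namazu(debug):       Argument 1: [200000]
namazu(debug): load_rcfile: .namazurc loaded
"""

loaded = directives_from_debug(excerpt)
print(loaded)
```

This only shows that the directives were parsed from the file; it does not show which value the search code ends up using.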

If you would like, I could attach straces of the searches above. I have looked 
through them and have not seen anything in there that gives me any hints.

Complete .namazurc file

# This is a Namazu configuration file for namazu or namazu.cgi.
#  Originally, this file is named 'namazurc-sample'.  so you should
#  copy this to 'namazurc' to make the file effective.
#  Each item must be separated by one or more SPACE or TAB characters.
#  You can use a double-quoted string for representing a string which
#  contains SPACE or TAB characters like "foo bar baz".

## Index: Specify the default directory.
# Index         /usr/local/var/namazu/index
Index           /home/wwwroot/sites/

## Template: Set the template directory containing
## NMZ.{head,foot,body,tips,result} files.
# Template /home/www/corbett/www/lists
Template        /home/wwwroot/sites/

## Replace: Replace TARGET with REPLACEMENT in URIs in search
## results.
## TARGET is specified by Ruby's perl-like regular expressions.
## You can capture sub-strings in TARGET by surrounding them
## with `(' and `)' and use them later as backreferences by
## \1, \2, \3,... \9.
## To use meta characters literally such as `*', `+', `?', `|',
## `[', `]', `{', `}', `(', `)', escape them with `\'.
## e.g.,
##    Replace  /home/foo/public_html/
##    Replace  /home/(.*)/public_html/\1/
##    Replace   /C\|/foo/     
## If you do not want to do the processing on command line use,
## run namazu with -U option.
## You can specify more than one Replace rule, but only the
## first-matched rule is applied.
#Replace       /home/foo/public_html/
Replace         /home/wwwroot/sites/

## Logging: Set OFF to turn off keyword logging to NMZ.slog.
## Default is ON.
Logging       off

## Lang: Set the locale code such as `ja_JP.eucJP', `ja_JP.SJIS',
## `de', etc.  This directive works only if the environment
## variable LANG is not set because the directive is mainly
## intended for CGI use.  On the shell, you can set the
## environment variable LANG instead of using the directive.
## If you set `de' to it, namazu.cgi uses
## NMZ.(head|foot|body|tips|results).de for displaying results
## and use a proper message catalog for `de'.
#Lang          ja

## Scoring: Set the scoring method "tfidf" or "simple".
Scoring       simple

## EmphasisTags: Set the pair of html elements which is used in
## keyword emphasizing for search results.
EmphasisTags  "<strong class=\"keyword\">"   "</strong>"

## MaxHit: Set the maximum number of documents which can be
## handled in query operation.  If documents matching a
## query exceed the value, they will be ignored.
MaxHit 5000

## MaxMatch: Set the maximum number of words which can be
## handled in regex/prefix/inside/suffix query. If documents
## matching a query exceed the value, they will be ignored.
MaxMatch        200000

## ContentType: Set "Content-Type" header output. If you want to
## use non-HTML template files, set it suitably.
#ContentType    "text/x-hdml"

Anthony Sadler
Far Edge Technology
w: (02) 8425 1410
