
[Namazu-users-en] Re: namazu stopped working

2005-11-24 09:27:46
IEM - network operating center wrote:

debugging output:
running "namazu" from the command line with debugging options gives me the
following ("OSC" is a keyword which pops up every now and then):

%> namazu -c -d --config=pd-list "OSC"
namazu(debug): NAMAZUNORC: ''
namazu(debug): load_rcfile: /etc/namazu/namazurc loaded
namazu(debug):    5: Directive:  [Index]
namazu(debug):       Argument 1: [/var/lib/namazu/index/pd-list]
namazu(debug):   12: Directive:  [Template]
namazu(debug):       Argument 1: [/var/lib/namazu/index/pd-list]
namazu(debug):   14: Directive:  [Replace]
namazu(debug):       Argument 1: [/var/lib/mailman/archives/private]
namazu(debug):       Argument 2: [/pipermail]
namazu(debug):   20: Directive:  [Logging]
namazu(debug):       Argument 1: [on]
namazu(debug): load_rcfile: pd-list loaded
namazu(debug):  -n: 20
namazu(debug):  -w: 0
namazu(debug): query: [OSC]
namazu(debug): Index name [0]: /var/lib/namazu/index/pd-list
namazu(debug): set_phrase_trick: OSC
namazu(debug): set_regex_trick: OSC
namazu(debug): query.tokennum: 1
namazu(debug): query.tab[0]: OSC
namazu(debug): size of /var/lib/namazu/index/pd-list/NMZ.t: 132748
namazu(debug): before nmz_strlower: [OSC]
namazu(debug): after nmz_strlower:  [osc]
namazu(debug): do WORD search
namazu(debug): size of /var/lib/namazu/index/pd-list/NMZ.ii: 1492960
namazu(debug): l:0: !
namazu(debug): r:373239: µÎ¬
namazu(debug): searching: ..)
namazu(debug): searching:
namazu(debug): searching: khz.
...

so after a bit more research i found that NMZ.ii does not return the
correct offset.
as far as i understand it, search::nmz_binsearch() performs a binary
search for the keyword, using NMZ.wi to look up the byte offset of a given
line in NMZ.w (which holds each keyword on a separate line).
it starts with line 186620 [=(373239+1)/2=(r+1)/2], which in fact
contains "clean;", but namazu thinks that it contains "..)".
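
The lookup described above can be sketched roughly as follows. This is a minimal illustration, not Namazu's actual code: it assumes the offset index holds fixed-width 4-byte big-endian entries (the byte order is a guess made for the example), and the names build_index, word_at and binsearch are hypothetical. The 4-byte width is at least consistent with the log, where the size of NMZ.ii (1492960) divided by 4 gives 373240 entries and the search starts with r:373239.

```python
import struct

ENTRY = struct.Struct(">I")  # assumption: 4-byte big-endian offsets


def build_index(words):
    """Return (word_file_bytes, offset_index_bytes) for a sorted word list."""
    blob = bytearray()
    index = bytearray()
    for w in words:
        index += ENTRY.pack(len(blob))  # byte offset where this line starts
        blob += w.encode("utf-8") + b"\n"
    return bytes(blob), bytes(index)


def word_at(blob, index, n):
    """Read line n of the word file via its byte offset in the index."""
    (off,) = ENTRY.unpack_from(index, n * ENTRY.size)
    end = blob.index(b"\n", off)
    return blob[off:end]


def binsearch(blob, index, key):
    """Binary search for key; return its line number, or -1 if absent."""
    key = key.encode("utf-8")
    l, r = 0, len(index) // ENTRY.size - 1
    while l <= r:
        x = (l + r) // 2
        term = word_at(blob, index, x)
        if term == key:
            return x
        if term < key:
            l = x + 1
        else:
            r = x - 1
    return -1


# Sort by encoded bytes so the order matches the byte-wise comparison above.
words = sorted(["clean;", "khz.", "osc", "..)", "!"],
               key=lambda w: w.encode("utf-8"))
blob, index = build_index(words)
print(binsearch(blob, index, "osc"))
```

The search only works if every offset in the index points at the first byte of a line; a single wrong offset makes the comparison at that step meaningless and sends the search into the wrong half.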

more research revealed that the byte offset returned from NMZ.wi points
into the middle of a line "clean....)"; since the term found this way
is "..)", the binary search fails miserably.
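
One way to find such entries systematically would be to scan the whole offset index and flag every offset that does not land on the start of a line. The sketch below works under the same assumptions as before (4-byte big-endian offsets, one keyword per newline-terminated line); the function name misaligned_entries and the sample data are hypothetical, with the line "clean....)" borrowed from the observation above.

```python
import struct


def misaligned_entries(blob, index, width=4):
    """Return (entry_number, offset) for offsets that are not line starts."""
    bad = []
    for n in range(len(index) // width):
        (off,) = struct.unpack_from(">I", index, n * width)
        # A valid offset is 0 or directly follows a newline byte.
        starts_line = off == 0 or blob[off - 1:off] == b"\n"
        if off >= len(blob) or not starts_line:
            bad.append((n, off))
    return bad


blob = b"!\n..)\nclean....)\nkhz.\n"
good_index = struct.pack(">4I", 0, 2, 6, 17)
bad_index = struct.pack(">4I", 0, 2, 13, 17)  # entry 2 points mid-line

print(misaligned_entries(blob, good_index))
print(misaligned_entries(blob, bad_index))
# Reading from the bad offset up to the newline yields the fragment "..)".
print(blob[13:blob.index(b"\n", 13)])
```

Run against the real files (read in binary mode), the first flagged entry would show where the offsets start to drift.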

i guess it is a problem with some multi-byte characters.
(which reminds me that when i build the index i get some warnings:
"Wide character in print at /usr/bin/mknmz line 2447, <GEN7162> line 
158600.")
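
To illustrate why multi-byte characters could produce exactly this kind of drift: if offsets were accumulated from character counts while the file is read back by byte position, every multi-byte character before a given line would shift that line's real position by the extra bytes. The sketch below only demonstrates the hypothesis with made-up words; it makes no claim about how mknmz actually computes its offsets.

```python
words = ["clean", "µs", "khz."]  # "µ" is two bytes in UTF-8


def offsets(words, length):
    """Start offsets of newline-terminated lines under a given length function."""
    out, pos = [], 0
    for w in words:
        out.append(pos)
        pos += length(w) + 1  # +1 for the newline
    return out


char_offsets = offsets(words, len)
byte_offsets = offsets(words, lambda w: len(w.encode("utf-8")))

blob = "".join(w + "\n" for w in words).encode("utf-8")
print(char_offsets, byte_offsets)
# The character-based offset for "khz." lands one byte early, on the
# newline that ends the previous line, instead of on the "k".
print(blob[char_offsets[2]:char_offsets[2] + 1], blob[byte_offsets[2]:byte_offsets[2] + 1])
```

In this toy case the error is a single byte, but it accumulates with every multi-byte character earlier in the file.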

any hints how i should proceed?

mfg.asdr
IOhannes