mharc-users

Re: subject line truncated in archives search

2004-12-17 14:32:22
On December 16, 2004 at 15:51, Dave Dewey wrote:

> Background - I recently rebuilt my archives index due to some issues with
> Namazu related to encoding on a new mailserver.  Everything seems to be
> fine, Namazu now runs from cron with no errors.

Which version of namazu are you using?

> However, when you do a search in the list archives, the returned search
> results page contains mostly random single-character subject lines.
> Sometimes a single letter or a number, often a single question mark, and
> sometimes the full subject.  The two-line content snip underneath is also a
> single character, although the Author and Date lines are fine. Using the
> Thread or Date indexes is fine, no problems, all subjects are intact; it's
> only when searching that the problem appears.

Take a look at the file NMZ.field.subject in the html directory
for the archive you are searching.  The file should contain the
subjects of the message pages, one subject per line.  Check to
see if any of the lines are blank or contain junk.
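
If the file is large, a short script can do the scanning.  The
following Python is only a sketch: the function name, the idea of
what counts as "junk" (undecodable bytes, control characters, or
fewer than two characters) and the UTF-8 default are all assumptions
of mine, not something namazu defines.  Adjust the encoding to
whatever your index actually uses.

```python
import sys
from pathlib import Path


def find_suspect_subjects(path, encoding="utf-8", min_length=2):
    """Return (line_number, reason, text) for lines that look blank or junk.

    The "junk" heuristics here are illustrative assumptions, not rules
    taken from namazu itself.
    """
    suspects = []
    raw_lines = Path(path).read_bytes().splitlines()
    for number, raw_line in enumerate(raw_lines, start=1):
        # Undecodable bytes become U+FFFD so they can be detected below.
        text = raw_line.decode(encoding, errors="replace")
        stripped = text.strip()
        if not stripped:
            reason = "blank"
        elif "\ufffd" in text:
            reason = "not valid in the assumed encoding"
        elif any(ord(ch) < 32 and ch != "\t" for ch in text):
            reason = "control characters"
        elif len(stripped) < min_length:
            reason = "suspiciously short"
        else:
            continue
        suspects.append((number, reason, text))
    return suspects


if __name__ == "__main__":
    # Usage: python check_subjects.py /path/to/html/NMZ.field.subject
    for number, reason, text in find_suspect_subjects(sys.argv[1]):
        print(f"{number}: {reason}: {text!r}")
```

Since the file holds one subject per line, the reported line numbers
should tell you which entries to look at more closely.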

You can also test subject-specific searching in namazu by doing:

  +subject:<text-here>

This way you can verify if it is only a subject display issue
in the search results or if subject text is messed up in the
search index.  Doing a search like above should cause namazu
to use the NMZ.field.subject file.
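
If you prefer to test from the command line instead of the search
form, the same query can be passed to the namazu command along with
the index directory.  A rough Python sketch follows; the helper names
are hypothetical, and wrapping multi-word text in double quotes is my
assumption about how you would want phrases handled.

```python
import subprocess
import sys


def build_subject_query(text):
    """Build a field-specific query such as +subject:encoding."""
    if any(ch.isspace() for ch in text):
        # Assumption: multi-word text is searched as a quoted phrase.
        text = f'"{text}"'
    return f"+subject:{text}"


def namazu_command(text, index_dir):
    """Return the argument list: namazu <query> <index-directory>."""
    return ["namazu", build_subject_query(text), str(index_dir)]


if __name__ == "__main__":
    # Usage: python subject_search.py <text> /path/to/html
    command = namazu_command(sys.argv[1], sys.argv[2])
    print("running:", command)
    subprocess.run(command)
```

If a word you know is in a subject gets no hits this way, the subject
text in the index itself is probably damaged, not just the display.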

Another thing to do is to create a sample archive containing the
message files giving truncated subjects in search results.  You can
do this by just copying a msg#####.html file into a test directory.
Then run mknmz on the directory and see if the file is indexed
properly.  You can examine the NMZ.field.subject file created during
indexing to see if the subject was properly extracted from the file.
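
The copy-and-index steps could be scripted roughly as below.  This is
a sketch under assumptions: the function name and the run_indexer
switch are mine, and it relies on mknmz writing its NMZ.* files into
the current working directory, so the index lands inside the test
directory.  It needs mknmz on your PATH.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def build_test_index(message_files, test_dir, run_indexer=True):
    """Copy message pages into test_dir and index them with mknmz.

    Returns the mknmz command used and the expected location of the
    NMZ.field.subject file.
    """
    test_dir = Path(test_dir).resolve()
    test_dir.mkdir(parents=True, exist_ok=True)
    for message in message_files:
        shutil.copy2(message, test_dir)
    command = ["mknmz", str(test_dir)]
    if run_indexer:
        # Run from inside test_dir so the index files are created there.
        subprocess.run(command, cwd=test_dir, check=True)
    return command, test_dir / "NMZ.field.subject"


if __name__ == "__main__":
    # Usage: python test_index.py <test-dir> msg00001.html [msg00002.html ...]
    _, subject_file = build_test_index(sys.argv[2:], sys.argv[1])
    print(subject_file.read_text(errors="replace"))
```

Comparing the printed subjects against the Subject shown on each
message page should show whether extraction failed during indexing.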

If it was not, then there could be something wrong with the mhonarc
namazu filter that is used during indexing.  You can send me the
sample problematic message and I can see what is going wrong.  If
you want to debug it yourself, the mhonarc namazu filter is
mhonarc.pl, located in the namazu filter installation directory,
e.g. /usr/local/share/namazu/filter if you installed namazu from
source.

--ewh

---------------------------------------------------------------------
To sign-off this list, send email to majordomo(_at_)mhonarc(_dot_)org with the
message text UNSUBSCRIBE MHARC-USERS
