Harry Patterson wrote :
|2) I didn't see where Wilma/Glimpse will automatically add new pages to the
|Glimpse search index (not mhonarc index) as I believe HTDIG will allow. The
|indexing seems to occur in the cron job on a nightly/periodic basis. Can the
|new postings be indexed on the fly in order to keep them current?
|3) I may try to use the Wilma package as is but leave out the Glimpse
|nightly index and instead use HTDIG for all searching of the directories but
|Wilma for the yymm filing system. Any comments?
A couple of months ago I tested both search engines, WebGlimpse and
HtDig, and here are a few comments:
- WebGlimpse is easier to tune to minimize disk usage.
- HtDig is easier to tune to select very precisely what you
want indexed and what you don't.
- WebGlimpse seems to be faster and less CPU-intensive when indexing.
- Although HtDig can index a single URL, merging the new index data with
the rest of the database costs a lot of time and CPU, and I doubt
you could do that for each new message arriving in the MHonArc archive
unless your archive and your traffic are very small.
- HtDig can take a lot of disk space for its databases especially if
you're using its "soundex" facilities.
- Due to disk space concerns I didn't switch to either search
engine. I'm still using marc-search, although my archives contain
several thousand messages. It works fine.
- I think WebGlimpse and HtDig are best suited for intranet indexing
(small intranets for WebGlimpse, larger ones for HtDig).
- There's another solution: using a free version of a commercial search
engine such as Excite, but it can only be used on the localhost.
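
On the question of indexing on the fly: since merging per message looks too costly, one middle ground might be to run a cheap check from cron more often than nightly and only start a re-index when new messages have actually arrived. Below is a minimal sketch in Python, not something taken from Wilma, Glimpse or HtDig. The function names, the archive directory and the stamp file are all assumptions made for illustration, and the re-index command itself is left out because it depends on your setup.

```python
import os

def newest_mtime(archive_dir):
    """Return the latest modification time of any file under archive_dir."""
    latest = 0.0
    for root, _dirs, files in os.walk(archive_dir):
        for name in files:
            latest = max(latest, os.path.getmtime(os.path.join(root, name)))
    return latest

def needs_reindex(archive_dir, stamp_file):
    """True if some archived file is newer than the stamp left by the last run."""
    if not os.path.exists(stamp_file):
        # No stamp yet (hypothetical first run): index everything.
        return True
    return newest_mtime(archive_dir) > os.path.getmtime(stamp_file)

def mark_indexed(stamp_file):
    """Create or touch the stamp file once a re-index has finished."""
    with open(stamp_file, "a"):
        pass
    os.utime(stamp_file, None)
```

The cron job would call needs_reindex() first, run whatever indexer you use only when it returns True, and then call mark_indexed(). Keeping the stamp file outside the archive directory avoids indexing it along with the messages.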
Hope that helps.