namazu-users-en

[namazu-users-en] Re: Hello..

2000-03-20 01:40:04

<http://www.senga.org/mifluz/html/description.html>
|   mifluz has been designed with the further upper limits in mind : 500
|   million documents, 50 giga words, 20 million document updates per day.

It is terrific!


 Well, I'll be very honest. At present this goal is not completely
achieved.  Yes, you could build an index of 500 million documents (1
terabyte). Although it has not been tried, the figures suggest that
the data structure can hold that much. As for the 20 million updates
per day, we're not at that level yet. Benchmarks show that mifluz is
currently able to index 1 million URLs a day in optimal conditions.
 (All tests on a PIII500, 256 MB RAM, Linux.)
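
 To put those figures side by side, here is a small back-of-the-envelope
sketch. It only restates the numbers quoted above; the variable names are
made up for illustration, and it assumes a decimal terabyte (10**12 bytes)
and a uniform rate over 24 hours.

```python
# Rough arithmetic on the figures quoted in this message.
# Assumptions: 1 terabyte = 10**12 bytes, work spread evenly over a day.
SECONDS_PER_DAY = 24 * 60 * 60

def per_second(per_day):
    """Convert a per-day count into an average per-second rate."""
    return per_day / SECONDS_PER_DAY

current_rate = per_second(1_000_000)    # benchmarked: 1 million URLs a day
target_rate = per_second(20_000_000)    # goal: 20 million updates a day

# Average budget per document implied by the design limits.
bytes_per_doc = 10**12 / 500_000_000
words_per_doc = 50_000_000_000 / 500_000_000

print(f"current: {current_rate:.1f} URLs/s, target: {target_rate:.1f} updates/s")
print(f"index budget: {bytes_per_doc:.0f} bytes and {words_per_doc:.0f} words per document")
```

 In other words the benchmark is roughly a factor of 20 short of the
update goal, under those assumptions.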

I just printed out mifluz.texinfo and read it.  I notice
that it is really a high-performance library.  But at the
moment, I don't know whether or not it is good to employ
mifluz for Namazu.

 As of now mifluz is fairly easy to use. It's a single library to
link with (-lmifluz) and has a Perl interface (Search::Mifluz). And
it's designed precisely to help projects like namazu by providing
an inverted index handling system, so that namazu can concentrate on
user-friendliness, filters, and the interface.

 It is in fact designed to allow projects that need an inverted index
to concentrate efforts on one product instead of re-inventing the
wheel.  It's amazing how many inverted index implementations you find
everywhere. It's much the same way you use zlib instead of
implementing your own LZ compression :-)
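
 For readers not familiar with the term, here is a minimal sketch of what
an inverted index does: it maps each word to the set of documents that
contain it. This is NOT the mifluz API or its on-disk data structure; the
function names build_index and search are hypothetical, and the whole thing
lives in memory purely for illustration.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercased word to the set of document ids containing it.

    `docs` is a dict of {doc_id: text}. Hypothetical helper, not mifluz.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every word of the query (AND)."""
    words = query.lower().split()
    if not words:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

docs = {
    1: "namazu full text search",
    2: "mifluz inverted index library",
    3: "full text inverted index",
}
index = build_index(docs)
print(search(index, "inverted index"))
```

 Everything a real library adds on top of this (persistence, compression,
updates at scale) is exactly the part worth sharing between projects.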

  * Rewrite query operations with lex and yacc.

 Have you considered the Text-Query Perl modules?

When above TODOs are completed, we will change over to 3.0
development and decide employment of mifluz.  I hope
mifluz's APIs will be fixed and well documented at that
time. :-)

 I hope so too :-) Your feedback will be more than welcome. I
committed a set of manual pages for that purpose last week.

         Cheers,

-- 
                Loic Dachary

                24 av Secretan
                75019 Paris
                Tel: 33 1 42 45 09 16
                e-mail: loic(_at_)dachary(_dot_)org
                URL: http://www.senga.org/


