At 12:41 AM 12/27/95, Daniel W. Connolly wrote:
Sure. I've been considering several storage mechanisms:
For searching, grep on #3 should be mighty fast. Appending is fast.
Rebuilding the database is fast.
A relational interface can be implemented on top of any of those,
with various tradeoffs. One reason why I started this discussion
was to discuss that interface. By the way... does anybody have
details of the M$ ODBC API handy?
For what it's worth, I'm just using plain ol' files and character offsets,
then letting our search engine do the rest (support is built into it). I
just write a script that parses the mail and hands it the fields, which
include the path, offset, and length.
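To make the idea concrete, here is a rough sketch of that kind of indexing script. It is not the actual script or the search engine's interface. It assumes an mbox-style archive in which each message starts with a "From " separator line and body lines beginning with "From " have been escaped. The names index_mbox and read_message are made up for illustration, and it records byte offsets rather than character offsets.

```python
import email


def index_mbox(path):
    """Return one record per message: path, byte offset, byte length, and a few headers."""
    with open(path, "rb") as f:
        data = f.read()

    # Collect the offset of every line that begins with the "From " separator.
    # Assumption: body lines starting with "From " are already escaped.
    starts = []
    pos = 0
    while pos < len(data):
        if data.startswith(b"From ", pos):
            starts.append(pos)
        newline = data.find(b"\n", pos)
        if newline == -1:
            break
        pos = newline + 1

    records = []
    for i, start in enumerate(starts):
        end = starts[i + 1] if i + 1 < len(starts) else len(data)
        # Skip the separator line itself before parsing the headers.
        newline = data.find(b"\n", start, end)
        header_start = end if newline == -1 else newline + 1
        msg = email.message_from_bytes(data[header_start:end])
        records.append({
            "path": path,
            "offset": start,
            "length": end - start,
            "from": msg.get("From", ""),
            "subject": msg.get("Subject", ""),
        })
    return records


def read_message(record):
    """Fetch the raw message back using only path, offset, and length."""
    with open(record["path"], "rb") as f:
        f.seek(record["offset"])
        return f.read(record["length"])
```

Each record is what would be handed to the indexer. The message text itself never leaves the flat file, so appending new mail only means indexing the new tail of the file.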
ODBC access comes for free (well, not literally, but I mean one wouldn't
have to write it, nor Sybase, nor Oracle) as a gateway, if that's where you
care to store your data. However, I'd tend to agree with Dan that "real"
relational queries might be overkill... until you do something like let
people rate messages, which creates a many-to-many relation when people
create agents that do things such as searching for messages about HTML that
Dan and Earl didn't rate as "boring" or "FAQ" -- for example.
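A minimal sketch of that ratings case, using SQLite purely for illustration. The table names, columns, and sample rows are all hypothetical, and "about HTML" is reduced to a subject-line match. It reads the query as "messages that neither Dan nor Earl tagged with either label".

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical messages and ratings."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE messages (id INTEGER PRIMARY KEY, subject TEXT);
        -- Many-to-many: any rater can rate any message, with any label.
        CREATE TABLE ratings (
            rater TEXT,
            message_id INTEGER REFERENCES messages(id),
            label TEXT,
            PRIMARY KEY (rater, message_id, label)
        );
    """)
    conn.executemany("INSERT INTO messages VALUES (?, ?)", [
        (1, "HTML tables proposal"),
        (2, "HTML FAQ: forms"),
        (3, "SGML catalogs"),
        (4, "HTML 3.0 style sheets"),
    ])
    conn.executemany("INSERT INTO ratings VALUES (?, ?, ?)", [
        ("Dan", 2, "FAQ"),
        ("Earl", 4, "boring"),
        ("Terry", 1, "boring"),
        ("Dan", 1, "useful"),
    ])
    return conn


def unflagged_messages(conn, topic, raters, labels):
    """Messages matching topic that none of raters tagged with any of labels."""
    rater_marks = ",".join("?" for _ in raters)
    label_marks = ",".join("?" for _ in labels)
    sql = f"""
        SELECT m.id, m.subject
        FROM messages AS m
        WHERE m.subject LIKE ?
          AND NOT EXISTS (
              SELECT 1 FROM ratings AS r
              WHERE r.message_id = m.id
                AND r.rater IN ({rater_marks})
                AND r.label IN ({label_marks})
          )
        ORDER BY m.id
    """
    params = [f"%{topic}%", *raters, *labels]
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(unflagged_messages(conn, "HTML", ["Dan", "Earl"], ["boring", "FAQ"]))
```

The NOT EXISTS subquery is where a flat file plus grep stops being enough: the answer depends on rows in a second table keyed by both rater and message.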
And to be fair, I should add that most every serious commercial search
engine has similar capabilities. But one interesting item is that since
Netscape and a few other Internet vendors have chosen us, there's some
greater possible synergy.
If anyone is interested in an off-line discussion of related business
opportunities, e-mail me. It's something I've been thinking about for a