
Re: [Nmh-workers] Improving reading mime email.

2005-05-19 21:11:52
invoked when messages are inc'd, rmm'd, and refiled.  Part of my project,
grokmail, builds a real database from your mail messages.  So you can do

By 'real database' do you mean a Berkeley/MySQL-type DB?  Is this sort
of like supporting IMAP?

Actually, it's in Sleepycat, i.e., Berkeley DB.  It's a complicated setup.
Don't know anything about IMAP.

Of course, the real magic of grokmail is that you can train it by
ranking messages on a scale of 1-10 and then scan for interesting
messages.

Like GNUS for Emacs?  That would be really cool.  Bayesian filtering of
messages would be a funky feature, and not that hard to implement, from
my PoV, e.g.:
      * Nathan currently has 20 messages in his inbox.
      * He reads the one from "My Boss" first.
      * He then reads the one from "Brother in USA"
      * He then deletes (without reading) the three with "Rx" in the
        title (that somehow escaped the spam filter).
      * &tc.

=> Emails from "My Boss" or "Brother in USA" should be highlighted /
upgraded, whereas emails with "Rx" in the title should be downgraded.  All we'd
need to do is build in some extra smarts in scan/show/rmm that
monitored how we manage new mail against the mail corpus that is
there.  Messages could then be tagged (annotated?) according to the
learning, for use/manipulation by mh/Unix tools.
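The idea above can be sketched roughly as follows.  This is only an
illustration of learning from read/delete actions with a naive-Bayes-style
log-odds score; it is not code from nmh or grokmail, and the class name
ActionScorer, its methods, and the "from:"/"subj:" feature scheme are all
made up for the example.

```python
import math
from collections import Counter


class ActionScorer:
    """Toy scorer that learns from 'read' vs. 'deleted' actions (hypothetical)."""

    def __init__(self):
        # Per-action feature counts and per-action message totals.
        self.counts = {"read": Counter(), "deleted": Counter()}
        self.totals = {"read": 0, "deleted": 0}

    @staticmethod
    def features(sender, subject):
        # One feature for the sender plus one per distinct subject word.
        words = {w.lower() for w in subject.split()}
        return ["from:" + sender.lower()] + ["subj:" + w for w in sorted(words)]

    def observe(self, action, sender, subject):
        # Record that a message was either "read" or "deleted".
        for f in self.features(sender, subject):
            self.counts[action][f] += 1
        self.totals[action] += 1

    def score(self, sender, subject):
        # Log-odds of "read" vs. "deleted" with add-one smoothing.
        # Positive means "looks interesting", negative means "looks like junk".
        s = math.log((self.totals["read"] + 1) / (self.totals["deleted"] + 1))
        for f in self.features(sender, subject):
            p_read = (self.counts["read"][f] + 1) / (self.totals["read"] + 2)
            p_del = (self.counts["deleted"][f] + 1) / (self.totals["deleted"] + 2)
            s += math.log(p_read / p_del)
        return s


scorer = ActionScorer()
scorer.observe("read", "My Boss", "Status report")
scorer.observe("read", "Brother in USA", "Hello")
for _ in range(3):
    scorer.observe("deleted", "Pharmacy", "Rx discount")

boss_score = scorer.score("My Boss", "Meeting")      # comes out positive
rx_score = scorer.score("Someone", "Rx deals")       # comes out negative
```

The resulting score could then be written back as an annotation on the
message, which is the part that would make it usable from other mh/Unix
tools.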

re,
N

Don't know anything about GNUS for Emacs.  But yes, it's a filtering
mechanism.  It's actually much harder to implement than you might
think.  Most Bayesian systems rely on occasional training when things
start going bad.  grokmail trains all of the time.  My client has close
to a half million messages in his mail folders, so it is non-trivial
to do this with reasonable performance.
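To make the "trains all of the time" point concrete, here is a minimal
sketch of incremental training, where every ranking updates per-token
statistics immediately so there is never a separate retraining pass.  It
is not grokmail's algorithm (which is not described here) and it is not
Bayesian; it just keeps a running average of the 1-10 ranks per token.
The name IncrementalRanker, its methods, and the default score of 5.5 are
assumptions for the example, and a plain dict stands in for whatever
on-disk store a real system would use.

```python
from collections import defaultdict


class IncrementalRanker:
    """Keeps a running average rank per token (hypothetical illustration)."""

    def __init__(self):
        # token -> [sum of ranks, number of ranked messages containing it]
        self.stats = defaultdict(lambda: [0, 0])

    def rank(self, text, rank):
        # Update statistics right away; cost is proportional to the
        # number of distinct tokens in this one message.
        if not 1 <= rank <= 10:
            raise ValueError("rank must be between 1 and 10")
        for tok in set(text.lower().split()):
            entry = self.stats[tok]
            entry[0] += rank
            entry[1] += 1

    def predict(self, text, default=5.5):
        # Average the per-token mean ranks of the tokens seen before.
        tokens = set(text.lower().split())
        known = [self.stats[t] for t in tokens if t in self.stats]
        if not known:
            return default
        return sum(total / n for total, n in known) / len(known)


ranker = IncrementalRanker()
ranker.rank("nmh release notes", 9)
ranker.rank("cheap rx pills", 1)

nmh_guess = ranker.predict("nmh patch")          # only "nmh" is known
mixed_guess = ranker.predict("cheap nmh")        # one high and one low token
unseen_guess = ranker.predict("hello world")     # falls back to the default
```

Each update touches only the tokens of the message being ranked, which is
the property that matters once the corpus reaches hundreds of thousands
of messages.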

I don't particularly agree with your usage model where reading a message
indicates that it's interesting.  Not a valid model from what I've looked
at.

One of the reasons that I did the first implementation of this for nmh is
that, since it isn't a monolithic mail system, it is easy to add new commands
to do new things.  The main commands are grank for ranking, and then gpick
and gscan which are analogous to pick and scan.

Jon


_______________________________________________
Nmh-workers mailing list
Nmh-workers(_at_)nongnu(_dot_)org
http://lists.nongnu.org/mailman/listinfo/nmh-workers
