procmail

Re: Dealing with duplicate messages

1997-07-12 19:34:00
        process@qz.little-neck.ny.us (Eli the Bearded) wrote on
        Sat, 12 Jul 97 20:16 EDT in
        Message-ID: <10031%9707122017@qz.little-neck.ny.us>

> (I want a very large cache because I get maybe 3500 messages a week
> passing through this rc file and I want to cache at least a full week.)

I whipped this up rather hastily:

# the dir where all my procmail stuff goes
PROCDIR=${HOME}/.procmail

# procmail pipes the message into a backquoted command,
# so this grabs the value of the Message-ID: header
ID=`formail -xMessage-ID:`

# if the message ID is already in the database, file the message in
# the mailbox 'Dups'
# (caveat: if a message has no Message-ID at all, ${ID} is empty, and
# fgrep's empty pattern matches every line, so such messages always
# land in Dups)
:0W: .mesg.lock
* ? fgrep -s "${ID}" $PROCDIR/messageids.txt
Dups

# otherwise (E), append the ID to the database and continue
# processing (c); 'h' hands only the header to the pipe, 'i' ignores
# write errors, and 'W' waits for the append to finish
:0EWhci: .mesg.lock
|/bin/echo "${ID}" >> $PROCDIR/messageids.txt


Oh.... and then we have to trim it.... forgot about that.... How about this:

In Sunday night's crontab, put this:

tail -3500 /path/to/messageids.txt >  /path/to/messageids.txt.new && \
mv -f /path/to/messageids.txt.new /path/to/messageids.txt

which isn't the best solution (some message may arrive while the trim is
being done).  The workaround for this would be to use a lockfile, I
guess, but I don't know that much about it...
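For what it's worth, here is one way the lockfile workaround might look,
as a sketch.  It uses lockfile(1), which ships with procmail, to grab
the same lockfile the recipes above use before rewriting the database.
The function name, the $PROCDIR-based paths, and the assumption that the
relative `.mesg.lock` in the recipes resolves to this same file (procmail
resolves relative lockfile paths against MAILDIR) are all mine, so adjust
to taste:

```shell
#!/bin/sh
# trim_messageids: keep only the newest 3500 Message-IDs, holding the
# procmail lockfile while the database is rewritten so a delivery
# can't append to it mid-trim.  Call this from Sunday night's crontab.

trim_messageids() {
    PROCDIR=${PROCDIR:-$HOME/.procmail}
    DB=$PROCDIR/messageids.txt
    # assumption: this is the same file the .procmailrc recipes lock
    LOCK=$PROCDIR/.mesg.lock

    # lockfile(1) comes with procmail; -r 10 retries ten times before
    # giving up, so a stuck lock won't hang the cron job forever
    lockfile -r 10 "$LOCK" || return 1

    # write the trimmed copy first, then move it into place
    tail -3500 "$DB" > "$DB.new" && mv -f "$DB.new" "$DB"
    status=$?

    rm -f "$LOCK"
    return $status
}
```

Since procmail's local lockfiles are ordinary files, lockfile(1) and the
recipes should serialize against each other as long as both point at the
same path.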

It's a hasty solution, as I said, but it seemed to work with the 3 messages  
I tried it with ;-)

TjL

ps -- Hebrew started on 2 July and ends on 22 August, so
please understand if responses are slow.  I am taking a 2-semester
class in 8 weeks for 6 credits.

--
TjL <luomat@peak.org>