procmail

Re: Dealing with duplicate messages

1997-07-13 04:30:00
On Sat, 12 Jul 97 20:16 EDT, process(_at_)qz(_dot_)little-neck(_dot_)ny(_dot_)us
(Eli the Bearded) wrote:
# Deal with duplicates, as determined by Message-ID, but not if they
# have been looped through for multiple passes.
:0 Wh: msgid.lock
* ! ^X-loop:.*qz.little-neck.ny.us
| formail -D 32768 msgid.cache
     # Put dupes in a dupe file
     :0e:
     duped-mail

How's this instead:

    :0W:msgid.lock
    * ! ^X-Loop:.*qz\.little-neck\.ny\.us
    * ? formail -D 32768 msgid.cache
    duped-mail

This is simpler than Timothy's suggestion because we can use the cache
checking built into formail, obviating the need for a separate cleanup
script and so forth.

Also, has anyone written a program that works just like 'formail -D'
for arbitrary fingerprints extracted from text? I think I could get
good spam reduction with a fingerprint composed of From: and 

I've been thinking the same thing for a long time -- this doesn't
belong in formail, and it would benefit a lot from being broken out
into its own program and generalized. In the meantime, I've been
toying with the idea of constructing a phony Message-Id from the
information I do want to check against:

    :0
    * ^From:[   ]*\/[^  ].*
    { FROM=$MATCH }

    :0
    * ^Subject:[        ]*\/[^   ].*
    { SUBJ=$MATCH }

    :0:Dupe.lock
    * ? echo "Message-ID: <${SUBJ}_${FROM}>" | formail -D 8192 Dupe.cache
    { formail -rt -I"Subject: You wretched spammer you" | ... }

... but this is too warped :-) [and no, I haven't actually tested it]
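
For what it's worth, here is a rough sketch of what a broken-out,
generalized checker might look like in plain sh. Everything in it is an
assumption made for illustration: the function name seen_before, the
one-fingerprint-per-line cache file, and the cap expressed in lines
rather than bytes. It does not read or write formail's own cache format,
and it does no locking of its own, so it would still rely on a procmail
local lockfile around it.

```shell
# seen_before CACHE [MAXLINES]
# Hypothetical helper: reads one fingerprint (the first line of stdin).
# Returns 0 if the fingerprint is already in CACHE (a duplicate),
# 1 if it is new (it is then appended and CACHE is trimmed to the
# newest MAXLINES entries), and 2 if the fingerprint is empty.
seen_before() {
    cache=$1
    maxlines=${2:-1000}

    # Take the first input line verbatim as the fingerprint.
    IFS= read -r fp || true
    [ -n "$fp" ] || return 2

    # Make sure the cache file exists before grep looks at it.
    [ -f "$cache" ] || : > "$cache"

    # -F: fixed string, -x: match the whole line, -q: no output.
    if grep -Fxq -e "$fp" "$cache"; then
        return 0
    fi

    # New fingerprint: remember it, then drop the oldest entries.
    printf '%s\n' "$fp" >> "$cache"
    tail -n "$maxlines" "$cache" > "$cache.tmp" && mv "$cache.tmp" "$cache"
    return 1
}
```

The exit status is meant to mimic the way 'formail -D' is used above
(success means "seen before"), so with SUBJ and FROM set as in the
recipes, something like

    echo "${SUBJ}_${FROM}" | seen_before Dupe.cache 500 && echo duplicate

would print "duplicate" only from the second sighting onwards.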

/* era */

-- 
Defin-i-t-e-ly. Sep-a-r-a-te. Gram-m-a-r.  <http://www.iki.fi/~era/>
 * Enjoy receiving spam? Register at <http://www.iki.fi/~era/spam.html>