[Mailed and mailed.]
On Thu, 13 Jun 1996 17:11:07 -0500 (CDT),
AIRWAVES MEDIA <rrb(_at_)clm(_dot_)aiss(_dot_)uiuc(_dot_)edu> wrote:
} To be honest - I doubt killing on specific subjects will be that
} effective. These change with every mailing.
There are strings that I want to use, not necessarily complete
subjects. Strings like two dollar signs in a row, the words 'USA
Magazines' and others.
Here's what I use:
SPAM="!!!+|\$\$+|(,000)+|magazine| ... etcetera, make your own ;-)
^From:.*@[^ ]+\.com[ ]
That's the basic idea. I have different variations for different
situations, but you should get the picture. The variable SPAM is set
to a (largish; few dozen) set of words I hunt for. A single match is
not enough; if three of the words match, the message is fried. Then,
as you can see, I have lowered the threshold for some sites.
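
For readers who want the counting logic spelled out, here is a minimal
sketch in Python rather than Procmail. It is not the recipe from this
post: the pattern list holds only the four strings visible in SPAM
above, the lowered threshold of 2 is an assumed value, and the names
count_hits and is_spam are made up for illustration.

```python
import re

# Only the patterns visible in the post; the real list has a few dozen entries.
SPAM_PATTERNS = [r"!!!+", r"\$\$+", r"(?:,000)+", r"magazine"]

DEFAULT_THRESHOLD = 3  # "if three of the words match"
LOWERED_THRESHOLD = 2  # assumption: the post does not say how far it is lowered


def count_hits(text):
    """Count how many distinct patterns occur at least once in text."""
    return sum(1 for p in SPAM_PATTERNS if re.search(p, text, re.IGNORECASE))


def is_spam(sender, text):
    """Apply the hit-count threshold, lowered for senders in .com domains."""
    suspect = re.search(r"@[^ ]+\.com\b", sender) is not None
    threshold = LOWERED_THRESHOLD if suspect else DEFAULT_THRESHOLD
    return count_hits(text) >= threshold
```

Each pattern counts at most once here, so repeating a single word many
times does not push a message over the threshold on its own.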
This could be made a lot more straightforward with scoring (man
procmailsc) but I have yet to see an implementation. I have asked on
this list if somebody was using scoring to hunt for spams but no
replies so far.
My site is still using an ancient version of Procmail and I haven't
felt the urge (and/or had the time) to set up my personal upgrade, so
I don't even currently have a Procmail with scoring. The above
emulates scoring, in a way, but it's very limited, of course (and a
bit of a drag on the maintenance side ...)
For the record, I've had +very+ few matches on these, but then, I've
largely been spared from spams lately (knock on wood. Maybe the
spammers have started excluding non-American domains in a rush to, er,
get more focused :-).
Also, as Wotan and others have commented, it's extremely hard to
think up a scheme that works for all possible cases (with few or no
mismatches). The following have still slipped through:
Subject: Unusual Promotion for 1st Time Users
Subject: $800,000.00 Sweepstakes!
Subject: May I please have your permission?
Subject: 1000 Shares
The first one contains enough suspect words for this scheme to work,
in principle. For the "sweepstakes" case, I suppose a special rule for
^Subject: ($SPAM) ($SPAM)[!?.]?$ could do, or else you could just examine
the content a little bit for a high frequency of slime words. But then
you'd definitely want scoring.
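
To make the "frequency of slime words" idea concrete, here is a rough
sketch of weighted scoring, written in Python and not in Procmail. The
words, the weights, and the cutoff of 4.0 are all hypothetical, as are
the function names; none of them come from a real rule set.

```python
import re

# Hypothetical weights per pattern; tune these against your own mail.
WEIGHTS = {
    r"sweepstakes": 2.0,
    r"\$\$+": 1.5,
    r"!!!+": 1.0,
    r"magazine": 1.0,
}


def spam_score(body):
    """Sum weight * number of occurrences for every pattern in the body."""
    return sum(
        weight * len(re.findall(pattern, body, re.IGNORECASE))
        for pattern, weight in WEIGHTS.items()
    )


def looks_like_spam(body, cutoff=4.0):
    """Flag the message when its total score reaches the (assumed) cutoff."""
    return spam_score(body) >= cutoff
```

Unlike a plain match count, repeated occurrences add up here, so a body
that keeps repeating one heavily weighted word can cross the cutoff.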
Hope this helps,
/* era */
See <http://www.ling.helsinki.fi/~reriksso/> for mantra, disclaimer, etc.