procmail

Reverse engineering spamassassin rules

2004-03-17 13:29:40
I started using procmail for filtering a couple of years ago, but have recently supplemented it with bayes (bogofilter) and spamassassin, trying to find a good middle ground. I like spamassassin for its blend of capabilities, but there's no denying it imposes performance overhead.

There are a handful of spamassassin rules that hit significant numbers of incoming spams. Converting some of these to procmail recipes would allow a "coarse screen" to be put in place with procmail, avoiding the need to process obvious spam through other tools altogether (part of my beloved layered defenses).

Here is an example rule that detects many of the random-word, bayes-poison spams:

body PT_WORDLIST_30 /(?:\b(?!(?:from|that|have|this|were|with)\b)[a-z]{4,12}\s+){30}/
describe PT_WORDLIST_30 string of 30+ random words
score  PT_WORDLIST_30   10.0
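
To see what this rule actually fires on, here is a minimal Python sketch that copies the same pattern into Python's re module, which accepts this Perl-style syntax. The function name and both sample strings are made up for illustration, and spamassassin's own preprocessing of the message body is not reproduced here.

```python
import re

# Same pattern as PT_WORDLIST_30: 30 consecutive lowercase words of
# 4 to 12 letters, none of which is one of the six excluded words.
WORDLIST_30 = re.compile(
    r"(?:\b(?!(?:from|that|have|this|were|with)\b)[a-z]{4,12}\s+){30}"
)

def hits_wordlist_30(body: str) -> bool:
    """Return True if the body contains a run of 30 qualifying words."""
    return WORDLIST_30.search(body) is not None

# Hypothetical samples: ten unrelated words repeated three times,
# and a few sentences of ordinary prose.
random_words = "apple brick cloud dream eagle flame grape house ivory joker " * 3
normal_prose = "I have been reading the list for a while, and this is my first post. " * 3

print(hits_wordlist_30(random_words))
print(hits_wordlist_30(normal_prose))
```

Ordinary prose escapes because short words ("the", "for", "a"), capitals, and punctuation keep interrupting the run long before it reaches 30.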

I know the regexp uses perl syntax, and that not all of its features (such as the {} repetition counts) are available in procmail. But from reading through the procmail howtos, I believe a scoring recipe that weighs the ratio of articles and prepositions (and punctuation) to "other" words could achieve much the same result.
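
To make the ratio idea concrete before trying to express it in procmail, here is a rough Python sketch of the heuristic. The word list, the 30-word minimum, and the 0.05 threshold are all placeholder assumptions rather than tested values, and punctuation is ignored for simplicity.

```python
import re

# Hypothetical set of articles, prepositions, and other common glue words.
FUNCTION_WORDS = {
    "a", "an", "the", "of", "to", "in", "on", "at", "by", "for",
    "with", "from", "and", "or", "but", "is", "are", "was", "that", "this",
}

def function_word_ratio(body: str) -> float:
    """Fraction of words in the body that are function words."""
    words = re.findall(r"[a-z]+", body.lower())
    if not words:
        return 0.0
    common = sum(1 for w in words if w in FUNCTION_WORDS)
    return common / len(words)

def looks_like_word_salad(body: str, min_words: int = 30,
                          threshold: float = 0.05) -> bool:
    """Flag bodies that are long enough yet nearly free of function words."""
    word_count = len(re.findall(r"[a-z]+", body.lower()))
    if word_count < min_words:
        return False  # too short to judge
    return function_word_ratio(body) < threshold

# Hypothetical samples for illustration.
salad = "apple brick cloud dream eagle flame grape house ivory joker " * 3
prose = "The cat sat on the mat and looked at the dog in the yard for a while. " * 3

print(looks_like_word_salad(salad))
print(looks_like_word_salad(prose))
```

The same shape should map onto a procmail scoring recipe: a small positive weight for every word, a larger negative weight for every function word, and a match only when the total stays above zero.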

Before I meander too far down this path though, I wanted to see if there are any good collections of such recipes already available. I've seen some basic rules, but many of the trickier spams seem to get past those. I'm out to match characteristics rather than specific phrases.

Any thoughts appreciated.

- Bob




_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
