procmail

Re: Complex filtering

2006-02-08 12:47:14
At 11:27 2006-02-08 -0500, Matee Moshkovits wrote:
> hi gurus of all that is procmail
> I need to do some complex filtering and don't know how. here is the suto
> code

I suspect you mean pseudo-code.

> ... and I need to know how to do it in procmail
> spamkiller="a long list of email addresses that spammers target on my system"
> if TO = spamkiller then

:0:
* ^TO_(userportionaddr|userportionaddr2|userportionaddr3|etc)@yourdomain\.tld
* 9876543210^0 B ?? online pharmacy
* 9876543210^0 B ?? grow a bigger schwang
* 9876543210^0 B ?? please your dog
$HOME/mail/mail/SPAM

The first condition matches on the recipient address, and must match for 
the rest of the recipe to continue processing.  The remaining conditions 
are SCORED (see 'man procmailsc'), each with a weight large enough to 
saturate procmail's maximum score, so that as soon as ANY of them matches, 
the message is considered to have met the necessary criteria and proceeds 
to the action (delivery to the mailbox file).
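
For a short, fixed phrase list, the same "recipient AND any one phrase" 
logic can also be sketched without scoring, by folding the phrases into a 
single alternation.  The addresses, phrases, and mailbox path are the 
placeholders from the recipe above, not real values.

```procmail
# Sketch only: the recipient condition must match, and the body must
# contain at least one of the listed phrases.
:0:
* ^TO_(userportionaddr|userportionaddr2|userportionaddr3)@yourdomain\.tld
* B ?? (online pharmacy|grow a bigger schwang|please your dog)
$HOME/mail/mail/SPAM
```

Scoring becomes the better fit once the phrases need different weights or 
the list grows too long for one readable regex.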

If you're not already fairly conversant in procmail, you're likely to find 
this approach to be less than effective - either a LOT of spam will still 
get past the filter, or a lot of legitimate messages will be filed as spam.

A LOT of spam these days relies on "leet-speak", intentional misspelling, 
and encoding of text (partial, ordinal, escaping, etc.), so while you may 
see a certain text in a spam, it doesn't mean that a simple keyword match 
for it will match that message.
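
As a rough illustration of why a literal keyword falls short, the condition 
below loosens "pharmacy" to tolerate a few character substitutions.  The 
pattern and the log text are made up for this sketch; it logs rather than 
delivers, and it still sees only the raw body, so text sent base64-encoded 
or with quoted-printable escapes slips past it.

```procmail
# NL holds a literal newline so the LOG entry ends cleanly.
NL="
"

# Hypothetical pattern: matches pharmacy, farmacy, ph4rm@cy, pharmacies, ...
:0
* B ?? (ph|f)[a@4]rm[a@4]c(y|ies)
{ LOG="obfuscated 'pharmacy' variant in body${NL}" }
```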

A number of procmail users here (myself among them) maintain that if you're 
going to use procmail for spam filtering, you'll have much better 
effectiveness by matching for common characteristics in the headers (say, a 
non-local user submitting a message to your server when there is only ONE 
Received: header, the one YOUR host inserted).

# single Received: header.  You could discount messages relayed through
# your hosting ISP (if any), since those would be messages which "originated"
# AT your MX level, not at a list or user prior.
:0
* ! ^X-Envelope-From:.*@(host\.)?domain\.tld
# start at 2 and subtract 1 per Received: header, so the score stays
# positive only when there is at most one such header
* 2^0
* -1^1 ^Received:
{
         SPAMVAL="+125"
         SPAMMISHNESS="${SPAMMISHNESS}${SPAMVAL}"
         SPAMNOTES="${SPAMNOTES}SPAM: ${SPAMVAL} Single received header for foreign sender${NL}"
}

I use a "spammish" system whereby various characteristics are assigned a 
certain value, and after checking all the characteristics, if the message 
exceeds a threshold, it is filed away.  Thus, I don't have to rely on any 
one characteristic, and no characteristic needs to be an absolute 
true/false indicator of spam.  BTW, matching on an excessive number of 
indicators ITSELF has proven to be an indicator of spam.
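
The final threshold check is not shown in this message.  One way it could 
look, assuming SPAMMISHNESS accumulates a string such as "+125+40" as in 
the recipe above, that bc is available on the host, and that a cutoff of 
100 is wanted (SPAMTOTAL and SPAMTHRESHOLD are hypothetical names):

```procmail
# Hypothetical cutoff; tune to taste.
SPAMTHRESHOLD=100

# Sum the accumulated "+125+40..." string.  The leading 0 keeps the
# expression valid when no characteristic has added anything yet.
SPAMTOTAL=`echo "0${SPAMMISHNESS}" | bc`

# Score = total - threshold.  A recipe made of weighted conditions only
# succeeds when the final score is positive, i.e. total > threshold.
:0:
* $ ${SPAMTOTAL:-0}^0
* $ -${SPAMTHRESHOLD}^0
$HOME/mail/mail/SPAM
```

Keeping the arithmetic in one place means individual characteristic 
recipes only ever append to SPAMMISHNESS and never decide delivery on 
their own.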

> some more else if statements.

No real need for "else if" in the context of what you're trying to do - 
inherently, anything that doesn't match the first "if" will reach the 
second "if" and be handled there, since anything which matches is being 
filed away by a delivering recipe (and thus the recipe processing stops 
there).
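
A minimal sketch of that fall-through, with made-up addresses and mailbox 
names: each delivering recipe acts as an "if", and the next one is only 
reached by messages the previous one did not file.

```procmail
# "if": matching messages are delivered here and processing ends.
:0:
* ^TO_spamtrap@yourdomain\.tld
$HOME/mail/mail/SPAM

# Implicit "else if": only messages not delivered above get this far.
:0:
* ^Subject:.*\[procmail\]
$HOME/mail/procmail-list

# Everything else falls through to the default mailbox.
```

Procmail also has an explicit E flag (":0 E") that runs a recipe only when 
the immediately preceding recipe did not execute, which matters mainly for 
non-delivering recipes.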

---
  Sean B. Straw / Professional Software Engineering

  Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
  Please DO NOT carbon me on list replies.  I'll get my copy from the list.


