procmail

Re: Spammish?

2003-02-15 12:36:46
At 13:41 2003-02-15 -0500, fleet(_at_)teachout(_dot_)org did say:
In the recent past it has been suggested to me that I could allow messages
to "bubble" down

Trickle down.  Bubbles tend to go UP.

One can take definite ("positive") matches and toss messages as soon as those are encountered, while the less definite "indicators" are just collected up. Using a running "score" is a good way to do the collecting.

understand, from context, what "spammish" means; but I'm having a problem
with the concept that a message can *be* "spammish."  In my simplistic
world, a message is either spam or it isn't.

Lots of references to "free" might be spammish. Or lots of $ (though on some programming lists, this isn't a good criterion). You could have legitimate mail containing either of these, but generally, they are used excessively in spam. Messages with cleartext addressees that don't include you, when it's not a list you subscribe to (and would presumably be filtering for already), could be spammish. So could messages that are not From: your domain, but whose Message-ID contains your domain.
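
As a rough sketch of the "lots of free, lots of $" idea, procmail's weighted conditions can count occurrences and feed them into a running total. The weights (5 and 2) are made-up illustrations, and TOTALSCORE is assumed to be the variable that the final decision recipe tests. Note that a bare "free" also matches inside words such as "freedom".

```procmail
# Hypothetical weights: 5 points per "free", 2 points per "$" in the body.
# The first condition carries the existing total forward (0 if unset).
:0
* $ ${TOTALSCORE:-0}^0
* 5^1 B ?? free
* 2^1 B ?? \$
{ TOTALSCORE = $= }
```

The ^1 exponent makes every occurrence count with the same weight, and $= holds the score of the recipe that just matched.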

More spammish things:

Messages with only one Received: header. Your server puts one there, and any user or mailing list should have one from their server receiving it from them as well, or at least a record of it originating on their server.
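
One possible way to test for this, sketched with weighted scoring: start at 2 and subtract 1 for every Received: header, so the outer recipe only matches when there are fewer than two. The 60 points and the TOTALSCORE variable name are assumptions for illustration.

```procmail
# Outer score is 2 minus the number of Received: headers,
# so it is positive only when there are zero or one of them.
:0
* 2^0
* -1^1 ^Received:
{
        # Hypothetical weight: add 60 to the running total.
        :0
        * $ ${TOTALSCORE:-0}^0
        * 60^0
        { TOTALSCORE = $= }
}
```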

Messages passing through your backup MX (when you know that it is virtually NEVER involved with email). By itself, this is a bad reason to dump mail, because at some point your primary MX will be unreachable due to a network outage, and your backup MX will be used. However, a fair amount of spam is injected through backup MXes because the spammers know that most backup MXes don't use DNSBLs (otherwise, they'd be rejecting mail for you based on criteria which you might not agree with).

I believe the process would involve (using formail) the addition of a flag
so the message header at the end of the "bubble" session would include
something like:

X-SPAM: Failed rule 2
X-SPAM: Failed rule 3
X-SPAM: Failed rule 7

Yes.  Or all in one header:

X-SPAM: rule 2; rule 3; rule 7;

(I believe dman (and others) assign a "score" to the message; but to me
the concept appears to be pretty much the same.)

A single score is far easier to use when making a decision to toss a message:

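# If our total score is > 150, it's spam: the two weighted conditions sum to
# TOTALSCORE - 150, and the recipe only delivers when that sum is positive.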
:0:
* -150^0
* $ $TOTALSCORE^0
spam.mbx

Different rules may contribute different score values to the end result - use of your backup MX might be less of a spammish indicator than a single Received: header, or a bogus format Message-ID.
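
A sketch of what differing weights might look like. The host name backup-mx.example.com, the domain example.com, and the values 20 and 80 are all placeholders, not recommendations; substitute your own.

```procmail
# Weak indicator: relayed through the backup MX (hypothetical host name).
:0
* $ ${TOTALSCORE:-0}^0
* 20^0 ^Received:.*by backup-mx\.example\.com
{ TOTALSCORE = $= }

# Stronger indicator: not From: our domain, yet the Message-ID claims it.
:0
* ! ^From:.*@example\.com
* ^Message-ID:.*@example\.com
{
        :0
        * $ ${TOTALSCORE:-0}^0
        * 80^0
        { TOTALSCORE = $= }
}
```

With these numbers, neither indicator alone reaches the 150 threshold, so each only matters in combination with others.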

Keeping a cumulative score (or even a tentative X-SPAM-SCORE: header with reason codes) in a variable is a LOT less costly than invoking formail to modify the message header over and over.

:0
* some spammish test
{
        XSPAMSCORE="${XSPAMSCORE} spamtest 23;"
}
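
With the reasons collected in a variable, formail only has to run once, after the last spam check. A minimal sketch, assuming the XSPAMSCORE variable from the recipe above and the X-SPAM: header name used earlier in this thread:

```procmail
# Run after all spam checks; skipped when no test added a reason code.
# XSPAMSCORE already starts with a space, so none is added after the colon.
:0 fhw
* XSPAMSCORE ?? [a-z]
| formail -I "X-SPAM:$XSPAMSCORE"
```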

Also, you can have _positive_ match spam conditions just set an insanely high score instead of depositing the message in your spam folder right away, and still run through all of your spam checks, so that you can see what the other recipes would have categorized it as. That can be useful later for determining whether certain characteristics might actually be more consistent spam indicators than first thought.
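
For instance (the Subject pattern, the 100000 figure, and the reason text are invented for illustration; TOTALSCORE and XSPAMSCORE are the variables used in the recipes above):

```procmail
# A definite match adds a huge score and a reason code, but delivers nothing,
# so every later spam check still gets to run and record its own reason.
:0
* ^Subject:.*some phrase you only ever see in spam
{
        XSPAMSCORE="${XSPAMSCORE} definite-subject 100000;"

        :0
        * $ ${TOTALSCORE:-0}^0
        * 100000^0
        { TOTALSCORE = $= }
}
```

The single threshold recipe at the end still makes the actual delivery decision.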

There are a LOT of ways to do this.

Is this process designed to "tweak" the filtering to reduce the
possibilities of false positives?  Is this the only purpose?

Yes - one or two characteristics may be mere coincidence, but as the number of coincidences increases, the likelihood that it is spam does as well.

In the six months (admittedly a short time) I've been using procmail, the
only false positives I've received have been the result of a poorly
written recipe - ie, my fault.

During that time, how much spam has sneaked past your filters? It is those spams which "spammish" cumulative scoring helps to eliminate from your inbox.

---
 Sean B. Straw / Professional Software Engineering

 Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.


_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
