procmail

Re: Spammish?

2003-02-16 12:45:28
At 11:16 2003-02-16 -0500, fleet(_at_)teachout(_dot_)org wrote:

Ok, but if one is running all the messages through all the recipes; does
it make any difference?

Again, look back at my example of various _individual_ things which aren't a guarantee of wrongness, but when something has several of the attributes, trouble is brewing.

I can see where it might be helpful if one is going to say "If you're over 5, you're spam, goodbye."

Basically, that's it - but you have tuning control over how much any given recipe contributes and what your final threshold is. And, most importantly, all the conditions aren't in one big mondo recipe - they remain as separate recipes.

Let us say you currently have the following recipes, and they work:

:0
* some condition
{
        LOG="SPAM Advisory: non-rfc 2822 messageid$NL"
}

:0
* some condition
{
        LOG="SPAM: porno keyword match$NL"
        :0:
        spam.mbx
}

:0
* some condition
{
        LOG="SPAM: spam mailer agent$NL"

        :0:
        spam.mbx
}

There's nothing inherently wrong with ditching your spam as soon as you match some positive characteristic (say, porno stuff), but you'll never know if the mailer agent would have matched it as well. Likewise, if you moved that advisory recipe down to the bottom, a message already delivered as spam by either of the other two recipes would never reach it, so you wouldn't know whether the advisory would have flagged it too.

What if, instead of ditching the messages immediately, you waited until all the tests had been run:

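# Each hit APPENDS its weight to SPAMMISHNESS, so the variable grows into
# a string like "+5+10000" which is summed after all tests have run.

:0
* some condition
{
        LOG="SPAM Advisory: non-rfc 2822 messageid$NL"
        SPAMMISHNESS="${SPAMMISHNESS}+5"
}

:0
* some condition
{
        LOG="SPAM: porno keyword match$NL"
        SPAMMISHNESS="${SPAMMISHNESS}+10000"
}

:0
* some condition
{
        LOG="SPAM: spam mailer agent$NL"
        SPAMMISHNESS="${SPAMMISHNESS}+10000"
}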

# evaluate the additive score (see another post of mine)
SPAMMISHNESS=`echo "0${SPAMMISHNESS}" | bc`

# if we exceed (say) 250, then we're spam.
:0:
* -250^0
* $ ${SPAMMISHNESS}^0
spam.mbx

Either of your two previous absolute hits will overflow the spam condition, but you see what ALL of the tests evaluated as, and if you've got a dozen different "iffy" conditions, they can individually contribute to the bottom line, if you want, to whatever degree you believe they should.
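
To illustrate the "iffy" case, here is a minimal sketch of two low-weight tests that would sit alongside the recipes above, before the score is evaluated. The conditions are placeholders, the weights of 150 are made up for the example, and it assumes each recipe appends its weight to SPAMMISHNESS and that NL is already defined as it is for the other LOG lines.

```procmail
# Hypothetical low-weight tests. Neither is damning on its own.
:0
* some iffy condition
{
        LOG="SPAM Advisory: iffy test A$NL"
        SPAMMISHNESS="${SPAMMISHNESS}+150"
}

:0
* some other iffy condition
{
        LOG="SPAM Advisory: iffy test B$NL"
        SPAMMISHNESS="${SPAMMISHNESS}+150"
}

# With a threshold of 250:
#   only A hits:   0+150     = 150  -> 150 - 250 is negative, not spam
#   A and B hit:   0+150+150 = 300  -> 300 - 250 is positive, spam
```

Either test alone leaves the message in your inbox with an advisory in the log; the pair together tips it over the threshold.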

Yes.  I was concerned because I sometimes see things in the log that seem
to get out of sync

Those are separate, concurrent procmail processes, which do not see one another.

this brings up another question.  Someone in this discussion mentioned
unsetting a variable by the simple means of including it in the rc without
any parameters.

Yes, in my /etc/procmailrc, because a shell may be necessary and not all users have a shell, I save the original shell and define a shell for the script to use. When exiting, I reassign the shell to the saved value and fully UNSET the "ORGSHELL" variable.
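
Roughly, that save/restore looks like the sketch below. The choice of /bin/sh is an assumption for the example; ORGSHELL is the name I mentioned, and the recipes in between are elided.

```procmail
# Near the top of /etc/procmailrc: remember whatever shell we were handed,
# then switch to one that is known to exist (/bin/sh is an assumption).
ORGSHELL=$SHELL
SHELL=/bin/sh

# ... recipes that may need to run commands go here ...

# On the way out: put the original shell back.
SHELL=$ORGSHELL

# A bare variable name with no "=" removes it from the environment.
ORGSHELL
```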

I retrieve mail "manually" (ie, I invoke fetchmail when I
want to get mail) so procmail terminates after each mail session.

Uh, the variables are not retained between sessions - but during a procmail invocation (for exactly ONE message, or internally 'c'loned copies of one, as of the time the clone was made), the variables are retained.
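
A small sketch of the clone behaviour, in case it helps. SPAMFLAG is your variable; clone.mbx is a made-up mailbox name, and NL is assumed to be defined as in the recipes earlier.

```procmail
SPAMFLAG=yes

# The 'c' flag on a nesting block forks a clone of the running procmail.
# The clone starts with the variables as they are at this point.
:0 c
{
        # This change is visible only inside the clone.
        SPAMFLAG=no

        # Delivering here ends the clone.
        :0:
        clone.mbx
}

# The parent jumps across the block, so SPAMFLAG is still "yes" here.
LOG="SPAMFLAG is ${SPAMFLAG}$NL"
```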

Is it necessary for me to unset my SPAMFLAG = "yes" variable at the beginning of each session?

No. When you start procmail, only the internally defined variables (see the manpages) are defined. Nothing else exists until you make it so.
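
In other words, something like the following works without any reset at the top of the rcfile. This is only a sketch: the condition is a placeholder and spam.mbx is borrowed from the recipes earlier.

```procmail
# No "unset" is needed up here: SPAMFLAG does not exist when procmail starts.

:0
* some condition
{
        SPAMFLAG=yes
}

# Later in the same run: matches only if a recipe above set the flag
# while processing THIS message.
:0:
* SPAMFLAG ?? ^^yes^^
spam.mbx
```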

Intellectually, I understand what you're doing.  My view, I suppose, is
terribly parochial; but *so far* all but a couple of my recipes have
returned absolute results.  I suspect, based on these conversations, that
the *so far* is going to bite me sometime; but in the meantime, the bottom
line is contributing to my hardheadedness.

I don't recall anyone putting it forward that "spammish" was something everyone would want to use. Anyone who already uses scoring should have a grasp on "contributory" factors.

Few things are absolute. It is nice when they are, but what we really have are varying levels of grey, and "spammishness" allows you to easily take into consideration the less than absolutes, rather than ignoring them because they are not absolute.

---
 Sean B. Straw / Professional Software Engineering

 Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.


_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
