At 11:13 2005-06-21 -0500, Damian Menscher wrote:
> Procmail wasn't exactly designed to filter spam. You should, at minimum,
> investigate the scoring features. Or, if you're intelligent, check into
> spamassassin.
Using or not using SA has nothing to do with intelligence. Several
contributors here on the procmail list get along fine without it. I've not
had any spam to my inbox in a week, and that's when I added a couple of
refinements to my filters to deal with some false pozzies on a few lists.
The irony of the regexp being used is that the body keywords will result in
his *OWN* message to this list not arriving back in his inbox.
To the OP:
Scoring would be appropriate if you're going to use simplistic terms -
require some number of them to appear before considering the message
junk. I score on a lot of terms which appear in regular conversation but
which are more frequently used in spam - they're just not scored ultra-high.
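As a rough sketch of that idea (the terms, weights and mailbox name below
are placeholders of mine, not a recommendation), a scored recipe might look
something like this:

    # placeholder terms and weights - tune against your own mail
    :0 B:
    * -100^0
    * 40^0 firstterm
    * 40^0 secondterm
    * 40^0 thirdterm
    * 40^0 fourthterm
    suspects

The recipe only fires if the total ends up above zero, so with these
numbers at least three of the four terms have to show up in the body. The
^0 exponent means a term counts once no matter how often it repeats.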
You should set up a sandbox and place your rules in there then pump a lot
of your email at it (old saved email, from BEFORE you started using the
filters, or grab up mailboxes from your spam account), so you can see just
what will be affected. Refer to the VERBOSE logfile to see what conditions
are matching the messages. If you use the MATCH construct:
    * \/(term|anotherterm|yetanotherterm)
      ^^ this bit here
then the logfile will end up including a variable assignment to MATCH
showing exactly which of multiple regexp components on a single line was
the actual matched term (versus merely stating that the whole condition
matched somehow).
That would allow you to more easily identify what terms are entirely too
broad in your expression.
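For illustration, a bare-bones sandbox rcfile along those lines could look
something like the following. The directory, file names and terms are made
up for the example; adjust to taste.

    # sandbox.rc - hypothetical test rcfile; $HOME/sandbox must already exist
    MAILDIR=$HOME/sandbox
    LOGFILE=$MAILDIR/log
    VERBOSE=on

    # placeholder terms; \/ makes procmail record the matched text in MATCH
    :0 B:
    * \/(term|anotherterm|yetanotherterm)
    caught

    # everything else lands here so nothing leaves the sandbox
    :0:
    passed

Feed it a saved mbox with something like

    formail -s procmail -m $HOME/sandbox/sandbox.rc < oldmail

and then look through $HOME/sandbox/log for the MATCH assignments.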
Perfectly legitimate siglines on some messages will contain toll free
numbers. re*move is a legitimate English word; some of the other terms are
too short and will (as already indicated by other replies) result in
matches in uu/base64 encoded files, and a unit of measurement isn't a wise
choice of singular word terms either.
Your conditions also make it clear that you seem to believe that they're
matched with case-sensitivity. They're not - unless you add the 'D'
flag. So, the bracketed character classes are unnecessary.
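To illustrate with a made-up Subject test (the word is just a placeholder):
under the default case-insensitive matching these two conditions behave the
same,

    * ^Subject:.*[Oo][Ff][Ff][Ee][Rr]
    * ^Subject:.*offer

and only a recipe flagged with D would tell upper case from lower:

    # hypothetical mailbox name
    :0 D:
    * ^Subject:.*OFFER
    shouting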
Your first rule has multiple condition lines, which *ALL* have to match in
order for the message to be caught by that ruleset. Break it out into
separate rules (one for the MessageID, another for the Subject, another for
the From:, etc.), or use scoring - prefix each condition line like so:
* 9876543210^0 condition
That curious number is simply an easy-to-remember "maximal" value -
greater than 2^31 (the limit of a signed 32-bit value) - which says "when
this condition matches, disregard the rest of the scored tests and consider
this message matched" or something to that effect. If you have non-scored
conditions, they'll still have to be evaluated as TRUE for the rule to
succeed. Read 'man procmailsc'.
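A sketch of both approaches, using invented patterns and a made-up mailbox
name, might look like this. Broken out into separate rules:

    # placeholder patterns throughout
    :0:
    * ^Message-ID:.*example\.invalid
    junk

    :0:
    * ^Subject:.*anotherterm
    junk

or as a single scored rule where any one condition is enough:

    :0:
    * 9876543210^0 ^Message-ID:.*example\.invalid
    * 9876543210^0 ^Subject:.*anotherterm
    * 9876543210^0 ^From:.*yetanotherterm
    junk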
See the URL in my sigline for links to the sandbox I publish (which will
also automatically redirect the forwards you do in your recipes).
---
Sean B. Straw / Professional Software Engineering
Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
Please DO NOT carbon me on list replies. I'll get my copy from the list.