procmail

Re: Scoring question

2005-08-24 17:13:25
On Wed, Aug 24, 2005 at 04:51:51PM -0400, Louis Proyect wrote:

Sorry if this has been dealt with here in the past, but I
couldn't find it in the archives. I want to use scoring to
filter out spam on the basis of multiple "to addresses" that
include me and anybody else on my ISP. In other words, if mail is
addressed to "lnp3(_at_)panix(_dot_)com" and "xyz(_at_)panix(_dot_)com",
it should go into /dev/null.

This is the recipe that I am using:


:0 H
* -1^0
* 1^1 panix
/dev/null

However, the log states:

With one panix address:
procmail: Assigning "TRASH=/dev/null"
procmail: Assigning "INCLUDERC=/net/u/15/l/lnp3/.procmail/rc.spam"
procmail: Score:      -1      -1 ""
procmail: Score:      22      21 "panix"
procmail: Assigning "LASTFOLDER=/dev/null"
procmail: Opening "/dev/null"

I don't understand first of all how it arrives at a score of
22. Secondly, I don't understand why the scoring for one panix
address and two are identical.

Simple, on the score being 22.  Coincidence, on the two being
identical.

You are counting all instances of the word "panix" in the
entire header.  Let's see how many times it shows up in this
email of yours I'm responding to:

 1:41am [~/Mail] 677[0]> headers lnp3 | fmt -1 | grep panix | nl
     1  dman+munged(_at_)panix(_dot_)com
     2  dman+munged(_at_)panix(_dot_)com
     3  mail2.panix.com
     4  (mail2.panix.com
     5          mailproc1.panix.com
     6          <dman+munged(_at_)panix(_dot_)com>;
     7          mail2.panix.com
     8          <dman+munged(_at_)panix(_dot_)com>;

Eight times.


You want to count To-addresses.  So look only there, not
everywhere in the header.
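
To make the difference concrete, here is a rough shell sketch, not
part of any recipe.  The header text and the addresses in it are
made up for illustration, and it assumes a grep that supports -o
(GNU and BSD grep both do).  It also only looks at the first line of
To:, so a folded To: header would be undercounted.

```shell
# Hypothetical sample header; none of these addresses come from the thread.
hdr='Return-Path: <sender@example.org>
Received: from mail2.panix.com (mail2.panix.com [192.0.2.1])
    by mailproc1.panix.com with ESMTP
To: you@panix.com, xyz@panix.com
Subject: test'

# Every occurrence of "panix" anywhere in the header,
# which is what a bare "panix" condition ends up counting.
all=$(printf '%s\n' "$hdr" | grep -o 'panix' | wc -l | tr -d ' ')

# Only the "@panix.com" occurrences on the To: line.
to=$(printf '%s\n' "$hdr" | grep -i '^To:' | grep -o '@panix[.]com' | wc -l | tr -d ' ')

echo "whole header: $all"
echo "To: only:     $to"
```

With this sample the whole-header count is 5 and the To:-only count
is 2, because the Received: lines contribute three hits of their own.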

Also, there is a bug in procmail when you use the H flag
like that.  Searching the header is the default anyway, so
the flag is unnecessary.

   :0:
   * -1^0 ^To:\/.*
   *  1^1  MATCH ?? @panix[.]com
   *  1^0  ()\/^
   * -1^0 ^Cc:\/.*
   *  1^1  MATCH ?? @panix[.]com
   MYSPAM


The middle condition clears MATCH before it is reused
for Cc:.  The reason is, if there isn't a Cc: header,
MATCH would still hold the value saved from the To:
header (if there was one), and that value would be
counted a second time.  This gets rid of that.

The recipe is still vulnerable to an instance of you
being mailed like so:

   you(_at_)panix(_dot_)com <you(_at_)panix(_dot_)com>

or of you ending up on both the To: and Cc: lines,
e.g., by error when someone emails you and lots of
other people.

You can cut down on the first false positive by counting
comma delimiters instead of @panix.com.  Actually, that
would be just fine.  I kind of like this recipe.

   :0:
   * -1^0 ^To:\/.*
   *  1^1  MATCH ?? ,
   *  1^0  ()\/^
   * -1^0 ^Cc:\/.*
   *  1^1  MATCH ?? ,
   MYSPAM
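
As a quick sanity check on why commas help, here is a small shell
sketch (again, not a recipe).  The To: values are placeholders, and
commas() is just a helper name I made up.  It assumes a plain address
list: a display name that itself contains a comma would add to the
count.

```shell
# Hypothetical To: values; the addresses are placeholders.
one='To: you@panix.com'
two='To: you@panix.com, xyz@panix.com'
dup='To: you@panix.com <you@panix.com>'

# Count the commas in a single header line:
# delete everything except commas, then count the bytes left.
commas() {
    printf '%s\n' "$1" | tr -cd ',' | wc -c | tr -d ' '
}

echo "one recipient:  $(commas "$one")"
echo "two recipients: $(commas "$two")"
echo "name and addr:  $(commas "$dup")"
```

The third case has @panix.com in it twice but no comma at all, which
is the false positive the comma version avoids.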


Dallman

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
