
Re: Using Procmail for RBL Blacklists

2003-04-07 08:43:27
On  6 Apr, Kim Scarborough wrote:
| Thanks for the help! I'm getting closer to getting this to work. I
| substituted this recipe for what I had, so now I'm using this to get the
| originating IP:
| 
| :0
| * 1^1 ^\/Received:.*
| * ! MATCH  ?? from astro\.snellfamily\.com.*by jinx\.unknown\.nu
| {
|       CHECK=${MATCH}
|       :0
|       *$  CHECK ?? Received:.*\[\/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+
|       { CHECKIP=${MATCH} }
| }
| 
| This solves my original problem of not grabbing astro's IP, but now it's
| grabbing the first IP address in the received headers, which is usually
| forged. For example, for this message:
| 
| > Return-Path: <ryr16351k8s(_at_)yahoo(_dot_)com>
| > X-Original-To: sluggo(_at_)unknown(_dot_)nu
| > Delivered-To: sluggo(_at_)unknown(_dot_)nu
| > Received: from astro.snellfamily.com (astro.snellfamily.com [192.148.252.20])
| >         by jinx.unknown.nu (Postfix) with ESMTP id 4DE963D
| >         for <sluggo(_at_)unknown(_dot_)nu>; Sun,  6 Apr 2003 20:23:16 -0400 (EDT)
| > Received: from RJ206019.user.veloxzone.com.br (RJ206019.user.veloxzone.com.br [200.165.206.19])
| >         by astro.snellfamily.com (Postfix) with SMTP id 987DD30040
| >         for <sluggo(_at_)unknown(_dot_)nu>; Sun,  6 Apr 2003 20:23:05 -0400 (EDT)
| > Received: from go.com (9410 [190.229.217.178])
| >         by  voila.fr (8.12.1/8.12.1) with ESMTP id 7361
| >         for <sluggo(_at_)unknown(_dot_)nu>; Sun, 6 Apr 2003 17:16:38 -0700
| > Received: from go.com ([133.140.97.171])
| >         by sympatico.ca (8.9.3/8.9.3) with SMTP id 14765
| >         for <sluggo(_at_)unknown(_dot_)nu>; Sun, 6 Apr 2003 17:16:33 -0700
| > Message-ID: <23511595voxjjrCxqnqrzq1qx(_at_)blackman15(_dot_)freeserve(_dot_)co(_dot_)uk>
| > From: "dorrie" <ryr16351k8s(_at_)yahoo(_dot_)com>
| > To: "sluggo(_at_)unknown(_dot_)nu" <sluggo(_at_)unknown(_dot_)nu>
| > Date: Sun, 6 Apr 2003 17:16:28 -0700
| > Subject: Hot Girls Gone Bad (AVI-16)   voxjjrCxqnqrzq1qx
| 
| The recipe above is grabbing 133.140.97.171, but I want it to get
| 200.165.206.19. Basically, I want to say "grab the received IP from the
| line that says 'received by jinx.unknown.nu', unless that's the IP of
| astro.snellfamily.com, in which case grab the IP of the machine that astro
| received it from".
| 
| Sorry to be such a pest with this. I'm really confused by scoring recipes;
| I keep reading the docs hoping for a light to come on, but it hasn't yet.

It's not being a pest when you're working on it too.  Nor is it being a
pest when you're working with incorrect information.  My apologies, but
I never tested the suggested recipe with more than 2 Received: headers,
so I didn't notice the obvious: the scored condition keeps going through
all of the headers, which leaves MATCH holding the last one.

The fix, off the top of my head, should be a more rigorous regular
expression on the first condition to make sure it matches one of your
two mail exchangers.

:0
* 1^1 ^\/Received:.*(by|from) (astro\.snellfamily\.com|\
        jinx\.unknown\.nu)
* ! MATCH  ?? from astro\.snellfamily\.com.*by jinx\.unknown\.nu
{
  CHECK=${MATCH}
  :0
  * CHECK ?? Received:.*\[\/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+
  { CHECKIP=${MATCH} }
}

I just copied/pasted yours, but I agree with Robert Arnold and Dallman
that you would probably want to use a regular expression that more
specifically matches IP addresses.  What you have could match a hostname,
an MTA or configuration file version number, etc. I have seen this when
processing Received: headers with a less rigorous regular expression for
IP addresses.  I use something close to Dallman's suggestion:
 
OCTET='(0|[1-9][0-9]?|1[0-9][0-9]|2([0-4][0-9]|5[0-5]))'

The only difference is that his could match something like "01" or "012"
which I *think* should always be just "1" and "12" respectively. Not a
criticism, just an observation.  Sometimes the trade-off for simplicity
and/or efficiency over anal accuracy is the right one to make.  There
are also matters of degree which only you can decide.
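
As a rough, untested sketch of how OCTET could be plugged into the
inner recipe: the $ flag on the condition (your original recipe already
had it) asks procmail to expand ${OCTET} before it compiles the regular
expression.  It assumes CHECK already holds the chosen Received: header,
as in the recipe above.

---(cut here)---
# One octet, 0-255, no leading zeros.
OCTET='(0|[1-9][0-9]?|1[0-9][0-9]|2([0-4][0-9]|5[0-5]))'

# Same extraction as before, with the stricter octet pattern.
# The $ flag is needed so that ${OCTET} is substituted.
:0
*$ CHECK ?? Received:.*\[\/${OCTET}\.${OCTET}\.${OCTET}\.${OCTET}
{ CHECKIP=${MATCH} }
---(cut here)---

Run it through the test setup with a few real headers before trusting
it; I have not checked how it behaves on every odd Received: format.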

When you're more comfortable with the whole thing, it would probably be
a minor efficiency improvement to change the weight of the scored
recipe to 1073741824 (or higher) (i.e. use 1073741824^1).  This will
stop procmail from continuing to look at Received: headers past the
second one.  You will never need/want to go beyond that; and for this
application it should be safe because I don't think there's any way for
a spammer to get a forged Received: header in between those from your
mail exchangers.
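
For illustration, the recipe with that weight would look something like
this.  It is an untested sketch; my understanding from procmailsc(5) is
that the score is capped at 2147483647, so two hits at 1073741824 each
reach the cap and procmail stops scanning.  Check the man page for your
version before relying on it.

---(cut here)---
# Only the weight on the first condition changes; the rest is as above.
:0
* 1073741824^1 ^\/Received:.*(by|from) (astro\.snellfamily\.com|\
        jinx\.unknown\.nu)
* ! MATCH  ?? from astro\.snellfamily\.com.*by jinx\.unknown\.nu
{
  CHECK=${MATCH}
  :0
  * CHECK ?? Received:.*\[\/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+
  { CHECKIP=${MATCH} }
}
---(cut here)---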

I haven't looked closely at Robert Arnold's solution. The logic looks
correct. You can test them both and decide which you like.  To that end,
I'd like to clear up one more thing omitted from my first post.  I tried
following up, but messed up my From: editing and don't know if it got
through.

Put this in testrc:

---(cut here)---
DEFAULT=/dev/null
NL="
"

:0
* 1^1 ^\/Received:.*(by|from) (astro\.snellfamily\.com|\
        jinx\.unknown\.nu)
* ! MATCH  ?? from astro\.snellfamily\.com.*by jinx\.unknown\.nu
{
  CHECK=${MATCH}
  :0
  * CHECK ?? Received:.*\[\/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+
  { CHECKIP=${MATCH} }
}

LOG = "CHECKIP=$CHECKIP$NL"

HOST
---(cut here)---

You can also put Robert's version in there, but I'd reset CHECKIP
between the two recipes (either CHECKIP="" or simply CHECKIP) to make
sure each one is setting that variable.
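
Roughly like this, with the two recipes filled in where the comments
are.  The "first:" and "second:" labels are just mine, so you can tell
the two log lines apart, and NL is the variable defined at the top of
testrc.

---(cut here)---
# Start from a clean slate, run the first recipe, log the result.
CHECKIP
# (the recipe from this message goes here)
LOG="first:  CHECKIP=$CHECKIP$NL"

# Unset again so a stale value cannot leak into the second test.
CHECKIP
# (Robert's recipe goes here; it is not reproduced in this message)
LOG="second: CHECKIP=$CHECKIP$NL"
---(cut here)---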

You can test by doing: procmail /path/to/testrc </path/to/testmsg

The difference between this and the original post is the setting of
DEFAULT=/dev/null and the unsetting of HOST. You don't need both. 
Either one will do.  The way I described it originally, the test message
would be delivered to DEFAULT.  It shouldn't hurt anything, other than
being an annoyance.  This will bit-bucket the test message.
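
If CHECKIP comes out empty and you can't see why, one option is to turn
on procmail's verbose logging while testing.  VERBOSE is a standard
procmail variable; putting it at the top of testrc is only my
suggestion.

---(cut here)---
# Top of testrc: log each condition as procmail evaluates it.
# With no LOGFILE set, the log goes to stderr.
VERBOSE=on
DEFAULT=/dev/null
---(cut here)---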

-- 
Email address in From: header is valid  * but only for a couple of days *
This is my reluctant response to spammers' unrelenting address harvesting


