
Re: Using Procmail for RBL Blacklists

2003-04-07 10:13:32
I suggested:
| [...]
| 
| :0
| * 1^1 ^\/Received:.*(by|from) (astro\.snellfamily\.com|\
|       jinx\.unknown\.nu)
| * ! MATCH  ?? from astro\.snellfamily\.com.*by jinx\.unknown\.nu
| {
|   CHECK=${MATCH}
|   :0
|   * CHECK ?? Received:.*\[\/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+
|   { CHECKIP=${MATCH} }
| }
| 
| I just copied/pasted yours, but agree with Robert Arnold and Dallman
| that you would probably want to use a regular expression that more
| specifically matches ip numbers.  What you have could match a hostname,
| an mta or configuration file version number, etc. I have seen this when
| processing Received: headers with a less rigorous regular expression for
| ip numbers.  I use something close to Dallman's suggestion:
|  
| OCTET='(0|[1-9][0-9]?|1[0-9][0-9]|2([0-4][0-9]|5[0-5]))'.
| 
| The only difference is his could match something like "01" or "012"
| which I *think* should always be just "1" and "12" respectively. Not a
| criticism, just an observation.  Sometimes the trade-off for simplicity
| and/or efficiency over anal accuracy is the right one to make.  There's
| also matters of degree which only you can decide.

A couple more points.

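1. There's no need to include "Received:.*" in the nested condition.
CHECK already holds only what the outer condition captured from a
Received: header, so the prefix adds nothing.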

2. I have some Received: headers that enclose the ip number in
parentheses instead of brackets (qmail, maybe?). I also have a shell
script that looks at Received: headers, which I use before plonking
netblocks in sendmail's access.db. It allows for an ip number delimited
by spaces. I wrote it a while ago, but I tested it extensively because
of the critical nature of the script; and though I can't find an example
in my inbox, I'm sure I saw some ip numbers delimited by spaces.
(That's not to say it's rfc compliant. I don't know either way, but
enforcing that isn't my purpose.)

Given all that, I'd rewrite the regular expression to allow the ip to be
enclosed with space, paren, or bracket.

OCTET='(0|[1-9][0-9]?|1[0-9][0-9]|2([0-4][0-9]|5[0-5]))'

* $ CHECK ?? [[( ]\/$OCTET\.$OCTET\.$OCTET\.$OCTET

In this case, we're not including the closing space, paren or bracket.
If that's added, you might also want to allow for a :portnum appended to
the ip.  (IIRC, it was tripwire.com that I saw doing that).
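
For anyone who wants to poke at the pattern outside procmail, here is a
rough Python sketch of the same idea: require a leading space, paren, or
bracket, then capture four strict octets. The function name extract_ip
and the sample headers are made up for illustration. One assumption to
flag: Python's re module takes the first alternative that works rather
than the longest one, so the sketch adds a (?![0-9]) guard after the last
octet. That guard is not part of the procmail condition.

```python
import re

# Same alternation as OCTET in the recipe, with non-capturing groups.
OCTET = r'(?:0|[1-9][0-9]?|1[0-9][0-9]|2(?:[0-4][0-9]|5[0-5]))'

# Assumption: the (?![0-9]) guard is a Python-only addition. Without it,
# re would stop at the first alternative that fits and could return "10"
# for a final octet of "100".
IP_RE = re.compile(r'[\[( ](' + r'\.'.join([OCTET] * 4) + r')(?![0-9])')


def extract_ip(received):
    """Return the first delimited ip number in a header line, or None."""
    m = IP_RE.search(received)
    return m.group(1) if m else None


# Made-up headers, for illustration only.
bracketed = ('Received: from mail.example.com (mail.example.com '
             '[192.168.1.100]) by jinx.unknown.nu (8.12.8/8.12.8)')
parenthesized = ('Received: from unknown (HELO foo) (10.0.0.5) '
                 'by astro.snellfamily.com')

print(extract_ip(bracketed))      # 192.168.1.100
print(extract_ip(parenthesized))  # 10.0.0.5
```

Note that the mta version string "(8.12.8/8.12.8)" is not picked up,
since it never supplies four dotted octets after the opening paren.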

As far as the difference between Dallman's OCTET regular expression and
mine, I can find numerous false matches of the form "01." or "012." or
"003." in Received: headers.  (ccMail is a frequent culprit.)  None of
those, however, matches when a leading [[( ] is enforced. Also, they
seem to always come after the ip number, so procmail would match
the ip number correctly and never get to the false match.  Bottom line
is his regular expression is probably functionally the same as mine,
especially if the enclosing space, paren, or bracket is enforced.  This
is all seat-of-the-pants observation, so should be taken with one big
grain of salt.
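
To make that comparison concrete, here is a small Python sketch. Big
caveats: LOOSE is a hypothetical stand-in for an octet pattern that
tolerates leading zeros, not Dallman's actual expression, and SAMPLE is
a made-up string rather than a real ccMail header. The helper name
first_ip and the (?![0-9]) guard are likewise assumptions added for the
Python version.

```python
import re

# Strict octet from this post: no leading zeros allowed.
STRICT = r'(?:0|[1-9][0-9]?|1[0-9][0-9]|2(?:[0-4][0-9]|5[0-5]))'

# Hypothetical looser octet that accepts "01", "012", "003".
# A stand-in for illustration, NOT Dallman's actual expression.
LOOSE = r'(?:[01]?[0-9][0-9]?|2[0-4][0-9]|25[0-5])'


def first_ip(octet, text, delimited):
    """Return the first dotted quad found in text, or None.

    When delimited is true, a leading '[', '(' or space is required.
    """
    prefix = r'[\[( ]' if delimited else ''
    pattern = prefix + '(' + r'\.'.join([octet] * 4) + r')(?![0-9])'
    m = re.search(pattern, text)
    return m.group(1) if m else None


# Made-up text with a zero-padded dotted number ahead of the real ip.
SAMPLE = 'version=003.01.012.004 from gw [10.1.2.3]'

print(first_ip(LOOSE, SAMPLE, delimited=False))   # 003.01.012.004
print(first_ip(STRICT, SAMPLE, delimited=False))  # 10.1.2.3
print(first_ip(LOOSE, SAMPLE, delimited=True))    # 10.1.2.3
print(first_ip(STRICT, SAMPLE, delimited=True))   # 10.1.2.3
```

In this contrived case the two octet patterns only disagree when no
leading delimiter is required, which is consistent with the observation
above but proves nothing about real-world headers.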

Much of this, of course, falls into the category of "matters of degree"
mentioned in the previous post. 

-- 
Email address in From: header is valid  * but only for a couple of days *
This is my reluctant response to spammers' unrelenting address harvesting



_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail