procmail

Re: Getting the MATCH

2008-07-22 10:49:36
At 09:54 2008-07-22 -0400, Skip wrote:

[snip - look Ma, I can trim to context, and I don't top post!]

My ipblacklist currently contains over 4000 ip addresses from which I have previously received spam emails. I decided to just go with the first three octets:
96.235.243.
96.239.43.
96.241.203.

If you're actually expecting to match against IP addresses ONLY, why not parse the Received: lines only?

Also, check the procmail list archives - there have been functional and _reliable_ ip parsing recipes posted here. You can even reverse the octets and look 'em up in a real DNSBL (which, JFTR, you could manage your blacklist as instead).
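
For what it's worth, the octet reversal mentioned above can be sketched roughly as follows. This is only an illustration, not one of the archived recipes: the function name reverse_octets is made up here, and dnsbl.example.org is a placeholder zone rather than a real DNSBL.

```shell
#!/bin/sh
# Sketch only: dnsbl.example.org is a placeholder zone, not a real DNSBL,
# and reverse_octets is a hypothetical helper name.

reverse_octets() {
    # Turn a.b.c.d into d.c.b.a
    printf '%s\n' "$1" | awk -F. '{ print $4 "." $3 "." $2 "." $1 }'
}

ip='96.239.43.17'
query="$(reverse_octets "$ip").dnsbl.example.org"
echo "$query"

# The actual lookup needs network access, so it is left commented out.
# By DNSBL convention a listed address resolves to an answer in 127.0.0.0/8:
# host "$query" >/dev/null 2>&1 && echo "listed"
```

Note that a DNSBL lookup of this kind wants the full four-octet address, so it would not work directly with a three-octet prefix list.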

The list isn't perfect. There are software version numbers that look a lot like IP addresses that can fool the system. I have run my blacklist against my clean inbox and have removed every entry that returns a hit. (Would you believe that my ipwhitelist is only just over 1000 entries--I get *that* much more spam than ham!) I also don't add any numbers to the blacklist if they return any hits in my inbox at all.

I have appreciated everyone's responses here, but unfortunately, I think I am confused. Would someone be so kind as to put it all back together for me in one working recipe? I guess my initial question of being able to return the actual matched ip address (kinda like using the -o option in grep)

The problem is that grep will generally show the line of the input text which matched - and that's the WHOLE BLOB OF TEXT you're piping to grep, not the individual line in the pattern file. You could pipe it through sed, awk, or a perl script to split it into individual tokens before grepping, and then at least the "line" that matches would be the individual token.
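
To illustrate the tokenizing idea in isolation, here is a small self-contained sketch. The sample header, the temporary blacklist file and the variable names are all assumptions made up for the demonstration; it uses tr for the splitting rather than a sed newline substitution.

```shell
#!/bin/sh
# Sketch only: the blacklist contents and the header line are sample data.
blacklist=$(mktemp)
printf '%s\n' '96.235.243.' '96.239.43.' '96.241.203.' > "$blacklist"

header='Received: from mail.example.net (unknown [96.239.43.17]) by mx.example.org; Tue, 22 Jul 2008 09:54:00 -0400'

# 1. Put each whitespace-separated token on its own line.
# 2. Strip brackets, parentheses, angle brackets and semicolons.
# 3. Print only the tokens that contain a blacklisted prefix.
match=$(printf '%s\n' "$header" |
    tr -s ' \t' '\n\n' |
    sed 's/[][()<>;]//g' |
    grep -F -f "$blacklist")

echo "matched token: $match"
rm -f "$blacklist"
```

Because each token is now its own line, grep prints just the offending address instead of the whole header. The sketch leaves out -w: with prefixes that end in a dot, the character following the match is a digit, which GNU grep treats as a word character, so the whole-word test would be expected to reject the match. The trade-off is that a plain fixed-string match would also hit a token such as 196.239.43.17, so anchoring the patterns is worth considering.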


FROM_MATCH=`formail invocation | sed -e 's/\([       ][      ]*\)/\
/g'  -e "s/[][()<>;]//g" | fgrep -i -w -f $ipblacklist`

# (that's a hardcoded newline in the middle there)

:0fw
* ! FROM_MATCH ?? ^^^^
| formail header modification


You will likely want to expand upon the second sed transform, but as-is, it does a fair job of cleaning the cruft up - the invocation is off the top of my head, not based on some recipe I'm using, and should delimit most of your stuff - you don't want dots, and probably not hyphens either. You might actually want to NOT exclude the square brackets, since those should often encompass IP addresses in Received: headers (depending upon your MTA).
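
As a rough sketch of the "Received: lines only, keep the brackets" idea: the following unfolds continued header lines, keeps only Received: fields, and extracts dotted quads that sit inside square brackets. The sample headers are invented, awk stands in for whatever formail invocation you would really use, and grep -o is assumed to be available (it is in GNU and BSD grep, but it is not POSIX).

```shell
#!/bin/sh
# Sketch only: the sample headers below are made up for illustration.
headers='Received: from mail.example.net (unknown [96.239.43.17])
        by mx.example.org with ESMTP; Tue, 22 Jul 2008 09:54:00 -0400
X-Mailer: ExampleMailer 96.235.243.1
Subject: test'

# Unfold continuation lines, keep only Received: fields, pull out the
# bracketed dotted quads, then drop the brackets themselves.
received_ips=$(printf '%s\n' "$headers" |
    awk '/^[ \t]/ { buf = buf $0; next }
         { if (buf != "") print buf; buf = $0 }
         END { if (buf != "") print buf }' |
    grep -i '^Received:' |
    grep -E -o '\[[0-9]{1,3}(\.[0-9]{1,3}){3}\]' |
    sed 's/[][]//g')

echo "$received_ips"
```

The version-number lookalike in the X-Mailer line never reaches the blacklist comparison, because only bracketed addresses from Received: fields survive the filter.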


---
 Sean B. Straw / Professional Software Engineering

 Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
