procmail

Re: | vs. ? And another basic question

2003-01-08 17:09:56
On  8 Jan, Jefferis Peterson wrote:
| [...]
| 
| His explanation for the setting off the brackets is as follows:
| > The above recipe does not check all of those as many are
| > reported with square brackets instead of parens.  Parens work
| > well for testing the pair server applied header, but apparently
| > for checking previous recvd headers this recipe should be
| > expanded to use either parens or square brackets (be careful, as
| > the meta character rules are different within character classes).
| > 
| > => :0
| > => * ^Received:.*[[(]209\.163.\.100.[0-9][0-9]?[0-9]?[])]
| > => { some action here }
| > 
| > I have a concern that this range may be too broad and
| > would like to see other sites for spam from more than the 4
| > addresses mentioned above.
| > 
| > BTW: other spam recipes for testing IP ranges posted here
| > recently could probably be improved by including square brackets
| > as above.
| 
| So, I was trying to follow this logic, but I don't understand why the entire
| string is enclosed in brackets.  Is he looking for literal brackets?
| 
| For some reason, the spam from 209.163.100.11/ 12 etc. is still making it
| through my filters.

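There's a typo in the condition quoted above: there's an extra . (dot)
after "163".  As written, "163.\.100" demands one arbitrary character
between "163" and the literal dot, so 209.163.100.x can never match.
Also, the . (dot) after 100 should be backslash escaped.  The first is
the cause of the problem; the second is to prevent a possible future
problem.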
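
To see the effect of the stray dot outside of procmail, here is a
minimal sketch in Python.  It is only an approximation: Python's re
module is not procmail's regexp engine, though for the constructs used
here the two should behave the same.  The brackets inside the character
classes are backslash escaped only to keep Python from warning about
them, and the Received: lines are made up for illustration.

```python
import re

# Condition as quoted: stray "." after 163, unescaped "." after 100.
quoted = re.compile(
    r"^Received:.*[\[(]209\.163.\.100.[0-9][0-9]?[0-9]?[\])]")

# Corrected: stray dot removed, dot after 100 backslash escaped.
fixed = re.compile(
    r"^Received:.*[\[(]209\.163\.100\.[0-9][0-9]?[0-9]?[\])]")

# Hypothetical header lines, not taken from real mail.
bracketed = "Received: from mail.example.com (mail.example.com [209.163.100.11])"
parened = "Received: from mail.example.com (209.163.100.12) by mx.example.net"

print(bool(quoted.search(bracketed)))  # False: the typo blocks the match
print(bool(fixed.search(bracketed)))   # True
print(bool(fixed.search(parened)))     # True
```

In procmail syntax the corrected condition would then read:
* ^Received:.*[[(]209\.163\.100\.[0-9][0-9]?[0-9]?[])]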

Now, I have to eat some crow.  I read your first message too quickly and
assumed the opening "[" was trying to match literally.  I see now that
it is opening a character class of "[" and "(", and that the regexp has
a corresponding character class to close with "]" or ")".  When I said
they were "hosed", I was wrong.  They were more correct in that they
DID allow the IP number to be enclosed with either parentheses or
brackets.

FWIW, going through 10449 Received: headers, I found:

  7563 with  [IP#]  72.3%
   250 with  (IP#)   2.4%
    27 with   IP#    0.3%  # i.e. simply whitespace enclosed
  2609 without IP#  25.0%

That's not necessarily representative of anything other than my mail,
but there it is. Of the 27 with an IP# but without enclosing () or [],
16 appear to have a NAT'ed port number appended (or some identifier that
looks the same) and are all from the same domain and ok. Of the others,
none are spam or otherwise suspicious.
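
For anyone wanting to run a similar count on their own mail, a tally
like the one above could be produced along these lines.  This is a
sketch under assumptions, not the method used for the numbers above:
the names classify() and tally() are hypothetical, a header carrying
several IP numbers is counted once under the first style that matches,
and "bare" here means any dotted quad not inside [] or (), which is a
bit looser than "whitespace enclosed".

```python
import re
from collections import Counter

# Dotted quad; octet ranges are deliberately not validated.
IP = r"[0-9]{1,3}(?:\.[0-9]{1,3}){3}"

BRACKETED = re.compile(r"\[" + IP + r"\]")
PARENED = re.compile(r"\(" + IP + r"\)")
# Bare: not glued to further digits or dots on either side.
BARE = re.compile(r"(?<![0-9.])" + IP + r"(?![0-9.])")


def classify(header):
    """Return the enclosure style of the IP number in one Received: header."""
    if BRACKETED.search(header):
        return "[IP#]"
    if PARENED.search(header):
        return "(IP#)"
    if BARE.search(header):
        return "IP#"
    return "without IP#"


def tally(headers):
    """Count headers per enclosure style."""
    return Counter(classify(h) for h in headers)
```

Calling tally() on a list of Received: header strings returns a Counter
keyed by the four labels used in the table.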

I have a script that searches saved mail for a given IP number that I
use when considering whether to block a netblock with sendmail.  I know
now that it needs fixing.
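
A search that does not care how the IP number is enclosed might look
like the following.  This is a hypothetical sketch, not the script
mentioned above; ip_pattern() and received_hits() are made-up names,
and it assumes the saved mail is in mbox format.

```python
import mailbox
import re


def ip_pattern(ip):
    """Match ip in [], in (), or bare, but not inside a longer number."""
    # re.escape() makes the dots literal; the lookarounds keep
    # 192.0.2.1 from matching inside 192.0.2.10 or 1192.0.2.1.
    return re.compile(r"(?<![0-9.])" + re.escape(ip) + r"(?![0-9])")


def received_hits(mbox_path, ip):
    """Yield (subject, header) for each Received: header mentioning ip."""
    pat = ip_pattern(ip)
    for msg in mailbox.mbox(mbox_path, create=False):
        for header in msg.get_all("Received", []):
            text = str(header)
            if pat.search(text):
                yield msg.get("Subject", ""), text
```

Matching on the bare number with guards on both sides, rather than
requiring a particular delimiter, means all three of the forms counted
above are found by the same search.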

-- 
Email address in From: header is valid  * but only for a couple of days *
This is my reluctant response to spammers' unrelenting address harvesting



_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail