procmail

Re: | vs. ? And another basic question

2003-01-08 20:42:51
On Wed, 08 Jan 2003 17:46:45 -0500 (EST), Don Hammond wrote:
=> On  8 Jan, Peterson wrote:
=> | > => :0
=> | > => * ^Received:.*[[(]209\.163.\.100.[0-9][0-9]?[0-9]?[])]
=> | > => { some action here }
=> There's a typo in the condition quoted above. There's an extra . (dot)
=> after "163".  Also, the . (dot) after 100 should be backslash escaped. 

        Sorry about that. The typo quoted above was corrected, with
a detailed explanation, in a later posting (in another forum, as
mentioned earlier in the thread):
----- <snip>
It says for any message passed to it
- check the msg headers (default)
- look for a header that starts with "Received:"
- and then zero or more characters until
- either an open paren or open square bracket
- then 209.163.
... oh I see it now ...
it's looking for 209.163#.100.x where # is any single character
  (the non-escaped typo'd period)
  followed by a real period
----- <snip>
        That correction was apparently overlooked, and the
uncorrected typo prevents the above recipe from matching as
intended.
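
        For anyone who wants to check this outside procmail, here
is a minimal sketch using Python's re module as a stand-in for
procmail's matcher. That is an assumption: the two engines are not
identical (procmail ignores case by default, for one). The two
sample header lines are made up, and the opening class is written
[\[(] only because Python warns about a set that starts with an
unescaped "[".

```python
import re

# Condition as originally posted: stray "." after 163, unescaped "." after 100.
TYPO = re.compile(r"^Received:.*[\[(]209\.163.\.100.[0-9][0-9]?[0-9]?[\])]")
# Condition with both dots fixed.
FIXED = re.compile(r"^Received:.*[\[(]209\.163\.100\.[0-9][0-9]?[0-9]?[\])]")

# Hypothetical header lines, for illustration only.
real_ip = "Received: from mail.example.com (209.163.100.25)"
odd_text = "Received: from mail.example.com (209.163x.100.25)"

typo_hits_real = TYPO.search(real_ip) is not None
typo_hits_odd = TYPO.search(odd_text) is not None
fixed_hits_real = FIXED.search(real_ip) is not None
fixed_hits_odd = FIXED.search(odd_text) is not None

print(typo_hits_real, typo_hits_odd, fixed_hits_real, fixed_hits_odd)
```

The typo'd condition misses the real address and only fires when
some extra character sits between "163" and the following dot.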

=> I see now that
=> it is opening a character class of "[" and "(", and that the regexp has
=> a corresponding character class to close with "]" or ")". 

        I was the original author of the IP condition quoted in
the first post of this thread, which confused the poster:
^Received:.*[[(]209\.236\.([0-9]|[1-5][0-9]|6[0-3])\.[0-9][0-9]?[0-9]?[])]
He was confused about the basic syntax of the ? and |
operators.
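
        As a rough illustration of what the | and ? pieces of
that condition accept, here is a small sketch in Python's re
module (a stand-in for procmail's matcher, not procmail itself).
fullmatch is used because, in the full condition, each octet is
pinned between a literal dot and the next dot or closing
delimiter. The variable names are mine.

```python
import re

# Third octet from the condition above: "|" separates three alternatives.
THIRD_OCTET = re.compile(r"([0-9]|[1-5][0-9]|6[0-3])")
# Fourth octet: each "?" makes the digit before it optional.
FOURTH_OCTET = re.compile(r"[0-9][0-9]?[0-9]?")

# Which numbers does each piece accept when it must cover the whole octet?
third_ok = [n for n in range(256) if THIRD_OCTET.fullmatch(str(n))]
fourth_ok = [n for n in range(1000) if FOURTH_OCTET.fullmatch(str(n))]

print(third_ok[0], third_ok[-1], len(third_ok))
print(fourth_ok[0], fourth_ok[-1], len(fourth_ok))
```

So the alternation covers 0 through 63 in the third octet, and the
? form simply means "one to three digits", which also lets through
values above 255 that can never appear in a real address.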

        Unfortunately these character classes are sometimes very
difficult to read ... at least until one closely parses the line,
character by character, while understanding the rest of the basic
syntax.

        It doesn't help that these character classes include, in
the first position where a special rule applies (a "]" placed
first in a class is taken literally), the open and close square
brackets that otherwise delimit a character class.  Nonetheless,
they work great, just like the rest of procmail, at least when
used with precision.
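
        The first-position rule can be seen in isolation with a
short sketch, again using Python's re module as a stand-in for
procmail's matcher. The closing class is exactly as in the
recipe; the opening class has its "[" escaped only because Python
warns about a set that starts with an unescaped "[".

```python
import re

# "]" placed first inside a class is a literal, so this class is "]" or ")".
CLOSE = re.compile(r"[])]")
# Same idea for the opening class: "[" or "(".
OPEN = re.compile(r"[\[(]")

candidates = "[]()<>"
close_matches = [c for c in candidates if CLOSE.fullmatch(c)]
open_matches = [c for c in candidates if OPEN.fullmatch(c)]

print(close_matches, open_matches)
```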

=> FWIW, going through 10449 Received: headers, I found:
=>   7563 with  [IP#]  72.3%
=>    250 with  (IP#)   2.4%
=>     27 with   IP#    0.3%  # i.e. simply whitespace enclosed
=>   2609 without IP#  25.0%

        Thanks very much for the research.  This is almost
the total opposite of my experience and therefore helps me a lot.
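
        I don't know how the figures above were actually
produced, but a tally of that kind could be sketched roughly as
follows in Python. Everything here is an assumption for
illustration: the patterns, the order in which the styles are
tested, the classify() helper, and the four made-up sample
headers.

```python
import re
from collections import Counter

# Loose dotted-quad pattern; it does not check that each octet is <= 255.
IP = r"[0-9]{1,3}(?:\.[0-9]{1,3}){3}"
BRACKETED = re.compile(r"\[" + IP + r"\]")
PARENS = re.compile(r"\(" + IP + r"\)")
BARE = re.compile(r"(?:^|\s)" + IP + r"(?:\s|$)")


def classify(header: str) -> str:
    """Label a Received: header by how its IP number is delimited."""
    if BRACKETED.search(header):
        return "[IP#]"
    if PARENS.search(header):
        return "(IP#)"
    if BARE.search(header):
        return "IP#"
    return "without IP#"


# Hypothetical headers, one per style.
samples = [
    "Received: from a.example.com ([192.0.2.10]) by mx.example.com",
    "Received: from b.example.com (192.0.2.11) by mx.example.com",
    "Received: from c.example.com 192.0.2.12 by mx.example.com",
    "Received: by mx.example.com id ABC123",
]

counts = Counter(classify(h) for h in samples)
for style, n in counts.items():
    print(f"{n:6d} {style:12s} {100 * n / len(samples):5.1f}%")
```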

        The *shared* server (hosting) environment where this
discussion started is one in which the final local incoming
Received: header always uses *parens* to delimit IP#s.  It looks
like this one, from a message delivered by this mailing list
(folded across lines, machine name edited):
~~~~~
Received: from ms-1.rz.rwth-aachen.de (HELO
ms-dienst.rz.rwth-aachen.de) (134.130.3.130)
  by MACHINENAME.EXAMPLE.com with SMTP; 8 Jan 2003 02:02:54 -0000
~~~~~

=> I have a script that searches saved mail for a given IP number that I
=> use when considering whether to block a netblock with sendmail.  I know
=> now that it needs fixing.

        In looking at a bunch of old saved spam just the other
day, I realized that my original IP-blocking recipes, which used
only parens, were "missing" known spam IP#s being forged
through open relays, because those relays happened to use square
brackets as delimiters instead of the parens I expected. Now
that they accept either delimiter, my recipes work as intended.
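
        To make the difference concrete, here is a minimal sketch
comparing a parens-only condition with one that accepts either
delimiter. It uses Python's re module as a stand-in for
procmail's matcher, the 209.163.100 block from earlier in the
thread, and two made-up header lines.

```python
import re

IP = r"209\.163\.100\.[0-9][0-9]?[0-9]?"
# Old style: only parens around the IP number.
PARENS_ONLY = re.compile(r"^Received:.*\(" + IP + r"\)")
# Current style: paren or square bracket on each side.
EITHER = re.compile(r"^Received:.*[\[(]" + IP + r"[\])]")

# Hypothetical header lines, for illustration only.
paren_header = "Received: from relay.example.net (209.163.100.7)"
bracket_header = "Received: from relay.example.net [209.163.100.7]"
headers = (paren_header, bracket_header)

missed_by_parens_only = [h for h in headers if not PARENS_ONLY.search(h)]
missed_by_either = [h for h in headers if not EITHER.search(h)]

print(len(missed_by_parens_only), len(missed_by_either))
```

The parens-only condition lets the bracketed header slip through,
which is the miss described above.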

        Thanks again for the IP population figures. This list is
a fantastic resource - thanks to all the regular posters.

        Cheers,

        - Don

_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail