On 7 Jan, Jefferis Peterson wrote:
| Question 1:
|
| In this rc, how does ? differ in function from the | :
| * $ $OR
| ^Received:.*[[(]209\.236\.([0-9]|[1-5][0-9]|6[0-3])\.[0-9][0-9]?[0-9]?[])]
|
| I presume that the "|" is the standard 'or', so that the third set of
| numbers could be a single character, or 2, but how does that differ from the
| 4th set of numbers, which are separated by the ? instead of the |
(this|that), an alternation, matches "this" or "that".
[a-m0-5], a character class, matches any single character between "a" and
"m", or between 0 (zero) and 5. What falls between is determined by the
ASCII table (man ascii).
? modifies the preceding character, character class, or parenthesized
group to be optional (i.e. match 0 or 1 time).
* modifies the preceding character, character class, or parenthesized
group to match 0 or more (unlimited) times.
+ modifies the preceding character, character class, or parenthesized
group to match 1 or more times.
Examples (taking each pattern as matching the whole string):
(abc|def)*(1a|2b) matches abcabcdef2b, 1a, but NOT abd1a
(abc|def)?[xyz] matches abcx, y, defz, but NOT abcdefx
[abc]+[123] matches a2, aa1, abcb2, abc3, but NOT 1, 2, or 3
([abc]|[123]) matches a, b, c, 1, 2, or 3
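
If you want to check examples like these yourself, any regex engine with
the same basic syntax will do. Here is a minimal sketch in Python, not
procmail: I'm assuming Python's re module is close enough for these
operators, and I use re.fullmatch so that "matches" means the whole string
matches (procmail itself matches anywhere in the text unless you anchor).
The helper name check is made up for illustration.

```python
import re

# Patterns and strings are copied from the examples above.
# Each entry maps a pattern to (strings that should match, strings that should not).
cases = {
    r"(abc|def)*(1a|2b)": (["abcabcdef2b", "1a"], ["abd1a"]),
    r"(abc|def)?[xyz]": (["abcx", "y", "defz"], ["abcdefx"]),
    r"[abc]+[123]": (["a2", "aa1", "abcb2", "abc3"], ["1", "2", "3"]),
    r"([abc]|[123])": (["a", "b", "c", "1", "2", "3"], []),
}

def check(pattern, text):
    """Hypothetical helper: True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

results = {}
for pattern, (good, bad) in cases.items():
    # First flag: every "good" string matched; second: no "bad" string matched.
    results[pattern] = (
        all(check(pattern, s) for s in good),
        not any(check(pattern, s) for s in bad),
    )
    print(pattern, results[pattern])
```

Each pattern should report (True, True) if the examples above hold.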
In your IP regexp, [0-9][0-9]?[0-9]? matches any sequence of 1-3 digits.
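
To see the two styles side by side, here is a rough Python sketch (again
Python's re, not procmail, and the helper name accepted is my own
invention) that runs the third and fourth octet sub-patterns from your
recipe against every number from 0 to 999.

```python
import re

# Sub-patterns copied from the recipe in Question 1.
THIRD = r"([0-9]|[1-5][0-9]|6[0-3])"  # alternation: one digit, 10-59, or 60-63
FOURTH = r"[0-9][0-9]?[0-9]?"         # one digit followed by two optional digits

def accepted(pattern, limit=1000):
    """Hypothetical helper: numbers below limit whose decimal form matches fully."""
    return [n for n in range(limit) if re.fullmatch(pattern, str(n))]

third = accepted(THIRD)
fourth = accepted(FOURTH)
print(third[0], third[-1], len(third))
print(fourth[0], fourth[-1], len(fourth))
```

The alternation accepts exactly 0 through 63, while the ? version accepts
every 1-3 digit number, 256 through 999 included.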
| Question 2:
| Can you shorten the IP address to cover 0-255
| I was wondering if you need to include the last digits:
| Received:.*[[(]212\.154\.3[2-6]?[])]
| I've identified a spammer who owns 212.154.32 to .36 /0 -255 in those
| ranges.
|
| I was wondering if the following recipe covers all those ip's or do you need
| to add in factors for the last 3 digits?
| * ^Received:.*[[(]212\.154\.3[2-6]?[])]
You can do what you want, but not that way. First off, all these
regexps are hosed because the [ and ] characters are special - denoting
character classes. If you want to match a special character literally,
it needs to be backwacked (backslash escaped) (e.g. \[). Also, you don't
care what that last octet is, but it will be there so you have to allow
for it.
Try:
* ^Received:.*\[212\.154\.3[2-6]\...?.?]
The . (dot) matches any single character, so \...?.? will match any 1-3
digit octet after 212.154.3[2-6]. Note, it will also match ANY 1-3
character string, say "a", "=@", or "2b%", but that's probably ok in
this case. You do have to be careful matching Received: headers because
you get all kinds of different things there from different MTAs. But
this particular "looseness" doesn't seem likely to generate false
positives. If that's a concern, use:
"\[212\.154\.3[2-6]\.[12]?[0-9]?[0-9]]".
That's still not perfect (it accepts 256-299, for example), but better.
If you want to narrow it down to virtually no possibility of mismatch, use:
"\[212\.154\.3[2-6]\.(0|[1-9][0-9]?|1[0-9][0-9]|2([0-4][0-9]|5[0-5]))]".
Note, I make no representation that that's the most efficient regular
expression, but it will match any legal octet.
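
For anyone who wants to compare the three last-octet variants, here is a
small Python sketch. It assumes Python's re behaves like procmail's
matcher for these constructs, which I have not verified for every corner
case. The Received: line is made up, and the helper names hits and header
are hypothetical.

```python
import re

# Last-octet alternatives copied from the patterns above.
LOOSE = r"..?.?"              # any 1-3 characters, digits or not
BETTER = r"[12]?[0-9]?[0-9]"  # 1-3 digits; three digits only if the first is 1 or 2
STRICT = r"(0|[1-9][0-9]?|1[0-9][0-9]|2([0-4][0-9]|5[0-5]))"

# Common front part of the condition, without the leading "* ".
PREFIX = r"^Received:.*\[212\.154\.3[2-6]\."

def hits(octet_pattern, line):
    """Hypothetical helper: does the full condition match this header line?"""
    return re.search(PREFIX + octet_pattern + r"]", line) is not None

def header(last):
    # Made-up Received: line, for illustration only.
    return "Received: from unknown (HELO example.invalid) [212.154.34." + last + "]"

for last in ["7", "255", "299", "2b%"]:
    line = header(last)
    print(last, hits(LOOSE, line), hits(BETTER, line), hits(STRICT, line))

# Numbers 0-999 that the strict pattern accepts as a whole octet.
strict_ok = [n for n in range(1000) if re.fullmatch(STRICT, str(n))]
print(strict_ok[0], strict_ok[-1], len(strict_ok))
```

The loose version lets "2b%" through, the middle one rejects that but
still takes 299, and the strict one accepts only 0 through 255.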
--
Email address in From: header is valid * but only for a couple of days *
This is my reluctant response to spammers' unrelenting address harvesting