Invalid IP numbers (was Re: SPAM>Re: Marketing tool)

On Sat, 7 Jun 1997 18:15:53 -0700 (PDT),
Dave/WebMaster <ddave(_at_)ddave(_dot_)com> wrote:

# Illegal IP used in a "Received:" field
:0:
* ^Received:.*\<(\[|[0-9]?[0-9]?[0-9]\.)\
([0-9]?[0-9]?[0-9]\.([3-9][0-9][0-9]|2[6-9][0-9]|25[6-9])|\
([3-9][0-9][0-9]|2[6-9][0-9]|25[6-9])\.[0-9]?[0-9]?[0-9])\
(\]|\.[0-9]?[0-9]?[0-9])\>
$PINEDIR/spamfolder

Someone sent me the above recipe for trapping illegal IPs. I used it for
a while but it seemed to catch a lot of legitimate stuff too. What's
wrong with this filter?


Let's take it apart. First, let's call the paren which starts with
[3-9] "invalid" -- this is fairly similar to the one I had; it
matches numbers in the range 256-999. And let's call the string
[0-9]?[0-9]?[0-9] simply "any"; it matches any number of one to three
digits. We then get

  * ^Received:.*\<(\[|any\.)(any\.invalid|invalid\.any)(\]|\.any)\>

So the middle paren is the one which actually matches an invalid
number, and the surrounding parens only make sure it sits in a
plausible context. This means you can catch [invalid.any], or
[any.invalid], or any.any.invalid.any, etc., but the invalid number is
always enclosed between another octet and an open or close bracket, or
between two other octets. Fine so far (except that it lets through IP
numbers with leading zeros, which are also often present in forged
headers but not elsewhere).

The problem, as far as I can tell, is that the search is still not
restricted to IP numbers -- many Received: lines contain other dotted
numbers, such as the processing mailer's version information.
Most of the matches I found in a quick test were Microsoft Mail Server
lines, which contain a version number something like 4.0.994.63 --
Corey Snow, for example (oh, an acquaintance of Dave's and mine :-)
seems to be running this software. (I knew he was in bad shape -- he's
running NT and even seems to +like+ it.)
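
To make the hypothesis easy to check, here is a rough Python
transcription of the recipe's regexp, built from the "any" and
"invalid" pieces named above. It is only a sketch: procmail's \< and
\> are approximated as "a non-word character or the line edge", the
helper name is_flagged is mine, and all four sample Received: lines
are made up for illustration (only the version number 4.0.994.63
comes from the discussion above).

```python
import re

# Building blocks, named as in the text above.
ANY = r"[0-9]?[0-9]?[0-9]"                            # one to three digits
INVALID = r"(?:[3-9][0-9][0-9]|2[6-9][0-9]|25[6-9])"  # 256-999

# Assumption: procmail's \< and \> are approximated by "a non-word
# character or the edge of the line"; Python's \b behaves differently.
BEFORE = r"(?:^|[^0-9A-Za-z_])"
AFTER = r"(?:$|[^0-9A-Za-z_])"

RECIPE = re.compile(
    r"^Received:.*" + BEFORE
    + r"(?:\[|" + ANY + r"\.)"
    + r"(?:" + ANY + r"\." + INVALID + r"|" + INVALID + r"\." + ANY + r")"
    + r"(?:\]|\." + ANY + r")"
    + AFTER,
    re.IGNORECASE,  # procmail conditions ignore case unless the D flag is set
)


def is_flagged(line):
    """Return True if the transcribed recipe would match this header line."""
    return RECIPE.search(line) is not None


# Made-up sample lines, for illustration only.
forged = "Received: from mail.example.com [206.299.1.5] by mx.example.org"
genuine = "Received: from mail.example.com [192.0.2.10] by mx.example.org"
leading_zero = "Received: from mail.example.com [192.000.002.010] by mx.example.org"
version = "Received: by mx.example.org with Microsoft Mail Server 4.0.994.63"

for label, line in [("forged", forged), ("genuine", genuine),
                    ("leading zeros", leading_zero), ("version", version)]:
    print(label, is_flagged(line))
```

Under these assumptions the forged address and the version string are
both flagged, while the genuine address and the one with leading zeros
pass. The version string matches as any.any.invalid.any, exactly like
a bad IP number would.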

If you have a few false matches saved somewhere, you can test this
hypothesis and perhaps tighten up the regexp accordingly. Generally,
the forgeries seem to be fairly similar to standard sendmail Received:
lines (or hopelessly botched ... oh well, maybe we should outlaw all
Microsoft software and be done with it). 
  As primitive first aid, you could add a condition before this one to
exclude anything which is Received: by.*Microsoft Mail Server, but
this is going to be problematic because there are usually several
Received: lines and if any one of them is a Microsoft one, the message
is skipped altogether. (Another case for MATCH, actually.)
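
The drawback of the blanket exclusion can be sketched in Python as
well. The function names, the per-line variant, and the sample lines
are my own assumptions for illustration, not something procmail gives
you directly; the regexp is the same rough transcription of the recipe
(with \< and \> approximated as a non-word character or line edge).

```python
import re

ANY = r"[0-9]?[0-9]?[0-9]"                            # one to three digits
INVALID = r"(?:[3-9][0-9][0-9]|2[6-9][0-9]|25[6-9])"  # 256-999

# Rough transcription of the recipe; \< and \> are approximated.
BAD_IP = re.compile(
    r"^Received:.*(?:^|[^0-9A-Za-z_])"
    r"(?:\[|" + ANY + r"\.)"
    r"(?:" + ANY + r"\." + INVALID + r"|" + INVALID + r"\." + ANY + r")"
    r"(?:\]|\." + ANY + r")"
    r"(?:$|[^0-9A-Za-z_])",
    re.IGNORECASE,
)
MS_LINE = re.compile(r"^Received: by.*Microsoft Mail Server", re.IGNORECASE)


def flagged_with_exclusion(received_lines):
    """Whole-header logic: one Microsoft line exempts the entire message."""
    if any(MS_LINE.search(line) for line in received_lines):
        return False
    return any(BAD_IP.search(line) for line in received_lines)


def flagged_per_line(received_lines):
    """Hypothetical alternative: exempt only the Microsoft lines themselves."""
    return any(BAD_IP.search(line) and not MS_LINE.search(line)
               for line in received_lines)


# Made-up message: one Microsoft hop plus one hop with a forged address.
mixed = [
    "Received: by mx.example.org with Microsoft Mail Server 4.0.994.63",
    "Received: from mail.example.com [206.299.1.5] by relay.example.net",
]
# Made-up message with only the Microsoft hop.
microsoft_only = mixed[:1]

print(flagged_with_exclusion(mixed), flagged_per_line(mixed))
print(flagged_with_exclusion(microsoft_only), flagged_per_line(microsoft_only))
```

With the blanket exclusion the mixed message slips through even though
one of its lines carries an invalid address; checking line by line
still catches it and leaves the Microsoft-only message alone.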

Hope this helps,

/* era */

-- 
Defin-i-t-e-ly. Sep-a-r-a-te. Gram-m-a-r.  <http://www.iki.fi/~era/>
 * Enjoy receiving spam? Register at <http://www.iki.fi/~era/spam.html>