procmail

Possible procmail filter to trap spam relays?

1998-06-08 21:55:22
Open relays are being used by spammers all the time now.
The following code must *NOT* be used on mailing-list traffic, so
list mail has to be diverted/delivered first.  Here's my try at stopping
relay spam.  While I'm at it, I'd like to have an efficiency
question answered.  Which of the following methods of inserting
an "X-Reject:" header takes fewer system resources?  Is it...

:0:
* blah blah blah
| echo "X-Reject: blah blah blah">>junkmail ;cat - >>junkmail

...or is it...

:0:
* blah blah blah
| formail -A "X-Reject: blah blah blah" >>junkmail

The formail variant looks more elegant.  However, I'm always
thinking of resources.  The other question is whether every
installation that has procmail has formail.  I want this to
be a generic filter that as many people as possible can use.
As part of the generic setup, I define a variable MYISP,
which is a regular expression that matches *ALL VARIANTS* of
my ISP's domain name(s).  If you are "joeblow@someisp.org",
then you set MYISP="someisp\.org".  My ISP owns both the
"interlog.com" and "interlog.net" domains, so I have to set...

MYISP="interlog\.(com|net)"
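A quick way to sanity-check a MYISP value before trusting it in a recipe is to run it through grep -E against a few addresses.  This is only a sketch: procmail's regex engine is egrep-like but not identical, the helper name matches_myisp is hypothetical, and the sample addresses are made up.

```shell
#!/bin/sh
# Hypothetical sanity check of a MYISP pattern, run outside procmail.
MYISP='interlog\.(com|net)'

# matches_myisp ADDRESS
# Succeeds when ADDRESS contains a match for $MYISP.  The -i flag
# mirrors procmail's default case-insensitive matching.
matches_myisp() {
    printf '%s\n' "$1" | grep -Eiq "$MYISP"
}

# The addresses below are made up for the demonstration.
for addr in someone@interlog.com someone@mail.interlog.net someone@interlog.org
do
    if matches_myisp "$addr"; then
        echo "$addr: matches MYISP"
    else
        echo "$addr: does not match"
    fi
done
```

Note that the pattern is unanchored, just as it is inside the recipes, so it matches anywhere in the string.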

My filter depends heavily on "Message-Id", so first I have to
make sure that no-one is playing games with that header.  The
first check is that we don't have someone from outside my ISP
forging a "From:" or "Message-Id:" header that looks like it's
from our system.  The algorithm is...
  If the From: or Message-Id: claims to be from MYISP then
    -Add up the count of all "Received:" headers.
    -Subtract the count of all "Received:" headers that show the
       message being passed around internally by my ISP's system.
  End if

If the email is *REALLY* from someone else at my ISP, the two
counts should cancel out to zero.  If at least one of the headers
is not an internal handoff, the second count will be less than
the first count.  The subtraction will yield a positive result,
and the message will be delivered... to the junkmail file.

###########################################################
:0:
*$       ^(From|Message-Id):.*$MYISP
*   1^1  ^Received:.from
*$ -1^1  ^Received:.from.*$MYISP.*.by.*.$MYISP
| echo "X-Reject: Forged From:/Message-Id:">>junkmail ;cat - >>junkmail
###########################################################
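The counting in the recipe above can be tried by hand outside procmail.  The following is a rough shell sketch of the arithmetic only, not of procmail's scoring engine.  It assumes every "Received:" header sits on a single line, and the function name forged_score and the sample hostnames are invented for illustration.

```shell
#!/bin/sh
# Sketch of the counting only.  Assumes one "Received:" header per line.
MYISP='interlog\.(com|net)'

# forged_score: reads headers on stdin and prints
#   (number of "Received: from" lines)
# - (number of those that are MYISP-to-MYISP handoffs)
forged_score() {
    hdrs=$(cat)
    total=$(printf '%s\n' "$hdrs" | grep -Eic '^Received:.from' || true)
    internal=$(printf '%s\n' "$hdrs" |
        grep -Eic "^Received:.from.*$MYISP.*.by.*.$MYISP" || true)
    echo $((total - internal))
}

# Made-up headers: a genuine internal message...
genuine='Received: from mail1.interlog.com by mail2.interlog.com
Received: from dialup7.interlog.net by mail1.interlog.com'

# ...and one that entered from outside while claiming to be local.
forged='Received: from mail1.interlog.com by mail2.interlog.com
Received: from bulk.example.com by mail1.interlog.com'

printf '%s\n' "$genuine" | forged_score   # prints 0: left alone
printf '%s\n' "$forged"  | forged_score   # prints 1: junked
```

A result of zero corresponds to the recipe not firing; anything positive corresponds to delivery into junkmail.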

Check for more "stupid spammer stunts"...
  - Initialize the counter to 2
  - subtract 1 for a valid "Message-Id:" header
  - subtract 1 if we do *NOT* see two or more "Message-Id:" headers

If the message has *EXACTLY* 1 valid "Message-Id:", the accumulator
reads zero, and no action is taken.  If either condition is false,
the accumulator ends up positive, and the email is junked.

###########################################################
:0:
*   2^0
*  -1^0  ^Message-Id:.*[<]..*@..*\..*[>]
*  -1^0 !^Message-Id:(.*$)+Message-Id:
| echo "X-Reject: Did not have exactly 1 Message-Id:">>junkmail ;cat - >>junkmail
###########################################################
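To see how the accumulator in the recipe above behaves, here is a small shell sketch of the same arithmetic.  It is an approximation: it counts "Message-Id:" header lines rather than using procmail's multi-line match, it assumes unfolded headers, and the function name msgid_score and the sample ids are invented.

```shell
#!/bin/sh
# Sketch of the Message-Id score.  Assumes unfolded headers on stdin.

# msgid_score: prints 0 only when there is exactly one Message-Id
# header and it looks like <something@host.domain>.
msgid_score() {
    hdrs=$(cat)
    score=2
    # -1 if a Message-Id of the form <x@y.z> is present
    if printf '%s\n' "$hdrs" | grep -Eiq '^Message-Id:.*<.+@.+\..+>'; then
        score=$((score - 1))
    fi
    # -1 if we do NOT see two or more Message-Id headers
    count=$(printf '%s\n' "$hdrs" | grep -Eic '^Message-Id:' || true)
    if [ "$count" -lt 2 ]; then
        score=$((score - 1))
    fi
    echo "$score"
}

printf 'Message-Id: <123@mail.example.org>\n' | msgid_score   # prints 0
printf 'Message-Id: <123>\n' | msgid_score                    # prints 1
printf 'Subject: no id at all\n' | msgid_score                # prints 1
```

Only the first case scores zero; a malformed id, a missing id, or a duplicated id all leave the score positive.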

Now comes the beautiful/ugly part.  If a message is *NOT* a
relay, then the following formula should work...

  The total number of "Received:" headers should be the sum of
    a) # of "Received:" headers showing message being
       passed around in the sender's ISP.
    b) *EXACTLY* 1 "Received:" header where the sender is
       handing off the message to my ISP.
    c) # of "Received:" headers bouncing around between my
        ISP's machines.

  If there are any extra "Received:" headers, then we know that
it was either...
  1) a relay spam, or
  2) fake "Received:" headers were added
...and in either case, we don't want the message.
  So how do we figure out the sender's domain?  A good partial
domain can be obtained from the "Message-Id:".  Remember all
the trouble we went to to ensure that it was legit?  I've
deliberately left off the extension, to handle the special case
of domains with more than one extension.

###########################################################
:0
*  ^Message-Id:.*[<]..*@\/\.
{ }
SENDER=$MATCH

:0:
*   1^1  ^Received:.from
*$ -1^1  ^Received:.from.*$MYISP.*.by.*.$MYISP
*$ -1^0  ^Received:.from.*$SENDER.*.by.*.$MYISP
*$ -1^1  ^Received:.from.*$SENDER.*.by.*.$SENDER
| echo "X-Reject: Relay or forged headers">>junkmail ;cat - >>junkmail
###########################################################
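The relay formula above can be checked on paper or with a few lines of shell.  The sketch below reproduces only the arithmetic, assuming one "Received:" header per line.  The function names, the hostnames, and passing the sender's partial domain as an argument are all assumptions made for the demonstration; the recipe itself takes that value from $MATCH.

```shell
#!/bin/sh
# Sketch of the relay arithmetic.  Assumes one "Received:" per line.
MYISP='interlog\.(com|net)'

# count_lines PATTERN: count lines of $HDRS matching PATTERN.
count_lines() {
    printf '%s\n' "$HDRS" | grep -Eic "$1" || true
}

# relay_score SENDER: reads headers on stdin; SENDER is the partial
# sender domain (hypothetical argument, see the lead-in).
relay_score() {
    sender=$1
    HDRS=$(cat)
    total=$(count_lines '^Received:.from')
    mine=$(count_lines "^Received:.from.*$MYISP.*.by.*.$MYISP")
    theirs=$(count_lines "^Received:.from.*$sender.*.by.*.$sender")
    # The sender-to-MYISP handoff is subtracted at most once (-1^0).
    handoff=$(count_lines "^Received:.from.*$sender.*.by.*.$MYISP")
    if [ "$handoff" -gt 1 ]; then
        handoff=1
    fi
    echo $((total - mine - theirs - handoff))
}

# Made-up headers: direct delivery from the sender's ISP...
direct='Received: from mail1.interlog.com by mail2.interlog.com
Received: from out.example.org by mail1.interlog.com
Received: from pc5.example.org by out.example.org'

# ...and the same sender going through a third-party relay.
relayed='Received: from mail1.interlog.com by mail2.interlog.com
Received: from mx.relayhost.net by mail1.interlog.com
Received: from pc5.example.org by mx.relayhost.net'

printf '%s\n' "$direct"  | relay_score example   # prints 0: delivered
printf '%s\n' "$relayed" | relay_score example   # prints 2: junked
```

In the relayed case neither hop involving mx.relayhost.net is accounted for by any of the three subtractions, which is what leaves the score positive.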

Notes: ancient/obsolete/broken mailservers that don't do reverse
DNS lookups are automatically rejected by this filter.  So are ones
that put out non-standard output, like Microsoft...
1...2...3...aaaaaaaaaaaaawwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww

I ran this against 1.5 megs of spam, and it seems to work.  I'm
hopeful that I've found the key.  Comments, suggestions?

-- 
Walter Dnes (Toronto)
<waltdnes(_at_)interlog(_dot_)com>