an anti-spam procmail recipe

After doing spam filtering using the usual criteria (X-Advertisement:,
Cyberpromo blocking, known spammer domains, Friend@public.com, etc.), some
spammers of slightly above-average (for a spammer) intelligence were still
getting through my filters. Eventually I noticed a trend in a lot of that
spam: the To: header exactly matches the From: header. Unfortunately, some
recent spam seems to be moving away from this trend, but I still trap a fair
amount with this technique, so I thought I'd share it with you. After I
implemented it a few weeks ago, the amount of spam getting past my filters
went to zero. It also hasn't misfiled any e-mail, so it seems pretty
successful to me. Here it is:


# First, a few definitions (I got most of these from postings to the procmail
# mailing list over the years):
PRE_ADDR_SPAN='(.*[^-(.%@a-zA-Z0-9])?'
IN_ADDR_SPAN='([^,.>    ]+\.)?'
FROMHDR="(^((((Resent|Apparently)-)?From|Sender|Reply-To|(X-)?Envelope-From):|>?From )$PRE_ADDR_SPAN)"

# Next, define a regexp that matches all of your valid e-mail addresses
# and another that matches your domain name(s)
MY_DOMAINS="(($IN_ADDR_SPAN)*your\.domain\.name|somewhere-else\.net)"
MY_NAMES="youruserid(@$MY_DOMAINS)?"
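# For example (hypothetical values, not anybody's real setup): a user "jdoe"
# who also gets mail as "john.doe" at example.com, at any host under
# example.com, or at example.net might use:
#   MY_DOMAINS="(($IN_ADDR_SPAN)*example\.com|example\.net)"
#   MY_NAMES="(jdoe|john\.doe)(@$MY_DOMAINS)?"
# Note the parentheses around the alternative user IDs; without them the
# optional @domain part would attach only to the last alternative.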

# Initialize variables...
TO_VALUE        # Ensure that TO_VALUE is unset.
FROM_VALUE      # Ensure that FROM_VALUE is unset.

# Mail whose To: and From: headers match, but which is neither addressed to me
# nor from me or anybody in my domain(s), is probably spam.
:0
* ^To:[         ]*\/[^  ].*
{
    TO_VALUE = $MATCH

    :0
    * ^From:[   ]*\/[^  ].*
    {
        FROM_VALUE = $MATCH

        :0:
        * TO_VALUE ?? .
        * FROM_VALUE ?? .
        * $ ! ^TO($MY_NAMES)
        * $ ! $FROMHDR($MY_NAMES|[^@]+@$MY_DOMAINS)
        * $ FROM_VALUE ?? ^^$\TO_VALUE^^
        mbox.spam
    }
}


As always, that's a space and a tab inside those "[     ]" and "[^      ]"
(and at the end of the bracket expression in IN_ADDR_SPAN).
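
If you'd rather watch what this catches before trusting it with real
deliveries, one option is to tag the matches instead of filing them. This is
only a sketch, not part of the recipe as I run it: it replaces the innermost
recipe (the one that delivers to mbox.spam), it assumes formail (which ships
with procmail) is in your PATH, and the X-ToFrom-Match: header name is simply
made up for the example.

        # Same conditions as before, but add a header instead of filing.
        # f = treat the pipe as a filter, h = feed it the header only,
        # w = wait for formail and check its exit code.
        :0 fhw
        * TO_VALUE ?? .
        * FROM_VALUE ?? .
        * $ ! ^TO($MY_NAMES)
        * $ ! $FROMHDR($MY_NAMES|[^@]+@$MY_DOMAINS)
        * $ FROM_VALUE ?? ^^$\TO_VALUE^^
        | formail -A "X-ToFrom-Match: yes"

Because this is a filter, the tagged message keeps travelling down your
rcfile, so a later recipe can match on ^X-ToFrom-Match: once you're satisfied
that nothing legitimate is being tagged.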

Later,
Ed