procmail

Re: Juno/hotmail/prodigy filtering

1998-01-02 14:57:59
On Fri, 02 Jan 1998 00:55:04 -0500, Walter Dnes
<waltdnes(_at_)interlog(_dot_)com> wrote:
Professional Software Engineering wrote:
if the Message-ID doesn't contain the proper domain,
then the message is dumped as a spam:
  1) match the first whole-word (including punctuation) after
the "@" on the "From: " header.  E.g. from
"From: someone@some.isp.com (someone)" extract
"some.isp.com"
  2) look for MATCH ("some.isp.com") in *BOTH* the message ID
and the "Received: " headers.  If not found in both, it's
probably junk.

In addition to the problems already pointed out, this will often give
you host.domain.com where you really want just domain.com, along with
various pesky variations. It is not hard to trim it down to domain.com
and then try the match on that ... in fact I thought I had already
posted this a couple of months ago, but I couldn't find it in the
archives:

SUSPECT="(aol|hotmail)\.com"

    # From: domain not in Received: lines? Bad if already suspect
    :0
    * ^From:.*@\/[^<>@, 	]+
    * $ MATCH ?? $SPAMMERS|$SUSPECT
    {
        DOMAIN="$MATCH"
        :0  # strip down to two-level if necessary (moo.xx.net -> xx.net)
        * DOMAIN ?? (\.)\/[^.]+\.[^.]+^^
        { DOMAIN="$MATCH" }

        :0
        * $ ! ^Received: from [^ 	]*$\DOMAIN
        { ... reject it ... }
    }

SPAMMERS contains another list of fun domain names I don't accept any
mail from, including semi-legit domains such as Compu$erve and Netcom.

Obviously, this does something slightly different from what you
describe but it is easily adapted.
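
For anyone who does not read procmail fluently, the logic of the recipe
above can be sketched roughly in Python. This is only an illustration
under assumptions: the function names are made up for the example, the
SPAMMERS value is a placeholder (the real list is not shown here), and
the headers are assumed to arrive as plain strings.

```python
import re

# SUSPECT is copied from the recipe; SPAMMERS is a hypothetical placeholder.
SUSPECT = r"(aol|hotmail)\.com"
SPAMMERS = r"spamdomain\.example"


def from_domain(from_header):
    """Return the text after the first '@', up to a delimiter or whitespace."""
    m = re.search(r"@([^<>@,\s]+)", from_header)
    return m.group(1) if m else None


def two_level(host):
    """Trim moo.xx.net down to xx.net; shorter names are left alone."""
    return ".".join(host.split(".")[-2:])


def looks_forged(from_header, received_headers):
    """True if a suspect From: domain appears in no 'Received: from' line."""
    host = from_domain(from_header)
    if host is None or not re.search(f"{SPAMMERS}|{SUSPECT}", host, re.I):
        return False  # the recipe only fires for already-suspect domains
    domain = two_level(host.lower())
    pattern = re.compile(r"^Received: from \S*" + re.escape(domain), re.I)
    return not any(pattern.search(line) for line in received_headers)
```

Called with a From: line and the list of Received: lines, looks_forged()
corresponds to the point where the recipe would reject the message.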

As discussed on Spam-L recently, some regional top-level domains such
as au, jp and uk have an additional component which needs to be kept
(you should trim down only to demon.co.uk, not to the midlevel co.uk)
but that's largely beside the point here. (If anybody happens to have
a list of such top-level domains, I'd love to see it, though!)
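
The three-level exception could be handled along these lines, sketched
in Python. The SECOND_LEVEL table is an assumption for illustration
only: a small, incomplete sample of second-level labels under au, jp
and uk, not the full list asked for above. The function name is
likewise made up.

```python
# Illustrative and incomplete: a few second-level labels per country code.
SECOND_LEVEL = {
    "au": {"com", "net", "org", "edu", "gov"},
    "jp": {"co", "ne", "or", "ac", "go"},
    "uk": {"co", "org", "ac", "gov", "ltd", "plc"},
}


def registered_domain(host):
    """Trim a host name to two labels, or three under a listed country code."""
    labels = host.lower().rstrip(".").split(".")
    keep = 2
    # Keep one extra label when the middle one is e.g. the "co" in co.uk.
    if len(labels) >= 3 and labels[-2] in SECOND_LEVEL.get(labels[-1], ()):
        keep = 3
    return ".".join(labels[-keep:])
```

With this table, a host under demon.co.uk is trimmed to demon.co.uk,
while moo.xx.net still comes down to xx.net.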

/* era */

(The message I think I posted before might still be in the archives;
the search engine at Rosat is absolutely hopeless to use ... or am I
just stupid? The documentation refers to how to operate it from the
command line but the web interface works differently and defies all my
attempts to enter a multi-word phrase or anything with funny
characters in it.)

-- 
 Paparazzi of the Net: No matter what you do to protect your privacy,
  they'll hunt you down and spam you. <http://www.iki.fi/~era/spam/>
