procmail

Re: Spam: Are You In Need Of A Lifestyle Change

1997-09-29 02:08:53
On Sun, 28 Sep 1997 22:44:05 -0500 (CDT),
Jeff Thieleke <thieleke(_at_)ix(_dot_)netcom(_dot_)com> wrote:
 >> :0
 >> * ^Subject:.*are\ you\ in\ need\ of\ a\ life
 >> /dev/null

It bears pointing out that you don't need to backslash-escape the
spaces. Might be a good idea, though, to try this (due to Rik Kabel): 

Careful on whom you are quoting...I did not write the 
 "* ^Subject:.*are\ you\ in\ need\ of\ a\ life"
material, although since you deleted the original poster's name, it looks
as though I did...



 >> Received: from ctcpXzPDJ  (dd30-242.dub.compuserve.com [199.174.147.242])

The simple fact that the stuff in the parens doesn't match what the
sender said is already a good clue. It happens a lot on legitimate
mail but it's a good thing to include in a scoring recipe. 

It is a so-so clue, at best - "It happens a lot on legitimate mail" says it all.
Knowing that it usually happens on compuserve.com, att.com, psi.net, uu.net, 
and a few others might make for a better recipe...
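As a rough sketch of that idea (in Python rather than procmail, so the logic can be run and checked): score a Received: line when the name the sender gave differs from the reverse-lookup name in the parens, but skip providers known to mismatch on legitimate mail. The function name, the weight, and the assumed Sendmail-style line layout are illustrative, and the exempt list is just the domains named above.

```python
import re

# Providers named above as routinely mismatching; extend to taste.
EXEMPT_DOMAINS = ("compuserve.com", "att.com", "psi.net", "uu.net")

# Assumed layout: "Received: from HELO-NAME (LOOKED-UP-NAME [ip.ad.dr.ess])"
_RECEIVED = re.compile(
    r"^Received:\s+from\s+(\S+)\s+\((\S+)\s+\[[0-9.]+\]\)", re.IGNORECASE
)


def helo_mismatch_score(received, weight=1):
    """Return `weight` if the claimed and looked-up names differ, else 0."""
    m = _RECEIVED.match(received)
    if not m:
        return 0  # not a line this sketch understands
    helo, looked_up = m.group(1).lower(), m.group(2).lower()
    if helo == looked_up:
        return 0
    # A mismatch from one of the exempt providers is too common to count.
    if any(looked_up == d or looked_up.endswith("." + d) for d in EXEMPT_DOMAINS):
        return 0
    return weight
```

With this reading, the compuserve.com line quoted above scores nothing, which is the point: the mismatch alone says little when it comes from one of those providers.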


 >> Message-ID: <BrS5>
 > Does anyone have a good Message-Id: recipe?  I came up with one that
 > validated Sendmail Message-Id's, but programs like Pine and qmail have
 > their own variations that break this.
 > * ^Message-Id: (<>|<none>|0000000000.\AAA000)
 > catches the obvious fakes, but not ids such as "BrS5"

Here's what I've been using. There is software out there that breaks
RFC822 in that it doesn't include an "@" in the Message-Id. I don't
care too much since I see them in my spam tank but if you send stuff
to /dev/null, you'll probably want to take out the @ part. 

:0
* ! ^Message-Id:[     ]*<[^   <>@]+@[^   <>@]+>[         ]*$
{ REJECT="$REJECT${REJECT:+$NL}${REJ}No valid Message-Id" }
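For anyone who wants to try the pattern outside procmail, here is a small Python sketch of the same test, with the "take out the @ part" variant as an option. The function name and the require_at flag are made up for illustration; the bracket expressions are meant to hold a space and a tab.

```python
import re

# Strict form: <something@something>, no spaces, tabs, or angle brackets inside.
_STRICT = re.compile(
    r"^Message-Id:[ \t]*<[^ \t<>@]+@[^ \t<>@]+>[ \t]*$", re.IGNORECASE
)
# Relaxed form for software that omits the "@": any non-empty <token>.
_RELAXED = re.compile(r"^Message-Id:[ \t]*<[^ \t<>]+>[ \t]*$", re.IGNORECASE)


def message_id_ok(header_line, require_at=True):
    """Return True if the Message-Id header line looks well formed."""
    pattern = _STRICT if require_at else _RELAXED
    return pattern.match(header_line) is not None
```

Note that the relaxed form lets an id like "<BrS5>" through, so it only guards against empty or malformed ids.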


 >> Received: From mailhost.UTP.net(alt1.utp..net(333.2.44.55)) by utp.net;Sat,
 >                                           ^^    ^^^        ^^
 > Oops!  Each octet of an IPv4 address is an 8-bit value (0-255)...333 is no
 > good.  There is a recipe for this type of fakery, but I don't have ready
 > access to it at the moment.   Can someone repost it?

I only have badly working ones on file. The primary problem with these
is that there will be other numbers in those headers which look a lot
like IP numbers unless you preparse them a little bit (for instance,
Microsoft Mail Server Received: lines contain a version number which
is something like 4.0.994.63) but you can get pretty far by looking
only at Received: lines which are more or less like what Sendmail
generates and see if there's a "reverse lookup" number which looks
faked. The general format of these is 


I thought about this one for a while, and after finding a couple of
bad ones (probably the same ones you have!), I came up with:

# Tag this spam if the Internet IPv4 address in Received: is either:
#    a: the first octet is 0, 0[0-9], 0[0-9][0-9], or >255       
#    b: any of the other octets are 0[0-9], 0[0-9][0-9], or >255       
:0
* ^Received: (.*(\[|\()(0[0-9]?[0-9]?|25(6|7|8|9)|2[6-9][0-9]|[3-9][0-9][0-9])|\
              .*\.(0[0-9][0-9]?|25(6|7|8|9)|2[6-9][0-9]|[3-9][0-9][0-9])) 


I think it is pretty safe for legit email (including your Microsoft example
above) and it should catch most obvious spam (including the above).  Since I
just wrote this, it might still need some fine tuning, though...
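The two rules in the recipe's comments can also be written out numerically, which makes them easier to check. This is a Python sketch, not a translation of the regexp: it only looks at dotted quads sitting directly inside brackets or parens, an assumption made here so that a bare version string such as 4.0.994.63 is never examined. The function names are invented for the example.

```python
import re

# A dotted quad immediately enclosed in [...] or (...).
_QUAD = re.compile(r"[\[(](\d+)\.(\d+)\.(\d+)\.(\d+)[\])]")


def _bad_octet(octet, is_first):
    """Apply rules a and b from the recipe comments to one octet string."""
    if len(octet) > 1 and octet.startswith("0"):
        return True  # leading zero: 0[0-9] or 0[0-9][0-9]
    value = int(octet)
    if value > 255:
        return True  # does not fit in 8 bits
    return is_first and value == 0  # a plain 0 is only bad in first position


def has_forged_ip(received):
    """Return True if any bracketed address in the line breaks the rules."""
    for match in _QUAD.finditer(received):
        octets = match.groups()
        if any(_bad_octet(o, i == 0) for i, o in enumerate(octets)):
            return True
    return False
```

Run against the utp.net line above it flags the 333; the compuserve.com line passes.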



 > Where "MyEmailAddress" is replaced by your email address(es).  By dumping 
 > everything that is not specifically addressed to you to a non-default
 > folder, you virtually eliminate all spam that escapes your other filters.
 > This is after you filter out mailing lists and such, of course.

This is dubious advice, but you probably know that already. Some
people receive legitimate BCC:s, others don't. 

Dubious?  If you get a lot of BCC:'ed mail then it obviously isn't a good idea,
but for the majority of people, I'm willing to bet they receive 99 spams for
every legit BCC:'ed email.  In any event, sending the BCC:'ed email to a
separate folder doesn't hurt, and it stops spam cold when all of your other
filters fail.
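To make the filing rule concrete, here is a minimal Python sketch of the decision: anything with none of your addresses in To: or Cc: goes to a separate folder instead of the inbox. The address, the folder names, and both function names are placeholders, not anything from the recipe itself.

```python
from email.utils import getaddresses

# Placeholder; substitute your own address(es), lower-cased.
MY_ADDRESSES = {"me@example.com"}


def addressed_to_me(to_values, cc_values=(), mine=MY_ADDRESSES):
    """Return True if any To:/Cc: recipient is one of my addresses."""
    recipients = getaddresses(list(to_values) + list(cc_values))
    return any(addr.lower() in mine for _name, addr in recipients)


def choose_folder(to_values, cc_values=()):
    """File mail not explicitly addressed to me away from the inbox."""
    if addressed_to_me(to_values, cc_values):
        return "inbox"
    return "possible-bcc"  # nothing is deleted; it is only set aside
```

As in the text, this belongs after mailing-list filters, since list mail is rarely addressed to you directly.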
 

 > address, this spam actually has fairly clean headers.  It should have still

Huh? It's +terribly+ forged. Most of the Received: headers will always

Well *of course* it is forged, but compared to the majority of spam that I see, 
the headers are rather innocent looking:

  * No Comments: Authenticated sender is...
  * No X-Advertisement: or friends
  * No X-PMFLAGS:
  * The X-UIDL: is "valid" (correct length, not all numbers)
  * The To: field wasn't something like "friend" or "you&I" or @public.com


That is what I classify as clean spam!  :)



Jeff Thieleke