procmail

Re: procmail-d Digest V97 #297

1997-09-29 04:39:46
This is my mail filter.

Your mail either came from a "spam haven" site (a site with a
serious and ongoing problem with spamming), came from a site
which offers free email accounts (which spammers unfortunately
love and use often -- a more detailed explanation can be found
below), or had headers which the filter's pattern matching tagged 
as probable spam, so my mail filter intercepted and deleted it.

If you are a bulk mailer, advertiser (commercial or political), or 
are sending any kind of "free offer", please remove this address
from your mailing list and go away.  I do not welcome unsolicited
advertising mail of any kind or for any purpose whatsoever.  
Please respect my privacy.

If you are not an advertiser, and are sending from a site which
is blocked in my filter, or if you have no idea what happened,
you can resend your mail and get past the filters by including the
password listed below on the Subject: line of your message.  But
do this only if you have legitimate, personal business with me.
Any spammer foolish enough to use this password to spam me in 
clear violation of my stated wishes will regret it.

Thank you!

 
In this case, the filters spotted some typical spam phrases
in the body of your message.  Please use the password below
to get past this filter.
 
 ********** The password is bypfilter. **********


=-=-=-=-=-=-=-=-=-=


From procmail-d-request(_at_)Informatik(_dot_)RWTH-Aachen(_dot_)DE  Mon Sep 29 07:29:07 1997
Received: from Campino.Informatik.RWTH-Aachen.DE
        (campino.Informatik.RWTH-Aachen.DE [137.226.116.240]) by aries.ai.net
        (8.8.3/8.6.12) with ESMTP id HAA17306 for <biow(_at_)ezmort(_dot_)com>;
        Mon, 29 Sep 1997 07:28:57 -0400 (EDT)
From: procmail-d-request(_at_)Informatik(_dot_)RWTH-Aachen(_dot_)DE
Received: (from lists(_at_)localhost) by Campino.Informatik.RWTH-Aachen.DE
        (8.8.7/RBI-Z13) id MAA03445; Mon, 29 Sep 1997 12:28:55 +0200 (MET DST)
Date: Mon, 29 Sep 1997 12:28:55 +0200 (MET DST)
Message-Id: <199709291028(_dot_)MAA03445(_at_)Campino(_dot_)Informatik(_dot_)RWTH-Aachen(_dot_)DE>
X-Authentication-Warning: campino.informatik.rwth-aachen.de: lists set sender
        to procmail-d-request(_at_)informatik(_dot_)rwth-aachen(_dot_)de using -f
Subject: procmail-d Digest V97 #297
X-Loop: procmail-d(_at_)informatik(_dot_)rwth-aachen(_dot_)de
X-Mailing-List: <procmail-d(_at_)informatik(_dot_)rwth-aachen(_dot_)de> archive/volume97/297
Precedence: list
MIME-Version: 1.0
Content-Type: multipart/digest; boundary="----------------------------"
To: procmail-d(_at_)Informatik(_dot_)RWTH-Aachen(_dot_)DE
Reply-To: procmail(_at_)Informatik(_dot_)RWTH-Aachen(_dot_)DE

------------------------------

Content-Type: text/plain

procmail-d Digest                               Volume 97 : Issue 297

Today's Topics:
  Spam: Are You In Need Of A Lifestyle  [ ftilley(_at_)goodnet(_dot_)com (Felix Tilley) ]
  Re: Spam: Are You In Need Of A Lifes  [ Timothy J Luoma <luomat+procmail(_at_)lu ]
  Re: Spam: Are You In Need Of A Lifes  [ Jeff Thieleke <thieleke(_at_)ix(_dot_)netcom(_dot_)c ]
  Re: Spam: Are You In Need Of A Lifes  [ era eriksson <era(_at_)iki(_dot_)fi> ]
  Re: Spam: Are You In Need Of A Lifes  [ "J. Daniel Smith" <J(_dot_)Daniel(_dot_)Smith(_at_)W ]
  Re: Spam: Are You In Need Of A Lifes  [ Jeff Thieleke <thieleke(_at_)ix(_dot_)netcom(_dot_)c ]
  IP number checking (was Re: Spam: Ar  [ era eriksson <era(_at_)iki(_dot_)fi> ]
  [delete junk mail] How to delete a m  [ Vu Quoc Hu`ng <hung(_at_)vsb(_dot_)cz> ]

------------------------------

Date: Sun, 28 Sep 1997 15:53:01 -0700
From: ftilley(_at_)goodnet(_dot_)com (Felix Tilley)
To: procmail(_at_)Informatik(_dot_)RWTH-Aachen(_dot_)DE, tom(_at_)hughes(_dot_)com
Subject: Spam: Are You In Need Of A Lifestyle Change
Message-Id: <199709282249(_dot_)PAA11186(_at_)mail(_dot_)goodnet(_dot_)com>
Content-Type: text/plain; charset=ISO-8859-1
Content-transfer-encoding: 8bit

(A copy of this message has also been posted to the following newsgroups:
news.admin.net-abuse.email, comp.mail.misc)

I do not want anymore crap from these jerks.

:0
* ^Subject:.*are\ you\ in\ need\ of\ a\ life
/dev/null

Felix

===========================================================



Received: from mustang.via.net (mustang.via.net [140.174.204.4])
        by mail.goodnet.com (8.8.7/8.8.6) with SMTP id MAA03994
        for <ftilley(_at_)goodnet(_dot_)com>; Sat, 27 Sep 1997 12:07:26 -0700 (MST)
From: N8dx1k7gM(_at_)unlimited(_dot_)net
Received: from ctcpXzPDJ  (dd30-242.dub.compuserve.com [199.174.147.242])
by mustang.via.net (8.6.9/8.6.9) with SMTP id LAA28431; Sat, 27 Sep 1997
11:45:38 -0700
DATE: 27 Sep 97 3:18:56 PM
Reply-to: PRR(_at_)UTP(_dot_)NET
Message-ID: <BrS5>
Received: From mailhost.UTP.net(alt1.utp..net(333.2.44.55)) by utp.net;Sat,
27 Sep 1997 15:18:56 -400 (EDT)
TO:
.............................................................................
(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_at_)mustang(_dot_)via(_dot_)net
SUBJECT: Are You In Need Of A Lifestyle Change...
X-UIDL: f1243434ba24adc40b99deff8469afa3
Status: O
X-Status: 

Now for the first time ever you have the opportunity to join the most
extraordinary and most powerful wealth building program in the world!
This program has never been offered to the general public until now! 
Because of your desire to succeed, you have been given the opportunity
to take a close look at this program.


[rest of crap deleted]

If you prefer to reply via email, my real address is below:

|---------------------------------|
| Note that From: line has been   |
| altered to foil email spammers. |
|                                 |
| %%% Felix Tilley                |
| %%% ftilley(_at_)%%%goodnet%(_dot_)com     |
|---------------------------------|

------------------------------

Date: Sun, 28 Sep 97 19:39:20 -0400
From: Timothy J Luoma <luomat+procmail(_at_)luomat(_dot_)peak(_dot_)org>
To: procmail(_at_)Informatik(_dot_)RWTH-Aachen(_dot_)DE
Subject: Re: Spam: Are You In Need Of A Lifestyle Change
Message-Id: <199709282339(_dot_)TAA07700(_at_)luomat(_dot_)peak(_dot_)org>
Content-Type: text/plain

        Author:        ftilley(_at_)goodnet(_dot_)com (Felix Tilley)
        Original-Date: Sun, 28 Sep 1997 15:53:01 -0700
        Message-ID:    <199709282249(_dot_)PAA11186(_at_)mail(_dot_)goodnet(_dot_)com>

> I do not want anymore crap from these jerks.

Tell me again why I need an update on your killfile?


> Real-Email-Address: I am ftilley(_at_)goodnet(_dot_)com (Felix Tilley)

This should be an X-Header.

TjL

------------------------------

Date: Sun, 28 Sep 1997 22:44:05 -0500 (CDT)
From: Jeff Thieleke <thieleke(_at_)ix(_dot_)netcom(_dot_)com>
To: ftilley(_at_)goodnet(_dot_)com (Felix Tilley)
Cc: procmail(_at_)Informatik(_dot_)RWTH-Aachen(_dot_)DE
Subject: Re: Spam: Are You In Need Of A Lifestyle Change
Message-Id: <199709290344(_dot_)WAA23488(_at_)ix(_dot_)netcom(_dot_)com>
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

(A copy of this message has also been posted to the following newsgroups:
news.admin.net-abuse.email, comp.mail.misc)

I do not want anymore crap from these jerks.

:0
* ^Subject:.*are\ you\ in\ need\ of\ a\ life
/dev/null

You *could* do this, but how likely are you to get this type
of subject again?  In any case, /dev/null'ing a non-spam specific
item is fairly dangerous.

I would suggest poking around in the header a bit more...


 
Received: from mustang.via.net (mustang.via.net [140.174.204.4])
        by mail.goodnet.com (8.8.7/8.8.6) with SMTP id MAA03994
        for <ftilley(_at_)goodnet(_dot_)com>; Sat, 27 Sep 1997 12:07:26 -0700 (MST)
From: N8dx1k7gM(_at_)unlimited(_dot_)net
Received: from ctcpXzPDJ  (dd30-242.dub.compuserve.com [199.174.147.242])
                 ^^^^^^^^???
by mustang.via.net (8.6.9/8.6.9) with SMTP id LAA28431; Sat, 27 Sep 1997
11:45:38 -0700
DATE: 27 Sep 97 3:18:56 PM
Reply-to: PRR(_at_)UTP(_dot_)NET
Message-ID: <BrS5>

Does anyone have a good Message-Id: recipe?  I came up with one that
validated Sendmail Message-Id's, but programs like Pine and qmail have
their own variations that break this.

* ^Message-Id: (<>|<none>|0000000000.\AAA000)
catches the obvious fakes, but not ids such as "BrS5"



Received: From mailhost.UTP.net(alt1.utp..net(333.2.44.55)) by utp.net;Sat,
                                          ^^    ^^^        ^^
Oops!  Each octet of an IPv4 address is an 8-bit value (0-255), so 333 is
no good.  There is a recipe for this type of fakery, but I don't have ready
access to it at the moment.   Can someone repost it?



27 Sep 1997 15:18:56 -400 (EDT)
TO:
.............................................................................
(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_dot_)(_at_)mustang(_dot_)via(_dot_)net


I *highly* recommend ending your .procmailrc with something like:

:0:
* ^TO.*MyEmailAddress(@)?
| formail -A"X-Sorted: To my email address" >>$DEFAULT

:0:
| formail -A"X-Sorted: Blocked - fell through .procmailrc" >>$BLOCKFOLDER


Where "MyEmailAddress" is replaced by your email address(es).  By dumping 
everything that is not specifically addressed to you to a non-default
folder, you virtually eliminate all spam that escapes your other filters.
This is after you filter out mailing lists and such, of course.


SUBJECT: Are You In Need Of A Lifestyle Change...
  ^^^^^^^
I have noticed that a lot of spam has all capital letters for
To:, From:, and Subject:.  Do any legitimate mail agents produce
such output?



X-UIDL: f1243434ba24adc40b99deff8469afa3
Status: O
X-Status: 

Now for the first time ever you have the opportunity to join the most
extraordinary and most powerful wealth building program in the world!
This program has never been offered to the general public until now! 
Because of your desire to succeed, you have been given the opportunity
to take a close look at this program.


[rest of crap deleted]

I wish you would have posted the rest.  In recent testing, I have
found that I can catch about 90% of my spam with these simple body searches:


:0 B
* (("remove"|remove)(.*^?.*)in(.*^?.*)the(.*^?.*)subject(.*^?.*)(field|line|header)?|\
   reply(.*^?.*)with(.*^?.*)the(.*^?.*)subject(.*^?.*)("remove"|remove)|\
   ("remove"|remove)(.*^?.*)on(.*^?.*)subject(.*^?.*)(field|line|header)?|\
   removed(.*^?.*)from(.*^?.*)our(.*^?.*)(mailing list|database))
{
 :0:
 | $FORMAIL -A"X-Sorted: *** SPAM! - Remove _THIS_!!! ***" >> $SPAMFOLDER
}

  
# Case SENSITIVE body check
:0 BD
* (GUARANTEED|FREE (OFFER|BONUS)|CREDIT|\
   LEGAL(LY)?|SECRETS|BULK EMAIL|CLICK NOW|\
   ORDER FORM|NO RISK|(MAKE|MAKING) MONEY|MLM)
{
 :0:
 | $FORMAIL -A"X-Sorted: *** SPAM! - Case sensitive keyword found in body of message ***" >>$SPAMFOLDER
}


# Case INSENSITIVE body check
:0 B
* (This(.*^?.*)is(.*^?.*)a(.*^?.*)one(.*^?.*)time(.*^?.*)mailing|\
   (You)?(.*^?.*)must(.*^?.*)be(.*^?.*)(over|at least)(.*^?.*)(18|21)|\
   No Credit Checks|\
   answerme\.com|\
   savetrees\.com|\
   (make|making) money (fast)?|\
   limited time offer|\
   send \$.* to|\
   order now)
{
 :0:
 | $FORMAIL -A"X-Sorted: *** SPAM! - Case insensitive phrase found in body of message ***" >>$SPAMFOLDER
}


In this case, I think a case insensitive search for "first time ever" or 
"wealth building program" would be a pretty safe bet.  However, I would rather 
catch spam with bogus header searches.  Unfortunately, except for the "333" IP
address, this spam actually has fairly clean headers.  It should have still
been blocked by the last recipe in your .procmailrc, however.

  

Jeff Thieleke

------------------------------

Date: Mon, 29 Sep 1997 09:23:36 +0300 (EET DST)
From: era eriksson <era(_at_)iki(_dot_)fi>
To: procmail(_at_)Informatik(_dot_)RWTH-Aachen(_dot_)DE
CC: ftilley(_at_)goodnet(_dot_)com
Subject: Re: Spam: Are You In Need Of A Lifestyle Change
Message-Id: <199709290623(_dot_)JAA16531(_at_)kontti(_dot_)Helsinki(_dot_)FI>
Content-Type: text/plain; charset=US-ASCII

On Sun, 28 Sep 1997 22:44:05 -0500 (CDT),
Jeff Thieleke <thieleke(_at_)ix(_dot_)netcom(_dot_)com> wrote:
:0
* ^Subject:.*are\ you\ in\ need\ of\ a\ life
/dev/null

It bears pointing out that you don't need to backslash-escape the
spaces. Might be a good idea, though, to try this (due to Rik Kabel): 

  * ^Subject:.*are\<*you\<*in\<*need\<*of\<*a\<*life

(BTW, wouldn't \<+ work better?)
  In my own recipes, I've used "life style" alone as a good sign that
a message is spam. (Other good ones include debt, phone card, long
distance, xxx, adult, etc, as well as the obvious MLM and FREE. A
friend of mine made the observation that a lot of spam and very little
legit mail includes the word "you" in the subject but I'm too chicken
to try that :-)

From: N8dx1k7gM(_at_)unlimited(_dot_)net

I've been thinking about ways to catch these. They're fairly obvious
to the human eye but hard to pin down in any meaningful way. Ideas,
anyone? Also note that there is no "unlimited.net" anywhere in the
Received: lines. (Shouldn't a strictly conformant message have a
Sender: with the real sender ID if you're overriding From: and if so,
how many are doing this in practice?)

Received: from ctcpXzPDJ  (dd30-242.dub.compuserve.com [199.174.147.242])
                 ^^^^^^^^???
by mustang.via.net (8.6.9/8.6.9) with SMTP id LAA28431; Sat, 27 Sep 1997

The simple fact that the stuff in the parens doesn't match what the
sender said is already a good clue. It happens a lot on legitimate
mail but it's a good thing to include in a scoring recipe. 

Message-ID: <BrS5>
Does anyone have a good Message-Id: recipe?  I came up with one that
validated Sendmail Message-Id's, but programs like Pine and qmail have
their own variations that break this.
* ^Message-Id: (<>|<none>|0000000000.\AAA000)
catches the obvious fakes, but not ids such as "BrS5"

Here's what I've been using. There is software out there that breaks
RFC822 in that they don't include an "@" in the Message-Id. I don't
care too much since I see them in my spam tank but if you send stuff
to /dev/null, you'll probably want to take out the @ part. 

:0
* ! ^Message-Id:[       ]*<[^   <>@]+@[^   <>@]+>[         ]*$
{ REJECT="$REJECT${REJECT:+$NL}${REJ}No valid Message-Id" }
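
For trying the same sanity check outside procmail, here is a minimal Python sketch. It is an illustration only, not one of the posted recipes: the function name has_valid_message_id is made up for this example, and it assumes the headers have already been unfolded so that each header sits on one line.

```python
import re

# Same shape as the recipe: "<", one or more characters that are not
# whitespace, "<", ">" or "@", then "@", the same again, then ">".
MESSAGE_ID_RE = re.compile(
    r"^Message-Id:[ \t]*<[^ \t\r\n<>@]+@[^ \t\r\n<>@]+>[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def has_valid_message_id(headers: str) -> bool:
    """Return True if some Message-Id: line has the <left@right> shape."""
    return MESSAGE_ID_RE.search(headers) is not None


print(has_valid_message_id("Message-ID: <BrS5>"))  # False
print(has_valid_message_id(
    "Message-Id: <199709282249.PAA11186@mail.goodnet.com>"))  # True
```

As with the recipe, dropping the "@" requirement would be the safer choice if a failed check leads straight to /dev/null.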


Received: From mailhost.UTP.net(alt1.utp..net(333.2.44.55)) by utp.net;Sat,
                                          ^^    ^^^        ^^
Oops!  IP (IPv4) numbers are 8 bit value (0-255)...333 is no good.  There is a
recipe for this type of fakery, but I don't have ready access to it
at the moment.   Can someone repost it?

I only have badly working ones on file. The primary problem with these
is that there will be other numbers in those headers which look a lot
like IP numbers unless you preparse them a little bit (for instance,
Microsoft Mail Server Received: lines contain a version number which
is something like 4.0.994.63) but you can get pretty far by looking
only at Received: lines which are more or less like what Sendmail
generates and see if there's a "reverse lookup" number which looks
faked. The general format of these is 

  Received: from hostA by hostB (hostC [IP number])

but you'd have to find an efficient way to fish out the IP number from
all of them and look at each. (The semi-obvious Procmail-only solution
ends up looking only at the first one. You could make it look at only
the last one instead and be fairly safe that this is almost always
right, but it all smacks of kludgery in the end. Anybody have an
elegant solution?)
  Note that you'd generally want hostB and hostC to be more or less
the same, but you can't dump a message merely because they don't
match. For one thing, you often see host names with aliases (i.e.
moo.net (mail.moo.net [123.45.67.89]) and even a.net (b.com))
  Another thing I've been trying somewhat unsuccessfully to match is
the fairly common spammer trick to say HELO receivinghost resulting in
Received: from receivinghost (otherhost [blah blah]) by receivinghost 
but that's not a sure sign it's spam, either.

Where "MyEmailAddress" is replaced by your email address(es).  By dumping 
everything that is not specifically addressed to you to a non-default
folder, you virtually eliminate all spam that escapes your other filters.
This is after you filter out mailing lists and such, of course.

This is dubious advice, but you probably know that already. Some
people receive legitimate BCC:s, others don't. 

address, this spam actually has fairly clean headers.  It should have still

Huh? It's +terribly+ forged. Most of the Received: headers will always
look more or less legitimate because they're added by legitimate
software. One faked Received: line and you're dead in my book, though.
Also note that the domain on the To: address was added at
mustang.via.net. Could have been forwarded to you from there, but
that's another thing to hang on to. Finally, I've been thinking about
a recipe to catch the situation where there's a From: and a Reply-To:
but neither appears in any legit Received: lines. (Okay, if you had
missed the fact that the final Received: is fake, this one would have
slipped through that crack, but not a Message-Id sanity check.) 
  Also, Felix, did your local software add the X-Uidl header or was it
in the spam itself?

/* era */

-- 
 Paparazzi of the Net: No matter what you do to protect your privacy,
  they'll hunt you down and spam you. <http://www.iki.fi/~era/spam/>

------------------------------

Date: Mon, 29 Sep 1997 09:54:10 +0100 (WET DST)
From: "J. Daniel Smith" <J(_dot_)Daniel(_dot_)Smith(_at_)WriteMe(_dot_)com>
To: era eriksson <era(_at_)iki(_dot_)fi>
Cc: procmail(_at_)Informatik(_dot_)RWTH-Aachen(_dot_)DE, 
ftilley(_at_)goodnet(_dot_)com
Subject: Re: Spam: Are You In Need Of A Lifestyle Change
Message-Id: <19970929095410(_dot_)822014(_dot_)FMU10042(_at_)handel>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

era eriksson writes on 29 September 1997 at 09:23:36
From: N8dx1k7gM(_at_)unlimited(_dot_)net

I've been thinking about ways to catch these. They're fairly obvious
to the human eye but hard to pin down in any meaningful way. Ideas,

I've got recipes to deal with more obvious ones, but this name is (or
could be) a legitimate user-id on some systems.  Perhaps someone with
experience on these systems (mainframes, Prodigy, etc.) can tell us
how these cryptic (to us at least) logins are formed...maybe there's a
pattern we could use...

To keep this thread more on the topic of procmail and less on SPAM :-),
here's what I use to validate email addresses
  
FROM=${FROM:-"(^((X-(Envelope-)?)?(Apparently-|Resent-)*(From|Reply-To|Sender):\
(.*[^-a-z0-9_])?|From ([^       ]*[-_@!.])?))"}
  # Don't accept all syntactically valid addresses; who's going to have 
  # a real email address of "foo_@-bar-.com"?
  spamcheck_word="[a-z0-9][-a-z0-9_.+]*[a-z0-9]+"
  spamcheck_tld="(com|gov|org|edu|net|int|[a-z][a-z])"
  spamcheck_email="\<${spamcheck_word}@(${spamcheck_word}\.)+${spamcheck_tld}\>"
  :0h
  * $^TO${spamcheck_email}
  * $${FROM}${spamcheck_email}
  { }
  :E
  { ... deal with spam here ... }
Note that the address in question would pass this check.

Where "MyEmailAddress" is replaced by your email address(es).  By dumping 
everything that is not specifically addressed to you to a non-default
[...]
This is dubious advice, but you probably know that already. Some
people receive legitimate BCC:s, others don't. 

I've hooked this up in my "spamcheck" recipes, giving anything that
matches a slight non-spam weight
  SPAMCHECK_ME_RE=${SPAMCHECK_ME_RE:-$LOGNAME}
  :0
  * $^TO\/${SPAMCHECK_ME_RE}
  {
    spamcheck_contribution=${SPAMCHECK_TOME_SCORE:-"-$SPAMCHECK_20"}
    spamcheck_reason="TOME - explicit recipient: $MATCH"
    spamcheck_rcpath=$_
    INCLUDERC=$SPAMCHECK_RCDIR/reason.rc
  }

 Also, Felix, did your local software add the X-Uidl header or was it
in the spam itself?

I've found that the X-Uidl: and Pegasus MUA checks below catch an
awful lot of spam.

  # Might need to be a little more particular here; 
  # Philip Guenther <guenther(_at_)gac(_dot_)edu>: If a message comes into your
  # mailbox that has the X-UIDL: header, and doesn't have your address in
  # the header, then I would have strong doubts about its legitimacy. 
  #
  # Edward J. Sabol <sabol(_at_)alderaan(_dot_)gsfc(_dot_)nasa(_dot_)gov>: E-mails with
  # X-UIDL: headers are almost definitely spam unless they've been
  # Resent-To: me by someone. Also, valid X-UIDL: headers have 32 hexadecimal
  # digits exactly.
  :0
  * ^X-UIDL:
  * !^X-UIDL:[  ]*[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]\
                  [0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]\
                  [0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]\
                  [0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]\
                  [0-9a-f][0-9a-f][0-9a-f][0-9a-f][     ]*$
  * !^Resent-To:
  { ... spam processing ... }
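
The three conditions of the X-UIDL recipe can be checked against saved messages with a short Python sketch. This is only an illustration under assumptions: the name uidl_looks_forged is hypothetical, and the headers are assumed to be unfolded, one header per line.

```python
import re

FLAGS = re.IGNORECASE | re.MULTILINE

HAS_UIDL = re.compile(r"^X-UIDL:", FLAGS)
# A "valid" X-UIDL: value is exactly 32 hexadecimal digits.
VALID_UIDL = re.compile(r"^X-UIDL:[ \t]*[0-9a-f]{32}[ \t]*$", FLAGS)
RESENT_TO = re.compile(r"^Resent-To:", FLAGS)


def uidl_looks_forged(headers: str) -> bool:
    """Mirror the recipe: an X-UIDL: header is present, none of them has
    the 32-hex-digit form, and the message was not resent."""
    return (
        HAS_UIDL.search(headers) is not None
        and VALID_UIDL.search(headers) is None
        and RESENT_TO.search(headers) is None
    )


print(uidl_looks_forged("X-UIDL: f1243434ba24adc40b99deff8469afa3"))
print(uidl_looks_forged("X-UIDL: 12345"))
```

The X-UIDL: from the spam under discussion has the valid form, so this check alone would not flag it.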

  # From: Gregory Sutter <gsutter(_at_)ugems(_dot_)psu(_dot_)edu>
  # Pegasus mailer is the only mailer which legitimately generates
  # "Comments: Authenticated sender is ..." so kill anything else.
  :0
  * ^Comments:.*Authenticated sender
  * !^X-Mailer:.*Pegasus Mail
  * !^Resent-To:
  {
    # can such mail *ever* be legit?
    ... spam processing ...
  }

   Dan
------------------- message is author's opinion only ------------------
J. Daniel Smith <DanS(_at_)bristol(_dot_)com>        http://www.bristol.com/~DanS
Bristol Technology B.V.                   +31 33 450 50 50, ...51 (FAX)
Amersfoort, The Netherlands               {info,jobs}(_at_)bristol(_dot_)com

------------------------------

Date: Mon, 29 Sep 1997 04:01:35 -0500 (CDT)
From: Jeff Thieleke <thieleke(_at_)ix(_dot_)netcom(_dot_)com>
To: procmail(_at_)Informatik(_dot_)RWTH-Aachen(_dot_)DE
Subject: Re: Spam: Are You In Need Of A Lifestyle Change
Message-Id: <199709290901(_dot_)EAA27704(_at_)ix(_dot_)netcom(_dot_)com>
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

On Sun, 28 Sep 1997 22:44:05 -0500 (CDT),
Jeff Thieleke <thieleke(_at_)ix(_dot_)netcom(_dot_)com> wrote:
 >> :0
 >> * ^Subject:.*are\ you\ in\ need\ of\ a\ life
 >> /dev/null

It bears pointing out that you don't need to backslash-escape the
spaces. Might be a good idea, though, to try this (due to Rik Kabel): 

Careful on whom you are quoting...I did not write the 
 "* ^Subject:.*are\ you\ in\ need\ of\ a\ life"
material, although since you deleted the original poster's name, it looks
as though I did...



 >> Received: from ctcpXzPDJ  (dd30-242.dub.compuserve.com [199.174.147.242])

The simple fact that the stuff in the parens don't match what the
sender said are already a good clue. It happens a lot on legitimate
mail but it's a good thing to include in a scoring recipe. 

It is a so-so clue, at best - "It happens a lot on legitimate mail" says it all.
Knowing that it usually happens on compuserve.com, att.com, psi.net, uu.net, 
and a few others might make for a better recipe...


 >> Message-ID: <BrS5>
 > Does anyone have a good Message-Id: recipe?  I came up with one that
 > validated Sendmail Message-Id's, but programs like Pine and qmail have
 > their own variations that break this.
 > * ^Message-Id: (<>|<none>|0000000000.\AAA000)
 > catches the obvious fakes, but not ids such as "BrS5"

Here's what I've been using. There is software out there that breaks
RFC822 in that they don't include an "@" in the Message-Id. I don't
care too much since I see them in my spam tank but if you send stuff
to /dev/null, you'll probably want to take out the @ part. 

:0
* ! ^Message-Id:[     ]*<[^   <>@]+@[^   <>@]+>[         ]*$
{ REJECT="$REJECT${REJECT:+$NL}${REJ}No valid Message-Id" }


 >> Received: From mailhost.UTP.net(alt1.utp..net(333.2.44.55)) by utp.net;Sat,
 >                                           ^^    ^^^        ^^
 > Oops!  IP (IPv4) numbers are 8 bit value (0-255)...333 is no good.  There is a
 > recipe for this type of fakery, but I don't have ready access to it
 > at the moment.   Can someone repost it?

I only have badly working ones on file. The primary problem with these
is that there will be other numbers in those headers which look a lot
like IP numbers unless you preparse them a little bit (for instance,
Microsoft Mail Server Received: lines contain a version number which
is something like 4.0.994.63) but you can get pretty far by looking
only at Received: lines which are more or less like what Sendmail
generates and see if there's a "reverse lookup" number which looks
faked. The general format of these is 


I thought about this one for a while, and after finding a couple of
bad ones (probably the same ones you have!), I came up with:

# Tag this spam if the Internet IPv4 address in Received: is either:
#    a: the first octet is 0, 0[0-9], 0[0-9][0-9], or >255       
#    b: any of the other octets are 0[0-9], 0[0-9][0-9], or >255       
:0
* ^Received: (.*(\[|\()(0[0-9]?[0-9]?|25(6|7|8|9)|2[6-9][0-9]|[3-9][0-9][0-9])|\
              .*\.(0[0-9][0-9]?|25(6|7|8|9)|2[6-9][0-9]|[3-9][0-9][0-9])) 


I think it is pretty safe for legit email (including your Microsoft example
above) and it should catch most obvious spam (including the above).  Since I
just wrote this, it might still need some fine tuning, though...
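
The octet rules in the comment above the recipe (a leading zero, a value over 255, or a first octet of 0) can be tried out with a small Python sketch. The names octet_is_bad and received_has_bad_ip are made up for this example. One deliberate difference from the recipe: the sketch only looks at dotted quads enclosed in brackets or parentheses, because the second alternative of the recipe as written would also fire on a bare ".994" such as the one in a 4.0.994.63 version string.

```python
import re

# A dotted quad directly enclosed in [] or (), as in "(host [1.2.3.4])".
QUAD_RE = re.compile(r"[\[(](\d+)\.(\d+)\.(\d+)\.(\d+)[\])]")


def octet_is_bad(octet: str, first: bool) -> bool:
    """Apply the rules from the recipe comment to one octet."""
    if len(octet) > 1 and octet.startswith("0"):
        return True  # leading zero, e.g. "07" or "012"
    if int(octet) > 255:
        return True  # does not fit in 8 bits
    return first and octet == "0"  # a first octet of 0 is also rejected


def received_has_bad_ip(line: str) -> bool:
    """Return True if any bracketed dotted quad in the line breaks a rule."""
    for match in QUAD_RE.finditer(line):
        octets = match.groups()
        if any(octet_is_bad(o, i == 0) for i, o in enumerate(octets)):
            return True
    return False


spam = "Received: From mailhost.UTP.net(alt1.utp..net(333.2.44.55)) by utp.net"
legit = "Received: from mustang.via.net (mustang.via.net [140.174.204.4])"
print(received_has_bad_ip(spam), received_has_bad_ip(legit))
```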



 > Where "MyEmailAddress" is replaced by your email address(es).  By dumping 
 > everything that is not specifically addressed to you to a non-default
 > folder, you virtually eliminate all spam that escapes your other filters.
 > This is after you filter out mailing lists and such, of course.

This is dubious advice, but you probably know that already. Some
people receive legitimate BCC:s, others don't. 

Dubious?  If you get a lot of BCC:'ed mail then it obviously isn't a good idea,
but for the majority of people, I'm willing to bet they receive 99 spams for
every legit BCC:'ed email.  In any event, sending the BCC:'ed email to a 
separate folder doesn't hurt, and it stops spam cold when all of your other
filters fail.
 

 > address, this spam actually has fairly clean headers.  It should have still

Huh? It's +terribly+ forged. Most of the Received: headers will always

Well *of course* it is forged, but compared to the majority of spam that I see, 
the headers are rather innocent looking:

  * No Comments: Authenticated sender is...
  * No X-Advertisement: or friends
  * No X-PMFLAGS:
  * The X-UIDL: is "valid" (correct length, not all numbers)
  * The To: field wasn't something like "friend" or "you&I" or @public.com


That is what I classify as clean spam!  :)



Jeff Thieleke

------------------------------

Date: Mon, 29 Sep 1997 12:38:18 +0300 (EET DST)
From: era eriksson <era(_at_)iki(_dot_)fi>
To: procmail(_at_)Informatik(_dot_)RWTH-Aachen(_dot_)DE
Cc: thieleke(_at_)ix(_dot_)netcom(_dot_)com
Subject: IP number checking (was Re: Spam: Are You In Need Of A Lifestyle Change)
Message-Id: <199709290938(_dot_)MAA12577(_at_)kontti(_dot_)Helsinki(_dot_)FI>
Content-Type: text/plain; charset=US-ASCII

On Mon, 29 Sep 1997 09:23:36 +0300 (EET DST), I wrote:
Jeff Thieleke <thieleke(_at_)ix(_dot_)netcom(_dot_)com> wrote:
Received: From mailhost.UTP.net(alt1.utp..net(333.2.44.55)) by utp.net;
                                         ^^    ^^^        ^^
Oops! IP (IPv4) numbers are 8 bit value (0-255)...333 is no good.
There is a recipe for this type of fakery, but I don't have ready
access to it at the moment. Can someone repost it?
I only have badly working ones on file. The primary problem with these
is that there will be other numbers in those headers which look a lot
like IP numbers unless you preparse them a little bit (for instance,

Blah blah. Try this: 

  * ^Received: from [^[( ]+ ?[[(]?(([a-z][-a-z0-9._]*)* ?)? ?[[(]\
        ((0|1?[1-9][0-9]?|2[0-4][0-9]|25[0-5])\.)*\
        (25[6-9]|[3-9][0-9][0-9]|[1-9][0-9][0-9][0-9]|0[])])

Paraphrase: from hostA (hostB [((valid IP numbers)\.)*invalid], with
the final octet being 0 also counting as invalid. This does not look
at the number of octets but making it look for exactly four octets
should be fairly easy (ideally, it should +match+ on anything with
one, two, three, or more than four octets, but leave four valid octets
alone. For logging purposes, getting all four into $MATCH would be
nice:

  * ^\/Received: from [^[( ]+ ?[[(]?(([a-z][-a-z0-9._]*)* ?)? ?[[(]\
        ((0|1?[1-9][0-9]?|2[0-4][0-9]|25[0-5])\.)*\
        (25[6-9]|[3-9][0-9][0-9]|[1-9][0-9][0-9][0-9])\
        (\.(0|1?[1-9][0-9]?|2[0-4][0-9]|25[0-5]))*(\.0)?[])]
  {
    LOG="$MATCH
"
  }

This is what I actually tested with -- I hope I didn't break it
somewhere along the way. Still no check for valid number of octets.)

I checked this quickly against the last few days' worth of spam from
the spam-list and found a handful of invalids. These were spams I had
already filtered on other grounds. Of the (67) spams my filters have
missed in the last couple of weeks, none were caught by this recipe.

is something like 4.0.994.63) but you can get pretty far by looking
only at Received: lines which are more or less like what Sendmail
generates and see if there's a "reverse lookup" number which looks
faked. The general format of these is 
  Received: from hostA by hostB (hostC [IP number])

Correction: Received: from hostA (hostB [IP number]) by hostC

Like I said, the above recipe only looks at the IP numbers in
Received: lines in exactly this form, with the slight modification
that I allow either normal or square brackets in both places, the
hostB is optional (as it would be when the IP number doesn't resolve),
and the spaces before the brackets are optional (spam software seems
to leave them out a lot, probably because the people who programmed
them don't have any aesthetic sense :-)
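
The Received: form described here, "from hostA (hostB [IP number]) by hostC" with either kind of bracket, an optional hostB and optional spaces, can be pulled apart in Python, which also makes the missing check for exactly four octets easy to add. This is a rough sketch under assumptions, not a translation of the recipe: parse_received and ip_is_well_formed are hypothetical names, the line is assumed to be unfolded, and a final octet of 0 is not treated as invalid here.

```python
import re
from typing import Optional, Tuple

RECEIVED_RE = re.compile(
    r"^Received:\s+from\s+(?P<helo>[^\s\[(]+)\s*"          # hostA
    r"(?:[\[(]\s*)?"                                       # optional opener
    r"(?:(?P<rdns>[A-Za-z][-A-Za-z0-9._]*)\s*)?"           # optional hostB
    r"[\[(](?P<ip>[0-9.]+)[\])]",                          # [IP] or (IP)
    re.IGNORECASE,
)


def parse_received(line: str) -> Optional[Tuple[str, Optional[str], str]]:
    """Return (hostA, hostB or None, IP text), or None if the line is not
    in the Sendmail-like form."""
    m = RECEIVED_RE.match(line)
    if m is None:
        return None
    return m.group("helo"), m.group("rdns"), m.group("ip")


def ip_is_well_formed(ip: str) -> bool:
    """Exactly four octets, each a number from 0 to 255."""
    parts = ip.split(".")
    return len(parts) == 4 and all(p.isdigit() and int(p) <= 255 for p in parts)


line = "Received: From mailhost.UTP.net(alt1.utp..net(333.2.44.55)) by utp.net;Sat,"
parsed = parse_received(line)
print(parsed, parsed is not None and ip_is_well_formed(parsed[2]))
```

Lines that do not parse (for instance a Microsoft Mail Server line carrying only a version number) return None and are simply left alone.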

Hope this helps,

/* era */

Here's the matches I found and my other reasons for rejecting them:

 $ cat ~/scratch/inbox/spam-filtered.* | 
 formail -s procmail ~/scratch/testing/.rc
 Received: from clift.b89_crost.com (clift.b89_crost.com [199.3.12.256]
 X-Rejected: Spam score +1
 X-Rejected: Received: after From: [5]
 X-Rejected: Spam score 5
 X-Rejected: From killfiled domain @usa.net [5]
 X-Rejected: Over 6000 bytes [5]
 X-Rejected: body contains ugly words [5:18]
 X-Rejected: body contains too many URL:s [+13]
 Received: from in2.i_b_m.net (in2.i_b_m.net [165.87.194.259]
 X-Rejected: To: equals From: f5net(_at_)hotmail(_dot_)com
 X-Rejected: Suspect From: hotmail.com not in Received: lines
 X-Rejected: body contains ugly words [0:3]
 Received: From mailhost.alp.net(alt1engery.it.com(983.2.33.57)
 X-Rejected: No valid Message-Id
 X-Rejected: To: equals Reply-to: 973Jim(_at_)dsnnet(_dot_)it
 X-Rejected: Received contains earthlink
 Received: from clift.b89_crost.com (clift.b89_crost.com [199.3.12.256]
 X-Rejected: Spam score +1
 X-Rejected: Received: after From: [7]
 X-Rejected: Spam score 7
 X-Rejected: From killfiled domain @usa.net [7]
 Received: From mailhost.west.com(alt1west.com(333.2.44.55)
 X-Rejected: No valid Message-Id
 X-Rejected: Received: after From: [3]
 X-Rejected: Spam score 3
 X-Rejected: From .com [3]

Note that only four of these X-Rejected lines are not based on
somewhat fuzzy and/or risky heuristics (From: equals To:/Reply-to: and
No valid Message-Id:) and so adding the sure-fire IP number sanity
check would probably be a good idea.

Relying on the faked Received: lines actually containing well-formed
host names might not be too wise. I saw several Received: lines with a
host name with a shout mark in it in the testing material

-- 
 Paparazzi of the Net: No matter what you do to protect your privacy,
  they'll hunt you down and spam you. <http://www.iki.fi/~era/spam/>

------------------------------

Date: Mon, 29 Sep 1997 11:50:36 +0200 (MET DST)
From: Vu Quoc Hu`ng <hung(_at_)vsb(_dot_)cz>
To: procmail(_at_)Informatik(_dot_)RWTH-Aachen(_dot_)DE
Cc: procmail(_at_)Informatik(_dot_)RWTH-Aachen(_dot_)DE
Subject: [delete junk mail] How to delete a mail?
Message-Id: <Pine(_dot_)ULT(_dot_)3(_dot_)91(_dot_)970929113253(_dot_)21607A-100000(_at_)decsys(_dot_)vsb(_dot_)cz>
Content-Type: TEXT/PLAIN; charset=US-ASCII

hi all,
I don't have much time, and I know you are gurus at working with procmail,
so may I ask about these problems:

1) What must I append to rc.maillists so that I can immediately delete
mail from a particular sender?

2) How can I detect a difference between the "Message-Id:" and "From:"
fields, so that I can tell I've got a "junk mail" and delete it
immediately? How would you implement that in procmail?

for example see:
From: Vu Quoc Hu`ng <hung(_at_)vsb(_dot_)cz>
                         ^^^^^^^ 
Message-Id: 
<Pine(_dot_)ULT(_dot_)3(_dot_)91(_dot_)970929103000(_dot_)16544A-100000(_at_)decsys(_dot_)vsb(_dot_)cz>
                                                             ^^^^^^^            
                                
You can see that the two fields contain the same host, vsb.cz, which means
this mail is from a trustworthy sender, so you would not delete it.

Please get me a copy of answer to:
                                                hung(_at_)vsb(_dot_)cz(_dot_)
+---------------------------+----------------------------------------+
     "Bill Gate's daughter will be the best product from MS".

Thank you very much.

--------------------------------
End of procmail-d Digest V97 Issue #297
***************************************
