procmail

Re: Base64 Spam.

2003-06-13 14:17:18
On Fri, 13 Jun 2003 09:17:55 +0200, Dallman Ross <dman(_at_)nomotek(_dot_)com>
wrote:

. . . what was puzzling me is some
legitimate emails got caught through some of the filters.

I used a sandbox to test the following (not all mine, some are, and
some from the discussions in the archives):

Since you are using a sandbox and have a log, here's a suggestion:
turn on verbose logging and see what exactly caught it and what
didn't.
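
For anyone following along, here is a minimal sketch of sandbox settings that turn verbose logging on. The directory and log file names are hypothetical; VERBOSE and LOGFILE are the standard procmailrc variables.

```procmail
# Sandbox preamble. The directory and file names below are
# hypothetical; adjust them to your own test setup.
MAILDIR=$HOME/procmail-sandbox
LOGFILE=$MAILDIR/sandbox.log

# Log every condition procmail tests, as "Match on" / "No match on".
VERBOSE=yes
```

With this in place, each condition of each recipe shows up in the log, so you can see which one fired on the legitimate message.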

I did from the first minute (I would never run this in a sandbox
without verbose logging). I tried the legitimate message and a few
other old spam messages that I kept for such tests, and they were all
treated the same.

Anyway, I guess I will have to read messages that are legitimate but
not whitelisted a bit late. :)


I think you will find the first recipe set (which is the one I
presume you got from the archives) works as advertised, while the
second (which is the one I presume you wrote) is giving you the
false pozzes.


## Base64 encoded html spam in message headers
:0  
  * ^Content-Type:(.*\<)?text/(html|plain)
  * ^Content-Transfer-Encoding:(.*\<)?base64
 {
 LOG="Base64 Encoded SPAM Headers $NL"
 LOGABSTRACT=ALL
 :0:
$base64spam-headers
 LOGABSTRACT=NO
}


While the recipe set looks okay to me, I bet you will discover
that you get the exact same log output if you leave those
LOGABSTRACT lines out altogether.  Certainly you don't need
to turn on "all" to quit logging after one recipe.  The logging
of the last recipe is the default, anyway.  And as soon as
your spam gets past your first LOGABSTRACT invocation here,
it is saved, and procmail ends, so the second invocation won't
ever even happen.  
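
As a sketch of what that leaves, here is the same header recipe with only the two LOGABSTRACT lines dropped. It assumes, as the original does, that NL and base64spam are defined earlier in the rc file; nothing else is changed.

```procmail
## Base64 encoded html spam in message headers
:0
* ^Content-Type:(.*\<)?text/(html|plain)
* ^Content-Transfer-Encoding:(.*\<)?base64
{
  # NL and base64spam are assumed to be set earlier in the rc file.
  LOG="Base64 Encoded SPAM Headers $NL"

  # Delivering recipe with a local lockfile. The default log
  # abstract is written when procmail exits after this delivery.
  :0:
  $base64spam-headers
}
```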

Thank you for the tip.
I wasn't aware of that.


## Base64 encoded html spam in message body.
:0
 * B ?? (Content-Type:.*text/html;)
 * B ?? (Content-Transfer-Encoding: base64)

There is no reason for the parentheses in either condition.
You can just lose them.


OK.

 {
 LOG="Base64 Encoded SPAM in Body $NL"
 LOGABSTRACT=ALL
 :0:
$base64spam-body
 LOGABSTRACT=NO
}


Since your shmancy logging directions aren't doing anything
useful, there really is no reason for the nested braces at
all.  Just run your recipe, with a lock, and with the filename
to save to on the action line.


OK.


And you want to put ^ at the start of the expression, anyway, to
save procmail lots and lots of work looking rightward of the start
of each line when it doesn't need to.
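
Taken together, the three suggestions so far (no parentheses, no braces, a leading ^) would give something like the sketch below. It is only a tidied form of the original body recipe, not a fix for the false positives, and it assumes base64spam is set earlier in the rc file.

```procmail
## Base64 encoded html spam in message body.
:0:
* B ?? ^Content-Type:.*text/html;
* B ?? ^Content-Transfer-Encoding: base64
$base64spam-body
```

Note that the two conditions are still tested independently against the whole body, so a message with an HTML part and a separate base64-encoded attachment satisfies both.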




Okay, so you have a recipe that catches too much, but you didn't
analyze the (verbose) logs to see if you could help it, 

No, I did.
But again, there was no difference between the two messages.

and instead
you want, right away, to run an external base64 decoder on all this
mail so you can further grep bodies for spamish content, all for
the "very few spammers" who are messing with you this way, and
that's a helluva capitulation, process-wise, to have to make.
It isn't necessary.  Let's fix your broken recipe and allay the
false pozzes right there.


      :0:
      * B ?? ^Content-Type:(.*\<)?text/.*\
              ^Content-Transfer-Encoding:(.*\<)?base64
      $base64spam-body


Try that in your sandbox.


OK, but this will fail on messages with no Content-Transfer headers; I
have seen lots of messages with that kind of mess.

Here is a sample I tested.


Spam Message:

From jenetterarg(_at_)excite(_dot_)com  Tue Jun 10 14:34:13 2003
Return-Path: <jenetterarg(_at_)excite(_dot_)com>
Received: from excite.com (adsl-67-112-111-117.dsl.scrm01.pacbell.net
[67.112.111.117])
        by mydomain (8.11.6/verio) with SMTP id h5AIYAN28232
        for <multimedia-fan(_at_)mydomain>; Tue, 10 Jun 2003 14:34:11 -0400
Message-ID: <000100b3ec31$ddd74832$80065612(_at_)lgpkvqa(_dot_)rpq>
From: "Septic Savior" <jenetterarg(_at_)excite(_dot_)com>
To: "S E P T I C  Clog"
Subject: End Septic Problems - TRY SPC RISK-FREE FOR THIRTY DAYS!
3591LfXs5-340fhnu55-18
Date: Tue, 10 Jun 2003 06:13:50 +1200
MIME-Version: 1.0
Content-Type: multipart/mixed;
        boundary="----=_NextPart_000_00E3_01B14A3A.E0116B77"
X-Priority: 3
X-Mailer: Microsoft Outlook Express 5.00.2615.200
Importance: Normal

------=_NextPart_000_00E3_01B14A3A.E0116B77
Content-Type: text/html;
        charset="iso-8859-1"
Content-Transfer-Encoding: base64


PEhUTUw+PEZPTlQgQ09MT1I9IiMwMDAwMDAiIFNJWkU9MyBGQU1JTFk9IlNB
TlNTRVJJRiIgRkFDRT0iQXJpYWwiIExBTkc9IjAiPjxGT05UIENPTE9SPSIj
ZmZmZmYiPjE0MTY3MDc3MzcyMTc4MTQzMjgxODc1MjUwNzM3MzM3Njg2MzA2

[Rest of message body deleted for space considerations]


Here is the log.

procmail: No match on "^Content-Type:(.*\<)?text/.*^Content-Transfer-Encoding:(.*\<)?base64"
procmail: Locking "/var/spool/mail/multimedia-fan.lock"
procmail: Assigning "LASTFOLDER=/var/spool/mail/multimedia-fan"
procmail: Opening "/var/spool/mail/multimedia-fan"
procmail: Acquiring kernel-lock
procmail: Unlocking "/var/spool/mail/multimedia-fan.lock"
From jenetterarg(_at_)excite(_dot_)com  Tue Jun 10 14:34:13 2003
 Subject: End Septic Problems - TRY SPC RISK-FREE FOR THIRTY DAYS!
  Folder: /var/spool/mail/multimedia-fan                                 3959


Of course, I can understand why this happened with this message, since
you were looking for the header.

Also, such a message would be caught by checking for messages that have
a Content-Type: multipart/mixed; header but only one part.
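
Another possible reading of the "No match" line in the log: in a procmail regex, `.` does not match a newline and `^` matches exactly one, so the joined pattern needs Content-Transfer-Encoding on the line immediately after the Content-Type line. In the sample, the folded `charset=` line sits between the two. Below is an untested sketch of a variant that allows folded continuation lines in between. It assumes base64spam is set earlier in the rc file, and that the bracket holds one space and one literal tab.

```procmail
:0:
* B ?? ^Content-Type:(.*\<)?text/.*\
       (^[ 	].*)*\
       ^Content-Transfer-Encoding:(.*\<)?base64
$base64spam-body
```

Check in your editor that the tab inside the brackets has not been turned into spaces, then run it in the sandbox against both the spam sample and the legitimate message.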






