procmail

Re: Base64 Spam.

2003-08-21 15:24:59
On Mon, Jun 16, 2003 at 02:41:55PM -0400, Paul Chvostek wrote:

(Sorry, this was sent to the list on the day I flew out of the
country in June.  When I returned some weeks later, I culled
through a couple thou list messages.  I saved this one until I
had a spare moment to think straight.) :-)

On Fri, Jun 13, 2003 at 12:25:21AM +0200, Dallman Ross wrote:

      :0  # 030504 () where's the "multipart"?  There's just one encoded part
       *                  CTYPE  ??  ^^multipart/mixed
       *          2^0  B         ??  ^Content-Transfer-Encoding:(.*\<)?\
                                      (base64|7bit)
       * $ -$MAXINT^0
       *         -1^0  B         ??  ^Content-Type:(.*\<)?text/plain
       *         -1^1  B         ??  ^Content-Type:
       * $  $MAXINT^0
       { RX = "${RX:+$RX, }UBE.B+CT.MISMATCH:1" }

I think I get what this is doing, but wouldn't something like this be
simpler?

  :0
  * ^Content-Type: multipart/
  {
    :0 B
    * 1^1   ^Content-Type:
    { PARTS=$= }

    :0 A
    * PARTS ?? ^^1^^
    { W = "multipart: just one part" }

    :0 E
    * PARTS ?? ^^0^^
    { W = "multipart: no parts" }
  }

Simpler, certainly.  The point of my ol' Infinity Bop is to
minimize impact on the server.  First, I have one recipe while
you have four.  Granted, nested braces are pretty _de minimis_.
If you find it aids readability that much, then go for it.

Second, though, your first subnested recipe will run
from start to tail through every (long) multipart email.
That was specifically the needless step I wished to avoid.
By ratcheting up against the max-int value of procmail's
scoring engine ("infinity"), we can stop processing after
the second part has been found.  That will save a bit in
every legitimate multipart message.  We'll only need to
keep searching to the end on actual culprits.
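
To make the saving concrete, here is a rough Python model of the
"-1^1  B ?? ^Content-Type:" condition.  It is only a sketch of the
behaviour described above, not procmail's actual code: the function
name, the line-by-line scan, and the assumption that MAXINT is
2147483647 are illustrative.  It counts how many body lines get
looked at before the score hits the floor.

```python
MAXINT = 2147483647  # assumed value of $MAXINT (procmail's score ceiling)


def scan_content_types(body_lines, start_score):
    """Model `* -1^1 B ?? ^Content-Type:` with an early stop.

    Every line starting with "Content-Type:" costs one point.  As soon
    as the score reaches -MAXINT the scan stops, as the post describes.
    Returns (final_score, number_of_lines_scanned).
    """
    score = start_score
    for scanned, line in enumerate(body_lines, start=1):
        if line.lower().startswith("content-type:"):
            score -= 1
            if score <= -MAXINT:
                return score, scanned  # floor reached: stop reading
    return score, len(body_lines)


if __name__ == "__main__":
    # A 1000-line body whose second part header sits on line 10.
    body = ["filler"] * 1000
    body[2] = "Content-Type: image/gif"
    body[9] = "Content-Type: image/png"
    # Score on entry: +2 from the encoding line, then -MAXINT.
    print(scan_content_types(body, 2 - MAXINT))
```

In this model a two-part message is abandoned at its second part
header, while a one-part message is read to the end.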

This line

 *          2^0  B         ??  ^Content-Transfer-Encoding:(.*\<)?\
                                (base64|7bit)

is where I skim two off the top before going to negative-
infinity.  The two points keep us from bailing instantly.
Although I wanted to test for the existence of the line on the
right unequivocally, I used the same line where I set the
score plus-two, in a manner of doing double-duty where the
two discrete elements of the algorithm must intersect.

That looks like gobbledygook, so let me state it otherwise:
I want to find "^Content-Transfer-Encoding:"-blah in there.
If it's not there, the two points won't be added to the score.
And when we get to the next line, negative-infinity, the
recipe will bail!  And if the line *is* there, well, I
don't want to bail yet, and I also want to add two points
before I set negative infinity (so that we won't bail!).
So I did both operations on one line.
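
The arithmetic can be traced with a small Python model of the body
conditions in the recipe.  This is a sketch under assumptions, not a
port: the name infinity_bop is made up, MAXINT is assumed to be
2147483647, the header test on CTYPE is left out, and procmail's
"\<" is approximated with Python's "\b".

```python
import re

MAXINT = 2147483647  # assumed value of $MAXINT

FLAGS = re.M | re.I  # procmail matches case-insensitively by default
CTE = re.compile(r"^Content-Transfer-Encoding:(?:.*\b)?(?:base64|7bit)", FLAGS)
TEXT_PLAIN = re.compile(r"^Content-Type:(?:.*\b)?text/plain", FLAGS)
CTYPE_ANY = re.compile(r"^Content-Type:", FLAGS)


def infinity_bop(body):
    """Return the final score if the recipe would fire, else None."""
    score = 0
    if CTE.search(body):
        score += 2              # 2^0: two points, first match only
    score -= MAXINT             # unconditional drop toward the floor
    if score <= -MAXINT:
        return None             # no encoding line: bail right here
    if TEXT_PLAIN.search(body):
        score -= 1              # -1^0: a text/plain part costs a point
    for _ in CTYPE_ANY.finditer(body):
        score -= 1              # -1^1: every part header costs a point
        if score <= -MAXINT:
            return None         # floor reached: bail early
    score += MAXINT             # climb back up from the floor
    return score if score > 0 else None


if __name__ == "__main__":
    one_part = ("Content-Type: application/octet-stream\n"
                "Content-Transfer-Encoding: base64\n\nAAAA\n")
    print(infinity_bop(one_part))
```

Under these assumptions, a body with a single non-text part and an
encoding line ends with a score of 1 and fires.  A lone text/plain
part, a second part, or a missing encoding line all return None.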

-- 
dman

_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail

  • Re: Base64 Spam., Dallman Ross <=