procmail

Re: Phantom lines in empty body

2004-06-21 03:50:54
On Sun, Jun 20, 2004 at 08:58:34PM -0700, Jim Osborn wrote:
On Sun, Jun 20, 2004 at  5:16:46PM -0700, Gary Funck wrote:
My favorite was:

STOP=-9876543210
SPACE = " "
TAB = "	"	# a single literal tab character between the quotes
NL = "
"
WS = "$SPACE$TAB$NL"

:0 B:
* 1^0
* $ $STOP^0 [^$WS]
empty-body.mbox

If I'm not mistaken, the difference between Gary's favorite and my
original recipe is the inclusion of a newline in the second condition,
as well as the large -w value on the scoring part.  I haven't thought
through all the implications of including the newline, but the large
-w is definitely a winner.

See my previous follow-up; but, no, your original, posted yesterday,
was quite different, and the differences are not limited to the w
value and what's in $WS.

You had posted:

  :0 B
  *  1^0
  * -1^1 ^.*[^  ]+.*$

You are counting all non-whitespace body lines in every message,
checking each line from anchor-left to anchor-right, and moreover with
no limitations on message size; all of which is extremely inefficient.
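
As a rough illustration of the cost difference, here is a small Python
model (not procmail itself) of the two scoring approaches quoted above.
The function names and the default stop weight are my own assumptions
for the sketch; the arithmetic follows the w^x rule as I understand it,
where exponent 1 adds the weight for every match and exponent 0 counts
only the first match.

```python
def linear_count_score(body):
    """Model of `* 1^0` followed by `* -1^1 <line with a non-blank char>`.

    Exponent 1 means every matching line adds the same weight, so
    every line of the body has to be examined.
    Returns (score, lines_examined).
    """
    score = 1
    examined = 0
    for line in body.splitlines():
        examined += 1
        if line.strip(" \t"):        # line holds a non-blank character
            score += -1
    return score, examined


def first_hit_score(body, stop=-2147483647):
    """Model of `* 1^0` followed by `* $ $STOP^0 [^$WS]`.

    Exponent 0 means only the first match counts, so scanning can
    end at the first non-whitespace character.  The default `stop`
    is a hypothetical large negative weight.
    Returns (score, chars_examined).
    """
    score = 1
    examined = 0
    for ch in body:
        examined += 1
        if ch not in " \t\n":
            score += stop
            break                    # later matches cannot change the score
    return score, examined
```

With a 1000-line body that starts with text, the first function looks
at all 1000 lines while the second stops after one character; both end
with a non-positive score, so neither recipe would deliver.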

Other algorithms shown in the thread are all more efficient.  However,
if you for some reason wanted to use the algorithm you've shown,
you could increase its efficiency manifold by either (a) limiting
the recipe to messages with small bodies, similarly to what I
demonstrated; or (b) using the "infinity shuffle" technique to
stop your recipe when it finds its first nonwhite char.

  MAXINT = 2147483647         # this one's not oversaturated
  :0 B
  *         -1^0
  * $  $MAXINT^0
  *         -1^1 ^.*[^  ]+.*$
  * $ -$MAXINT^0
  { not_an_empty_body = y }
  

I show that merely for demonstration purposes; I don't
recommend the general tack you proposed for finding blank-only
bodies.
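
For option (a), a rough Python model (again, not procmail itself) of
"limit the recipe to small bodies" might look like the following. The
function name and the 256-byte cutoff are hypothetical, chosen only for
the sketch; the point is that the size test is cheap and runs before
any per-line scanning.

```python
def is_blank_small_body(body, max_bytes=256):
    """Return True only for a small body with no non-whitespace lines.

    max_bytes is an arbitrary, hypothetical cutoff: bodies at or above
    it are rejected up front, so the per-line scan never runs on them.
    """
    if len(body.encode("utf-8")) >= max_bytes:
        return False                 # too big: skip the expensive scan
    # Same test as the counting recipe: no line may hold a non-blank char.
    return not any(line.strip(" \t") for line in body.splitlines())
```

The trade-off is that a whitespace-only body larger than the cutoff is
not flagged, which seems acceptable if such bodies are rare.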

Thing is, on my ISP's setup, the :0 B part fails.  Even with 1^0 as
the first condition, I get over 1K plus score, when by rights, the

I suspect this is an issue to do with the very old version you are
using there -- you said it was 3.15.1, yes?  But I don't know.  That
is just a guess.

-- 
dman

_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail