procmail

Re: Phantom lines in empty body

2004-06-21 03:36:44
On Sun, Jun 20, 2004 at 05:16:46PM -0700, Gary Funck wrote:

This subject was discussed a while back (follow the follow-up links):

http://www.xray.mpe.mpg.de/mailing-lists/procmail/2004-02/msg00169.html

My favorite was:

STOP=-9876543210
SPACE = " "
TAB = "       "
NL = "
"
WS = "$SPACE$TAB$NL"

:0 B:
* 1^0
* $ $STOP^0 [^$WS]
empty-body.mbox

It's not at all bad, but two small points.  One, it's
even more efficient to put the "1^0" below the other
condition.  (The difference is extremely tiny; you'd
never even notice.  But I'm just being obsessive.)  :-)
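
For what it's worth, here is a sketch of that reordering, reusing
the same $STOP and $WS settings from above.  Only the order of
the two conditions changes; the action line and lockfile are as
in the original:

    :0 B:
    * $ $STOP^0 [^$WS]
    * 1^0
    empty-body.mbox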


But, two, we don't need a scoring recipe here at all, I
don't think.  This should do the same thing, unless the
two cups of coffee I've had already weren't enough
(which might, indeed, be the case):

  :0 B:
  * $ ! [^$WS]
  empty-body.mbox


That works with the $WS as you defined it above, inclusive
of a $NL.  Without that, what I posted last night also still
looks good to me:

    :0
    * $ B ?? [^$WS]
    { }
    :0 E:
    empty-body.mbox

(I've changed the action line to suit your example and added a
lockfile for the same reason.)

That one works with $WS set to "$SPACE$TAB", which is, e.g.,
what the var will have been set to if you're using Virus Snaggers.


The first one from last night:

  :0
  * B ??  1^1 > 1
  { BSIZE = $= }

  :0 E
  { LOG = "$NL Body was null! $NL"  HOST }

is also useful for stopping messages with no body at all, and
it gives us a $BSIZE var that we can use later.  My point in
doing it that way was that if the body is too large, we don't
need to bother with certain later tests.
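
As a rough sketch of that later use, assuming BSIZE was set by
the recipe above (the 50000-byte cutoff and the LOG line are
made up for illustration; pick your own), the usual scoring
trick for comparing two numbers should do:

    :0
    * 50000^0
    * $ -$BSIZE^0
    {
      # body-scanning recipes for smaller bodies would go here
      LOG = "Body is small enough to scan: $BSIZE bytes $NL"
    }

The score comes out as 50000 minus $BSIZE, so the block only
runs when the body is under the cutoff.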

I don't know the following absolutely for sure, but I believe
that the size operator ( > or < ) is less work for procmail to
perform on the body than running a regex condition -- even
an efficient one such as

    * $ B ?? [^$WS]

against all messages, large and small alike.  I think that once
we parse the body with a regex we are sucking the entire
message down the pipe, while it may well be -- it is my
hope and my assumption, but not my knowledge -- that > or <
would use information already gleaned from the start of the
run.  Maybe someone better at C than I am (not all that hard)
could look at the source code and say for sure one way or the
other.
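
If that guess holds, one way to take advantage of it would be to
wrap the regex test in a size test, so that the regex only ever
runs against short bodies.  This is only a sketch: the 2048-byte
cutoff is an arbitrary number of mine, and it assumes $WS is set
as before.

    :0
    * B ?? < 2048
    {
      :0
      * $ B ?? [^$WS]
      { }

      :0 E:
      empty-body.mbox
    }

The trade-off is that a body made up of nothing but whitespace
and longer than the cutoff would slip through untested.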

-- 
dman
