Re: scoring oddity

2000-10-10 11:55:30
Dallman Ross <dman(_at_)nomotek(_dot_)com> writes:
...
    :0 i  # two+ `!' or `$' symbols in headers . . .
    * ! ^(Return-Path|Sender):.*list
    * -175^0
    *  100^1 [$!]
    * -100^1 ^Message-ID:.*[$!]
    { do some other things if we're here }

So, I had a piece of mail come in that had two `$' chars in the
Message-ID: header.  There were no other `!' or `$' signs in the
headers.  The logs, though, showed a total score for this section
of -75, not -175!  I don't understand how that could be.

When a regexp is matched repeatedly (as it is when scoring), each
search starts where the previous match left off**.  That's how regexp
engines keep from finding the same match over and over again.  So that
last condition can match a given Message-Id: header field only once:
the first match runs from the start of the field through a dollar sign
or bang, and the second search starts after that point, where there is
no `^Message-ID:' left to match.  With your two `$' chars the total is
therefore -175 + 2*100 - 1*100 = -75.
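
To see the counting outside procmail, here is a rough Python model of
the original recipe's arithmetic.  It is only a sketch: the header text
is made up, Python's re module stands in for procmail's regexp engine,
and the Return-Path/Sender condition is not modelled.  The names
HEADERS, count_matches and original_score are mine, not procmail's.

```python
import re

# Hypothetical headers: two `$' chars, both inside Message-ID:.
HEADERS = (
    "Return-Path: <someone@example.com>\n"
    "Message-ID: <abc$123$xyz@example.com>\n"
    "Subject: hello\n"
)

FLAGS = re.MULTILINE | re.IGNORECASE


def count_matches(pattern, text):
    """Count non-overlapping matches; each search resumes after the last match."""
    return sum(1 for _ in re.finditer(pattern, text, FLAGS))


def original_score(text):
    """Model of: -175^0, 100^1 [$!], -100^1 ^Message-ID:.*[$!]"""
    any_symbol = count_matches(r"[$!]", text)              # every $ or !
    in_msgid = count_matches(r"^Message-ID:.*[$!]", text)  # at most one per field
    return -175 + 100 * any_symbol - 100 * in_msgid


print(original_score(HEADERS))  # -75
```

The anchored pattern is found once per Message-ID: field no matter how
many symbols the field holds, so only 100 is subtracted, not 200.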

The solution in this case is to extract the value of the Message-Id:
header and then count the $s and !s in MATCH:
        :0
        * ! ^(Return-Path|Sender):.*list
        * -175^0
        * 100^1 [$!]
        {
            # $= now contains the score for the entire header.  Correct
            # for the Message-Id: header field.  Start with the current
            # score and go from there.
            :0
            * $$=^0
            * ^Message-Id:\/.*[$!]
            * -100^1 MATCH ?? [$!]
            { }
            
            # $= now contains the corrected score.  If it's greater than
            # zero still, then do whatever
            :0
            * $$=^0
            { whatever }
        }
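
The same idea as a rough Python model, under the same assumptions as
any such sketch: made-up headers, Python's re module in place of
procmail's engine, and a capture group playing the part of `\/' and
MATCH.  It only does the arithmetic; it does not model whether the
outer recipe succeeds.  HEADERS and corrected_score are hypothetical
names.

```python
import re

# Hypothetical headers: two `$' chars, both inside Message-ID:.
HEADERS = (
    "Return-Path: <someone@example.com>\n"
    "Message-ID: <abc$123$xyz@example.com>\n"
    "Subject: hello\n"
)

FLAGS = re.MULTILINE | re.IGNORECASE


def corrected_score(text):
    # Outer recipe: -175 plus 100 for every $ or ! anywhere in the header.
    score = -175 + 100 * len(re.findall(r"[$!]", text))

    # Stand-in for `^Message-Id:\/.*[$!]': capture what follows the field name.
    found = re.search(r"^Message-Id:(.*[$!])", text, FLAGS)
    if found:
        match_var = found.group(1)  # plays the role of procmail's MATCH
        # Stand-in for `-100^1 MATCH ?? [$!]': subtract 100 per symbol in MATCH.
        score -= 100 * len(re.findall(r"[$!]", match_var))
    return score


print(corrected_score(HEADERS))  # -175
```

Because the symbols are counted inside the extracted text rather than
with a pattern anchored on the field name, both `$' chars are
subtracted.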

Yes, that last nested recipe could be written as
            SCORE = $=
            :0
            * SCORE ?? ^[1-9]

but that seems to hide the intent.  Just treat it as a score and move on.
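
For what it's worth, the reason `^[1-9]' works as a "greater than zero"
test is that a negative score starts with a minus sign and a zero score
starts with `0'.  A tiny Python check of that reasoning, assuming the
score is rendered as a plain decimal integer (is_positive_score is a
made-up name):

```python
import re


def is_positive_score(score_text):
    """True when a decimal integer string starts with a digit from 1 to 9."""
    return re.search(r"^[1-9]", score_text) is not None


for value in (-175, -75, 0, 25):
    print(value, is_positive_score(str(value)))
```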


Philip Guenther


**It's actually more complicated than that if the match started or
stopped (I can't remember which) with a newline.  Procmail then has to
back up one character so that the newline can be matched by both a
leading ^ and a trailing $.  It's this odd effect that makes the obvious
line count recipe give an answer one too high.

_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
