At 13:01 2004-05-28 -0700, Jim Osborn wrote:
> I've tried various regex to limit a match to just the words
> I'm looking for. Using the obvious:
>
> SPAMSCORE = "MSGID,NOBODY"
> :0
> * 1^1 SPAMSCORE ?? ()\<(MSGID|NOBODY)\>
> * 1^1 SPAMSCORE ?? ()\<(MSGID|SUBJ|BODY)\>
> {}
>
> I expect a score of 2 on the first condition.
You seem to be expecting that the comma in the middle will be considered
TWICE as a wordbreak - once at the end of one word and again before the
beginning of the other. As a demonstration, add a second comma in the
middle of the string - then you'll see that both keywords are in fact
evaluated, but the wordbreak must be SEPARATE in each evaluation.
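To see the effect outside procmail, here is a rough Python emulation. It is an assumption on my part, not procmail's actual matcher: LEFT and RIGHT stand in for \< and \> by consuming one non-word character (or matching at a string edge), and count_matches is a hypothetical helper that counts non-overlapping matches the way a 1^1 score would accumulate.

```python
import re

# Emulation only: in procmail, \< and \> each consume one non-word
# character (or match at a line edge). Python's \b is zero-width, so
# consuming groups are used here instead.
LEFT = r"(?:^|\W)"
RIGHT = r"(?:\W|$)"
PATTERN = re.compile(LEFT + r"(?:MSGID|NOBODY)" + RIGHT)


def count_matches(text):
    """Count non-overlapping matches, roughly how a 1^1 score adds up."""
    return len(PATTERN.findall(text))


print(count_matches("MSGID,NOBODY"))   # 1: the comma is used up by MSGID
print(count_matches("MSGID,,NOBODY"))  # 2: each word gets its own comma
```

With a single comma, the first match swallows it as its trailing boundary, so nothing is left to serve as the leading boundary for NOBODY.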
> and if it does, why don't each of those match their respective
> portions of "xxx,yyy"? Does the procmail scan "eat" the ',' once
> it's seen "xxx," so it's not available to match ",yyy"?
Pretty much.
> In any case, can someone straighten me out on the correct way to specify
> a set of bounded words?
Change how you bound them?
It _LOOKS_ as if you may be composing a string which you then want to score
to see how many keywords might be in it. If so, construct the string with
bounds around EACH token:
SPAMSCORE = "[MSGID][NOBODY]"
Then, when you go looking, each set of bounds is specific to the token.
This doesn't even require a change to your current regexps.
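As a sanity check on the bracketed form, here is a small Python sketch. It only approximates procmail: the consuming LEFT and RIGHT groups are my stand-ins for \< and \>, and wrap_tokens and score are hypothetical helper names, not anything procmail provides.

```python
import re

# Stand-ins for procmail's \< and \>: each consumes one non-word
# character or matches at a string edge (an approximation, not procmail).
LEFT = r"(?:^|\W)"
RIGHT = r"(?:\W|$)"


def wrap_tokens(tokens):
    """Give every token its own pair of bounds, e.g. [MSGID][NOBODY]."""
    return "".join("[" + token + "]" for token in tokens)


def score(keywords, text):
    """Count non-overlapping bounded keyword hits, like a 1^1 condition."""
    pattern = re.compile(LEFT + "(?:" + "|".join(keywords) + ")" + RIGHT)
    return len(pattern.findall(text))


spamscore = wrap_tokens(["MSGID", "NOBODY"])
print(spamscore)                                    # [MSGID][NOBODY]
print(score(["MSGID", "NOBODY"], spamscore))        # 2
print(score(["MSGID", "SUBJ", "BODY"], spamscore))  # 1: BODY inside NOBODY is not bounded
```

Because each token carries its own opening and closing bracket, no match has to share a boundary character with its neighbour.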
You owe me, uhm, lessee... One sixpack of MacTarnahans Blackwatch. That'd
be an unopened sixpack. <g>
---
Sean B. Straw / Professional Software Engineering
Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
Please DO NOT carbon me on list replies. I'll get my copy from the list.
_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail