procmail

Re: scoring Q: repeating chars?

2002-01-28 13:15:18
Sean asked,

| I want to generate a weighted score where certain characters repeat in the
| subject:

| :0:
| * -150^0
| * 30^2 ^Subject:.*\!
| folder
|
| There are other examples, but this is a straightforward one.  Problem is,
| it scores as if there was only one '!'.  Same result if I put it into
| brackets.

Procmail will consider only non-overlapping matches to the regexp.  Every
match to ^Subject:.*\! has to begin at the start of a Subject: line, so you
won't find four *non-overlapping* occurrences in, say,

Subject: BUY THIS!  NOW!!!

The only way you'll get more than one match to that regexp is to have
multiple Subject: headers on the message, at least two of which contain
exclamation points.

There is one exception to the requirement for non-overlap: if the text
matched to the regexp ends with a newline (real or putative), procmail backs
up one character to start searching again before the newline.  That way, in
a condition like this,

* 1^1 ^something$

the same newline that served as $ at the end of one line of the text (say,
line 4) can do double duty as ^ at the start of the next (line 5), and you
won't miss a match if two successive whole lines (or consecutive groups of
whole lines) are matches.
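
A small sketch of where that matters, assuming you wanted to count body
lines that consist of exactly the word "something".  The B flag, the -2
offset, and the mailbox name "matched" are made up for illustration:

```procmail
# Hypothetical recipe: deliver to "matched" only when three or more
# body lines read exactly "something".
:0 B:
# Start the score at -2; the empty condition matches once.
* -2^0
# Add 1 for every whole line that matches.
* 1^1 ^something$
matched
```

If the body has three such lines in a row, each one should count, for a
score of -2 + 3 = 1, because the newline that closes one line is reused as
the ^ of the next.  Without the back-up, the middle line would presumably be
skipped and the score would stall at 0.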

| Thus far, the only reliable method I've found to accomplish this is to
| extract the subject into a variable, then match:
|
| * 30^2 SUBJECT ?? [!]
|
| Is this the only way to accomplish what I'm after?

That's the solution; you already knew it.
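
For the archive, a minimal sketch of that approach.  Sean didn't show how he
fills SUBJECT, so the first recipe below is an assumption (it uses \/ and
$MATCH); "folder" is the placeholder from his original recipe:

```procmail
# Assumed step: copy everything after "Subject:" into SUBJECT.
:0
* ^Subject:\/.*
{ SUBJECT=$MATCH }

# Score each '!' in the variable.  With 30^2 the first match adds 30,
# the second 60, the third 120, the fourth 240, and so on.
:0:
* -150^0
* 30^2 SUBJECT ?? [!]
folder
```

Each '!' is a separate one-character match here, so none of them overlap.
For "BUY THIS!  NOW!!!" the four matches add 30 + 60 + 120 + 240 = 450, and
450 - 150 = 300 is positive, so the recipe delivers.  With these weights it
takes three exclamation points (210 - 150 = 60) to get above zero.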

| It seems like
| the expression at the top should accomplish the same thing, but does not.

| I can't help but think I'm missing something.

You're missing the memory from when you first learned that matches have to
be non-overlapping.



_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
