Rik Kabel wrote,
| I have had some results which I do not understand while working with a
| scoring recipe. The problem is distilled in the following two examples.
| I am using v3.11pre4.
| In the first example, the score of 121 is computed as 2 + 4 for dong and
| the first bing(o), + 5 for dong, + 10 + 100 for bingo bingo. Since scoring
| uses a shortest match, there is no score of 8 for the second bingo.
| ====Example 1====
| This message scored 121 on
| :0 H B
| * ::score::
| * 2^2 ^.*\<(b|d)(i|o)ng
| * 5^5 (b|d)ong
| * 10^10 bingo\>
|
| +To: rik
| +Subject: ::score::dong
| +
| + bingo bingo
| +
Well, let's restate that last part: because scoring looks only for
non-overlapping occurrences, and the second condition is left-anchored,
only one [bd][io]ng can match it per line. So where [bd][io]ng appears twice
on a line (" bingo bingo"), procmail then scores only one of them, and it
chooses the shorter (through the first "bing", not through the second).
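
For anyone who wants to check the arithmetic, here is a minimal sketch in
Python (not procmail itself). It assumes only the documented w^x rule: the
first match adds w, the second w*x, the third w*x*x, and so on. The function
name procmail_score is made up for illustration, and the match counts are
the ones worked out in this thread, not something the sketch discovers.

```python
def procmail_score(w, x, matches):
    """Score contributed by one weighted condition that matched `matches` times."""
    # First match adds w, second adds w*x, third adds w*x*x, and so on.
    return sum(w * x**k for k in range(matches))


# Example 1: match counts as explained in the thread.
example1 = (
    procmail_score(2, 2, 2)      # two lines match the left-anchored condition: 2 + 4
    + procmail_score(5, 5, 1)    # one "dong": 5
    + procmail_score(10, 10, 2)  # two matches of the bingo condition: 10 + 100
)

# Example 2: same as above, except the last condition never matches.
example2 = (
    procmail_score(2, 2, 2)
    + procmail_score(5, 5, 1)
    + procmail_score(10, 10, 0)
)

print(example1, example2)  # 121 11
```
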
Anyhow, on with Rik's question:
| In example 2, the score of 11 is computed as 2 + 4 + 5 as in example 1,
| but there is no score for either bingo.
|
| ====Example 2====
| This message scored 11 on
| :0 H B
| * ::score::
| * 2^2 ^.*\<(b|d)(i|o)ng
| * 5^5 (b|d)ong
| * 10^10 \<bingo\>
|
| +To: rik
| +Subject: ::score::dong
| +
| + bingo bingo
| +
Ah, the old leading backslash problem. Procmail takes a backslash at the
start of a regexp to mean "end of leading whitespace" and discards it. Then
it looks for "<bingo\>" with a literal less-than sign at the beginning, which
does not appear at all. Hence the fourth condition scores zero.
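
A rough model of that stripping, sketched in Python rather than taken from
procmail's source: it assumes, as described above, that leading whitespace
is dropped and that a single backslash at the very start is then discarded.
The helper name effective_regexp is hypothetical.

```python
def effective_regexp(condition):
    """Approximate the regexp procmail ends up matching for a condition body."""
    body = condition.lstrip()
    if body.startswith("\\"):
        # Assumption from the text: this backslash only marks where the
        # leading whitespace ended, so it is thrown away.
        body = body[1:]
    return body


print(effective_regexp(r"\<bingo\>"))  # <bingo\>   (Example 2: literal "<")
print(effective_regexp(r"bingo\>"))    # bingo\>    (Example 1: unchanged)
```
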
Try one of these instead:
* 10^10 ()\<bingo\>
or
* 10^10 (\<bingo\>)
or
* 10^10 (\<)bingo\>
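Then the fourth condition should score 10 (not 110, because the only way to
find \<bingo\> twice is to allow overlapping: the single space between the
two words would have to serve as both the \> of the first match and the \<
of the second). That brings the recipe's total to 2 + 4 + 5 + 10 = 21.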
In fact,
* 10^10 \\<bingo\>
is equivalent but highly counterintuitive (we expect "\\" to represent a
literal backslash).
This also works but is slightly slower:
* 10^10 .*\<bingo\>
Stephen's recommendation (and general consensus of the list membership) is
to use the first method [leading with "()"].
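
The difference between the two examples can also be imitated in Python, as
a sketch only. Python's re module is not procmail's matcher; the assumption
here is that \W (one non-word character) is a fair stand-in for procmail's
\< and \>, each of which matches an actual character rather than an empty
position. The helper name count_matches is made up for illustration.

```python
import re

# The body line from both examples, with its trailing newline.
BODY_LINE = " bingo bingo\n"


def count_matches(pattern, text=BODY_LINE):
    """Count non-overlapping matches, scanning left to right."""
    return len(re.findall(pattern, text))


# \W stands in for procmail's \< and \> (an assumption, see above).
print(count_matches(r"bingo\W"))    # 2: Example 1's last condition
print(count_matches(r"\Wbingo\W"))  # 1: Example 2 once repaired
print(count_matches(r"<bingo\W"))   # 0: Example 2 as written, literal "<"
```

In the middle case the first match uses up the space between the two words,
so nothing is left to stand before the second "bingo".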