On Sun, Mar 28, 2010 at 3:05 PM, Harry Putnam <reader(_at_)newsguy(_dot_)com>
wrote:
> What I was trying to get at is that the scoring technique doesn't
> really find 15 alphanumerics in a row.

No, it doesn't. You could probably do something like count the number
of spaces and the number of non-spaces ... for example, a typical
phrase in English should not have much more than about six times as
many non-spaces as spaces. E.g., this paragraph has 53 spaces and 240
non-spaces in it.

:0
* ^Subject:[ ]\/.*
* MATCH ?? > 25
* MATCH ?? 1^1 [^ ]
* MATCH ?? -7^1 [ ]
{
  LOG="More than 25 characters and more than 7 times as many non-spaces as spaces
"
}

> What do you think? Is scoring a better way?

I long ago gave up trying to maintain my own spam rules directly in
procmail, but the people who publish their procmail spam filters use
scoring a lot. You can get fancier than I did above, e.g. use
fractional values to the right of the caret so that more spaces count
differently toward the final score.
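
As a rough, untested sketch of that idea, the variant below changes the
exponent on the space condition from 1 to 1.05. Per procmailsc(5), a
condition weighted w^x that matches n times contributes
w + w*x + ... + w*x^(n-1), so here the nth space would subtract
5 * 1.05^(n-1) instead of a flat amount. The weight -5, the exponent 1.05
and the log text are made-up values for illustration, not something from
this thread, and would need tuning against real mail.

```procmail
# Sketch only: -5 and 1.05 are illustrative assumptions, not tuned values.
# Each non-space adds 1; the nth space subtracts 5 * 1.05^(n-1),
# so every additional space costs slightly more than the one before.
:0
* ^Subject:[ ]\/.*
* MATCH ?? > 25
* MATCH ?? 1^1 [^ ]
* MATCH ?? -5^1.05 [ ]
{
  LOG="Long Subject with few spaces relative to its length
"
}
```

An exponent between 0 and 1 would do the opposite: each further space
would count for less, and the total penalty would level off instead of
growing.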