procmail

Re: about procmail regex operators

2010-03-28 18:07:02
On Sun, Mar 28, 2010 at 3:05 PM, Harry Putnam <reader(_at_)newsguy(_dot_)com> 
wrote:
> What I was trying to get at is that the scoring technique doesn't
> really find 15 alphanumerics in a row.

No, it doesn't.  You could probably do something like count the number
of spaces and the number of non-spaces ... for example, a typical
phrase in English should not have much more than about six times as
many non-spaces as spaces.  E.g., this paragraph has 53 spaces and 240
non-spaces in it.

:0
* ^Subject:[      ]\/.*
* MATCH ?? > 25
* MATCH ?? 1^1 [^ ]
* MATCH ?? -7^1 [ ]
{
  LOG="More than 25 characters and more than 7 times as many non-spaces as spaces
"
}
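
If you want to sanity-check the arithmetic of a recipe like this outside
procmail, here is a minimal Python sketch of what I take the three
MATCH conditions to compute: a length test, then +1 per non-space and
-7 per space, with the recipe firing when the total is positive.  The
names subject_score and recipe_fires are made up for this example, and
it models the arithmetic only, not procmail's regex engine.

```python
def subject_score(subject, space_weight=-7.0, other_weight=1.0, min_len=25):
    """Return None if the length test fails, otherwise the summed score."""
    # Stand-in for '* MATCH ?? > 25': the text must be longer than min_len.
    if len(subject) <= min_len:
        return None
    spaces = subject.count(" ")
    others = len(subject) - spaces
    # With an exponent of 1, every match contributes its full weight.
    return others * other_weight + spaces * space_weight


def recipe_fires(subject):
    """A scored recipe succeeds only when the final score is positive."""
    score = subject_score(subject)
    return score is not None and score > 0
```

For instance, "buy cheap pills now and save big money" has 31 non-spaces
and 7 spaces, so it scores 31 - 49 = -18 and does not fire, while a run
of 30 characters with no spaces scores 30 and does.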

> What do you think?   Is scoring a better way?

I long ago gave up trying to maintain my own spam rules directly in
procmail, but the people who publish their procmail spam filters use
scoring a lot.  You can get fancier than I did above, e.g. use a
fractional value to the right of the caret so that each additional
space counts for less toward the final score (a value above 1 would
make each one count for more).
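
To make the effect of the number after the caret concrete: as I
understand procmail's w^x scoring, the first match adds w, the second
adds w*x, the third w*x*x, and so on.  A small Python sketch of that
series, with match_total being a name invented for this example:

```python
def match_total(weight, exponent, matches):
    """Sum weight * exponent**k for k = 0 .. matches-1 (the w^x series)."""
    total = 0.0
    term = float(weight)
    for _ in range(matches):
        total += term
        term *= exponent  # each later match is scaled by the exponent again
    return total
```

With an exponent of 1, three spaces at -7 add up to -21.  With 0.5 they
add up to -7 - 3.5 - 1.75 = -12.25, and the total can never drop below
w/(1-x) = -14 no matter how many spaces there are, so choose the weight
with that limit in mind.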

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)de
http://mailman.rwth-aachen.de/mailman/listinfo/procmail