
RE: new features for procmail

2005-11-14 11:31:47
On Mon, November 14, 2005 12:49 pm, Gary Funck said:


- In addition to PCRE matching, implement "approximate matching",
  à la String::Approx (on CPAN)

http://search.cpan.org/~jhi/String-Approx/Approx.pm
I think it is hard to make that practical, since it basically works on
short strings like single words. It would not work very well against the
variant spellings of 'viagra', because the 'Levenshtein edit distance'
can be made arbitrarily big.


I tend to disagree.  Not on the technical point per se, except that for
many viagra spellings the edit distance is pretty small.  Other
approximate matching algorithms exist, and those can be evaluated as
well.  But you're right about the intent ... to make it easy to catch
mis-spellings, mainly for the purposes of spotting spam phrases.
But not always (think of the mis-spellings of "subscribe" and other
words that come up in practice).
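To make the edit-distance point concrete, here is a minimal Python sketch of the classic Levenshtein distance. It is not String::Approx's implementation and nothing procmail provides; the variant spellings are made-up examples. It shows both sides of the argument: simple misspellings sit one edit away from the target, while padding the word with punctuation pushes the distance up as far as the sender likes.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    target = "viagra"
    # Illustrative variants only; real spam uses many more tricks.
    for variant in ["v1agra", "viagrra", "v.i.a.g.r.a"]:
        print(variant, levenshtein(target, variant))
    # v1agra 1
    # viagrra 1
    # v.i.a.g.r.a 5
```

A matcher that accepts, say, distance 1 or 2 catches the first two variants but not the third, and raising the threshold far enough to catch it would also match many unrelated short words.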

Ach! Stay away from spam stuff. It is too much of a moving target and is
much better handled by specialized tools (even if those tools are written
in procmail). As several people have pointed out, very effective spam
filters can be created using some smart scoring of the headers without
ever looking at the body.
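As a rough illustration of header-only scoring, the following Python sketch adds up weights for suspicious header patterns and for missing headers, and never reads the body. The rules, weights, threshold, and sample message are all hypothetical, chosen only to show the idea; they are not taken from any real filter or from procmail itself.

```python
import re
from email import message_from_string

# Hypothetical rules: (header name, regex, weight). Illustrative values only.
PATTERN_RULES = [
    ("Subject", r"!{2,}", 2.0),    # repeated exclamation marks
    ("From", r"\d{4,}@", 1.5),     # long digit run before the @
    ("X-Mailer", r"bulk", 3.0),    # self-declared bulk mailer
]
# Hypothetical penalties for headers that legitimate mail normally carries.
MISSING_RULES = [("Message-ID", 2.0), ("Date", 1.0)]
THRESHOLD = 5.0  # assumed cut-off, not a recommended value


def header_score(raw_message: str) -> float:
    """Sum the weights of all rules that fire, looking only at headers."""
    msg = message_from_string(raw_message)
    score = 0.0
    for name, pattern, weight in PATTERN_RULES:
        if re.search(pattern, msg.get(name, ""), re.IGNORECASE):
            score += weight
    for name, weight in MISSING_RULES:
        if msg.get(name) is None:
            score += weight
    return score


if __name__ == "__main__":
    sample = (
        "From: deals8842@example.com\n"
        "Subject: Act now!!!\n"
        "X-Mailer: BulkSender 2.0\n"
        "\n"
        "The body is never inspected.\n"
    )
    score = header_score(sample)
    print(score, "spam" if score >= THRESHOLD else "ham")  # 9.5 spam
```

The design point is that the decision depends on a sum of many weak signals rather than on any single pattern, so one evaded rule does not defeat the filter.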

If procmail is going to continue to be the mail processor of choice, then
it needs to do that job better, faster, and with less resource use than
the other choices. I like using procmail, but I don't like writing
complicated recipes in it. So improve the syntax and add the features it
needs to do that job well, but stay away from adding functionality
targeted at making spam filtering (or other high-level decision-making)
easier. There's nothing wrong with a message flow of
sendmail -> procmail (deliver the stuff I know is good) ->
spam_test/mime_test -> procmail (deliver the spam/ham),
as long as the resource footprint stays small.

Rich

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
