On Thu, 30 Oct 2003, Professional Software Engineering wrote:
> Although I didn't spot it anywhere, I believe what Dallman is saying is
> that rather than expecting to match on the drug keyword, the fact that you
> match a lot of HTML COMMENTS in an EMAIL should be sufficient to tag it as
> spam.
I seem to recall a discussion about this a while back that suggested
scoring based on the number of comments per line. Something like

:0 B
* -1^1 ^.*$
* 1^1 (<!)

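Note that this is only the condition half of a recipe; procmail still needs an action line before it will act on the score. A minimal sketch of a complete version, assuming you want matches filed in a folder called "spam" (that name is purely my placeholder) and spelling the comment opener out as "<!--" so that things like <!DOCTYPE are not counted:

```
# Sketch only. Each body line scores -1, each "<!--" scores +1.
# The recipe fires when the total is positive, i.e. when comment
# openers outnumber body lines.
# "spam" is a hypothetical folder name; use whatever you deliver to.
:0 B:
* -1^1 ^.*$
* 1^1 (<!--)
spam
```

So if procmail counts 10 body lines and 12 comment openers, the total is -10 + 12 = 2 and the message is filed; with only 8 openers the total is -2 and the message falls through to the rest of your rcfile.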
You could do something similar with HTML tags, but you would probably want
to allow more than one per line (here, up to three), e.g.

:0 B
* -3^1 ^.*$
* 1^1 (<[^>]+>)

You could further attempt to count matched open/close tags as only one
rather than two, but I'm not going to try to work that out right now.
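One cheap approximation, if real pairing isn't needed, is to ignore closing tags altogether so that <b>...</b> scores once. This is an untested sketch rather than a worked-out solution; the "spam" folder name is a placeholder of mine, and nothing here checks that tags actually match up:

```
# Sketch only. Count opening tags and skip anything starting "</" or "<!".
# Each body line scores -3, each opening tag +1, so the recipe fires
# only when the body averages more than three opening tags per line.
# "spam" is a hypothetical folder name.
:0 B:
* -3^1 ^.*$
* 1^1 (<[^/!>][^>]*>)
spam
```

Unpaired tags such as <br> still count once each, which is probably acceptable for this purpose.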
_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail