On Sat, 24 May 2003, David W. Tamkin wrote:
DWT> isn't a number.
DWT>
DWT> * 140^1 ()<!--.*-->
DWT>
DWT> Second, I'm not sure that embedded HTML comments can't span newlines.
DWT> You might need some incantation like this:
DWT>
DWT> * 140^1 ()<!--([^>]|$)*-->
DWT>
Yes, you are probably correct - HTML just ignores newlines I think.
I've been mulling this over in the shower / mowing the lawn / etc.
Count word boundaries and comment boundaries and compare:
* -1^1 \<
* 20^1 ()<!--
but then a word boundary could appear inside the crud in a comment. Maybe
that wouldn't matter, given the characteristics of the spam emails, if I
am saying that a message is spam when the number of comments is greater
than 5% of the number of words. At twenty real words for each comment to
allow the mail through, a normal HTML email should be OK, but the spam,
with a comment embedded in each real word, would require 19 word
boundaries within each comment to get through.
Maybe it should be * -1^1 ()\< to stop it thinking I am quoting the <.
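To sanity-check the arithmetic of the scoring idea above, here is a minimal
Python sketch. It is not procmail and only approximates its matching: the
names comment_score and looks_like_spam are made up for illustration, the
regex \b\w stands in for procmail's \< word-start match, and the body is
assumed to be already available as a plain string.

```python
import re

# Weights taken from the two scoring conditions: -1 per word start,
# +20 per comment opening. The names are hypothetical.
WORD_WEIGHT = -1
COMMENT_WEIGHT = 20


def comment_score(body: str) -> int:
    """Approximate the additive score: each match adds its weight once."""
    # \b\w matches the first character of every word. This is an
    # approximation of procmail's \<, not an exact equivalent.
    words = len(re.findall(r"\b\w", body))
    comments = body.count("<!--")
    return WORD_WEIGHT * words + COMMENT_WEIGHT * comments


def looks_like_spam(body: str) -> bool:
    # A scoring recipe only fires when the final score is positive,
    # i.e. when there is more than one comment per 20 words.
    return comment_score(body) > 0


if __name__ == "__main__":
    ham = "word " * 40 + "<!-- note -->"
    spam = "Vi<!--x-->ag<!--y-->ra"
    print(comment_score(ham), looks_like_spam(ham))
    print(comment_score(spam), looks_like_spam(spam))
```

Words inside a comment count against the score too, so a spammer would have
to pad each comment with enough words to cancel its weight of 20.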
Alan
( Please do not email me AS WELL as replying to the list. Please
address personal email to alan+1@ as lists@ is not read. A
password autoresponder may be invoked if this email is very old. )
_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail