On Sunday, January 19, 2003, at 11:12 AM, Louis LeBlanc wrote:
Hey folks. I have a quick question:
How long have those stupid spammers been inserting html comments into
HTML spam to sneak by filters meant to keep them out?
And more importantly, is there a relatively easy way to filter every
HTML message thru a dump to eliminate any HTML tags? Seems if I could
do that, those messages with stuff like this wouldn't get thru
anymore:
You can strip the comments from the HTML before you start other checks:
| sed -e 's/<!--[^-]*-->//g'
I think that would do it, unless the comments break lines or contain a
hyphen (the [^-]* can't match across one).
You could also push every comment delimiter to a line boundary first
(GNU sed, where \n in the replacement is a newline):
| sed -e 's/<!--/\n<!--/g' \
      -e 's/-->/-->\n/g'
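and then drop the comment lines. One-line comments have to go first,
because a sed range never ends on the same line that starts it:
| sed -e '/<!--.*-->/d' -e '/<!--/,/-->/d'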
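
If line boundaries are the worry, here is a rough sketch of a filter that
ignores them entirely: it slurps the whole text and cuts everything between
"<!--" and the next "-->" with plain string searches. The function name
strip_html_comments is made up for illustration, and dropping the rest of
the input after an unterminated "<!--" is my assumption, not something from
the thread.

```sh
#!/bin/sh
# Hypothetical helper: remove HTML comments from stdin, even when they
# span several lines. Uses only POSIX awk string functions.
strip_html_comments() {
  awk '
    # Collect the whole input into one buffer, keeping the newlines.
    { buf = buf $0 "\n" }
    END {
      out = ""
      # Repeatedly cut from the next "<!--" to the "-->" that follows it.
      while ((s = index(buf, "<!--")) > 0) {
        out = out substr(buf, 1, s - 1)
        rest = substr(buf, s + 4)
        e = index(rest, "-->")
        # Assumption: an unterminated comment swallows the rest of the text.
        if (e == 0) { buf = ""; break }
        buf = substr(rest, e + 3)
      }
      printf "%s", out buf
    }'
}

printf 'mor<!--x-->tg<!--\nmulti\nline\n-->age\n' | strip_html_comments
# prints: mortgage
```

In a recipe this would sit in the same place as the sed pipelines above,
ahead of whatever body checks follow.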
I would simply count comments ("<!--") in an HTML message and discard it if
there are more than... oh, I dunno, some threshold. Like 2. Maybe 3.
This amateurish html obfuscation doesn't concern me nearly as much as
the Base64 stuff.
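
For what it's worth, the counting idea could be sketched like this. The
function names and the default threshold of 3 are placeholders of mine, and
it leans on grep -o, which GNU and BSD grep both have but POSIX does not
promise.

```sh
#!/bin/sh
# Hypothetical helper: print how many "<!--" openers appear on stdin.
count_html_comments() {
  # grep -o prints each match on its own line; wc -l counts those lines.
  # tr strips the padding some wc implementations add.
  grep -o '<!--' | wc -l | tr -d ' '
}

# Hypothetical helper: succeed (exit 0) when stdin holds more comment
# openers than the threshold given as $1 (assumed default: 3).
has_too_many_comments() {
  threshold=${1:-3}
  [ "$(count_html_comments)" -gt "$threshold" ]
}

if printf 'a<!--1-->b<!--2-->c<!--3-->d\n' | has_too_many_comments 2; then
  echo "looks obfuscated"
fi
```

Since the result is just an exit code, a script wrapping this could be fed
the body from a procmail "* ? command" condition, with the threshold tuned
to taste.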
--
Love is like oxygen/You get too much/you get too high/Not enough and
you're gonna die
_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail