Re: Something like ereg_replace?

On Fri, Jul 04, 2003 at 05:08:13PM -0700, Björn Lilja wrote:


> What I would need is the ereg's to work on the e-mail contents as
> usual but only after removing anything within < and > tags,
> basically doing something like ereg_replace("<.*>", "") and then do
> the normal eregs! This is of course to try to make it harder to do
> simple filter avoidance by typing "F<dsd>R<fdf>EE" instead of
> "FREE".


I think you'll find this very difficult to do reliably with regular
expressions of any kind. Consider that you will also have to deal with
legitimate, complex tags (e.g. <a href="http://someplace"
target=_top>Hiya doin</a>), nested tags, tags that span multiple
lines, etc.
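
To make the difficulty concrete, here is a small sketch in Python (my
choice for illustration; procmail itself has no replace function, and
the function names are made up). It shows the greedy pattern from the
question eating real text, and a tighter pattern still tripping over a
'>' inside a quoted attribute.

```python
import re

def strip_greedy(text):
    # The pattern from the question. '.*' is greedy, so a single match
    # runs from the first '<' to the last '>' on the line.
    return re.sub(r"<.*>", "", text)

def strip_negated(text):
    # A common refinement: stop at the first '>' after each '<'.
    # It has no notion of quoting, so a '>' inside an attribute
    # value ends the "tag" too early.
    return re.sub(r"<[^>]*>", "", text)

obfuscated = "F<dsd>R<fdf>EE"
quoted_gt = '<a title="a>b">FREE</a>'

print(strip_greedy(obfuscated))    # FEE  (the "R" between the tags is lost)
print(strip_negated(obfuscated))   # FREE
print(strip_negated(quoted_gt))    # b">FREE  (attribute debris left behind)
```

So the simple pattern either removes too much or leaves junk, depending
on how the tags are written.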

If you want to go down that road, you probably want to use one of the
HTML-to-text converters like lynx or w3m. That would mean piping the
message to a script, converting it, then parsing the converted output.
Certainly possible.
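
As a rough sketch of the "convert, then match" idea: the example below
does not use lynx or w3m but swaps in Python's standard html.parser as
the converter, simply to keep it self-contained. The class and function
names are hypothetical, and it assumes the HTML body has already been
extracted and decoded from the message.

```python
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Collects the text between tags and discards the tags themselves."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text outside a tag. Note this also
        # includes the contents of <script> and <style> elements.
        self.chunks.append(data)

def html_to_text(markup):
    parser = TextOnly()
    parser.feed(markup)
    parser.close()  # flush any text still buffered at end of input
    return "".join(parser.chunks)

# The obfuscated word from the question.
print(html_to_text("F<dsd>R<fdf>EE"))

# A legitimate tag that spans two lines.
print(html_to_text('<a href="http://someplace"\ntarget=_top>Hiya doin</a>'))
```

A filter version would read the body from standard input instead of the
sample strings, and the usual pattern matching would then run on what it
prints.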

Because of exactly the situation you describe, I went to a whitelist
approach, and then dump _any_ HTML mail after that as spam. This has
been *very* effective, at least for me.

-- 
Hal Burgiss
 

_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
