procmail

Re: matching on bad word list but which word was it ?

2006-08-20 04:38:22
On Sun, Aug 20, 2006 at 12:50:06PM +0200, Matthias Häker wrote:

Dallman Ross wrote:
You need to use the match token, "\/" first to grab the word.
(See man procmailrc page and look for "'\/' token".)


   * ()\/(bad1|bad2|bad3|bad4|bad5)

The "()" is an empty group, used because a "naked" backslash
in that first position in the regex needs quoting, but backslashes
in other positions don't.  We could also have used another backslash
as a quote char, but the parens are easier to understand and
are the "received wisdom" among procmailers for this case.
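
For context, a complete recipe built around that condition might look
something like the sketch below.  The variable name BADWORD, the log text
and the folder name "suspect" are placeholders of mine, not anything from
the original posts; procmail itself sets MATCH to whatever was matched to
the right of "\/".

```procmail
# Sketch only: BADWORD and the "suspect" folder are hypothetical names.
:0
* ()\/(bad1|bad2|bad3|bad4|bad5)
{
    # MATCH holds the text matched to the right of \/
    BADWORD=$MATCH

    # Record which word triggered the recipe (LOG needs its own newline)
    LOG="bad word found: $BADWORD
"

    # Deliver to a folder, with a local lockfile
    :0:
    suspect
}
```

With no flags and no "B ??" prefix, the condition is tested against the
header only; prefix it with "B ??" to search the body instead.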

OK!

So I should use "()" before a leading \ in the regex,

like

*  B ?? ()\<iframe src=3D 

and, to read what procmail matched into a variable:

*  B  ??  ()\/ \<iframe src=3D 
{ BAD=$MATCH }    

Well, in the last condition, why did you suddenly introduce
whitespace?  What if there's no whitespace before "\<iframe"?
I think you want:

   *  B  ??  ()\/\<iframe src=3D 

But on testing, no: that isn't what those messages contain.  Here's one:

 1:25pm [~/Mail/virus] 541[1]> grep -h src=3D *.*
 src=3Dcid:031401Mfdab4$3f3dL780$73387018@57W81fa70Re height=3D0 width=3D0></iframe>

(Probably you aren't reading this message now, if your simple filter found
my posting.  But since your regex is wrong, it won't have, anyway.  Others
on this list might not see this, though.)

So that would need to be:

   * B ?? ()\/src=3D.*</iframe.*

That works.

However, whitespace, including line breaks, is permitted in HTML.  So the
viral message could have that string broken across multiple lines.  You'd
need:

   * B ?? ()\/src=3D.*($.*)*.*</iframe.*

That also works in my simple test.
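
Put into a full recipe, using the BAD variable from earlier in the thread,
that might look roughly like this.  It's only a sketch: the "quarantine"
folder name is a placeholder I made up, and you'd want to try it against
your own sample messages first.

```procmail
# Sketch only: the "quarantine" folder name is hypothetical.
:0
* B ?? ()\/src=3D.*($.*)*.*</iframe.*
{
    # BAD holds everything matched from "src=3D" onward,
    # possibly spanning several lines of the body
    BAD=$MATCH

    # Deliver to a folder, with a local lockfile
    :0:
    quarantine
}
```

The reason "($.*)*" is needed is that in procmail's regexes "." does not
match a newline, while "$" does, so that group lets the match carry on
across line breaks.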

Now I have to confess that I made an error earlier in what I said about
quoting.  The backslash will need to be quoted wherever it is.
You can use another backslash, or use "[\]".
   
    
Dallman

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail