procmail

Re: Filtering URL In Message Body

2000-12-17 14:56:04
At 03:14 AM 12/17/00 -0600, Philip Guenther wrote:
So, to match "http://blah.blah", you would use "http://blah\.blah"
To be precise and to cite the procmail source code itself, the only

Thank you so much, Philip...you have just helped solve a MAJOR
problem here.  And I hope Santa is VERY good to YOU this year :-)

>2.  How would I code it to also filter on specific URL's which contain ANY
>number(s) ???

_Any_ numbers?  What about "http://www.3com.com/"?  Perhaps you mean _all_
numbers:

Gee...you are right...I didn't think about this.  Most of the problems involve
2 (or more) digits together *somewhere* between the "http://" & the end
(whether it be a .com or a .cn or a .com.cn URL).  At this point, I'd just like
to trash ANYTHING like http://www.163.com  http://www.21.com  or even
http://something2786.com  !!!  Can the below be tweaked to do this simply?

        :0 B
        * http://[0-9.]+([^a-z_]|$)
        /possible/spam
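
One way the recipe above might be tweaked, offered as an untested sketch
rather than a definitive answer: require two adjacent digits anywhere in the
host part of the URL, i.e. before the first "/".  The character class is an
assumption about which characters can appear in a host name; the folder is
the one from the recipe above.

        # "http://", any run of host-name characters, then two digits in a row
        :0 B
        * http://[-a-z0-9.]*[0-9][0-9]
        /possible/spam

procmail matches case-insensitively by default, so upper-case hosts are
covered too.  This should catch http://www.163.com, http://www.21.com and
http://something2786.com while leaving http://www.3com.com/ alone.  It would
also catch many IP-address URLs and any legitimate host whose name has two
digits together, so some wanted mail may be caught.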


Hmm.  When I wrote the site-wide spam filter at my last job I found
that IP-address URLs were too commonly used for legitimate (albeit

... Thanks for the other detailed recipe.  Yikes.  Since I get over 500 pieces
of this stuff per day (including the Chinese), I don't have time to sift through
"possible/spam" ... I just want to trash it all.  Anyone who wants to reach
me badly enough knows how to contact me :-)

Thanks for your help !!!

Eric


_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
