Recipe to match within text - need help

2003-05-30 11:33:15
Hi all.  I'm having trouble coming up with a recipe for what I need and I
was hoping someone here could help.  I've studied examples and FAQ pages
but can't get anywhere.

I have an HTML classified ad submission form on my website (see
http://www.immuneweb.org/classifieds/).  The submissions come through as
email and I put up the ads by hand.  Unfortunately, spammers have gotten
hold of the specs and I get 100-200 spam submissions every day.  No, that's
not a typo.  I get 5-10 legit ads per week.

Since a dozen or so keywords in the text of these emails are enough to
identify 95% of them (without ever getting a false positive), I thought I
would use procmail to sort the spam ones into a folder.  Later, when I'm
convinced the code is right, I'll sort them into the trash.  The fact that
I never ever actually post any of these ads hasn't slowed the bastards
down.

Here's an example of one of the emails:

____________________________________________

X-Coding-System: undecided-unix
Mail-from: From anonymous(_at_)immuneweb(_dot_)org  Fri May 30 17:45:18 2003
Return-Path: <anonymous(_at_)immuneweb(_dot_)org>
Delivered-To: classifieds(_at_)immuneweb(_dot_)org
Received: (qmail 3665 invoked by uid 27627); 30 May 2003 17:45:18 -0000
Date: 30 May 2003 17:45:18 -0000
Message-ID: <20030530174518(_dot_)3664(_dot_)qmail(_at_)immuneweb(_dot_)org>
From: anonymous(_at_)immuneweb(_dot_)org
Cc: recipient list not shown: ;
Reply-to:
Subject: Classified Ad Submission


to = classifieds(_at_)immuneweb(_dot_)org
subject = Noncommerical Classified Submission
form = http://www.immuneweb.org/classifieds/submitnoncom.html
admin = classifieds(_at_)immuneweb(_dot_)org
Background Info =
Real Name = Devika Rani
Real Email = devika_opps2003(_at_)yahoo(_dot_)co(_dot_)in
Ad Information =
Subcategory = Employment Offered
new category =

2 = Begin Text of Ad
Text of Ad = Finally! A Real Work @ Home Opportunity has arrived! Now you
can become an Independent Typist with Ad-Placer.com. We offer home workers the
opportunity to earn extra money from the comfort of their own.  visit
http://www.ad-placer.com/35774ads.html^M
e-mail: devika_opps2003(_at_)yahoo(_dot_)co(_dot_)in

3 = End Text of Ad
Anon box = No
Anon Box Forward =

 = General Comments
Additional =
-----------------------------------------------

____________________________________________


I started off by using this recipe:

:0:
* ^From: anonymous(_at_)immuneweb(_dot_)org
$HOME/spambouncer/blocked/classifieds

This puts *all* the classified submissions into that folder.

Now I'd like to run text matching on those emails that already match the
initial statement (from anonymous(_at_)immuneweb(_dot_)org).  But I can't
figure it out.  Everything I do causes the match to fail.  I am trying to
match with "Ad-Placer" in the text and ran several tests to no avail.  I'm
sure this is easy but I just can't figure it out.

I tried:

:0:
* ^From: anonymous(_at_)immuneweb(_dot_)org
* .*Ad-Placer
$HOME/spambouncer/blocked/classifieds

and

:0:
* ^From: anonymous(_at_)immuneweb(_dot_)org
* ^(^|[^-_0-9a-z])Ad-Placer([^a-z0-9\.]|$)
$HOME/spambouncer/blocked/classifieds

It's pretty clear I'm missing something obvious (like a command that says,
check the body of the email not just the header), but I don't know what.
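
For reference, procmail tests conditions against the header only by
default.  The B flag switches a recipe to the body, and a single condition
can be pointed at the body with a "B ??" prefix while the other conditions
keep searching the header.  A minimal, untested sketch of the first attempt
with that prefix added, assuming the folder path from the recipes above and
keeping the address as this archive displays it:

```procmail
# Sketch only, not tested against this mail setup.
# The first condition is matched against the header (the default).
# "B ??" makes the second condition search the body instead.
:0:
* ^From: anonymous(_at_)immuneweb(_dot_)org
* B ?? Ad-Placer
$HOME/spambouncer/blocked/classifieds
```

Putting B on the ":0" line instead would apply to every condition in the
recipe, so the "^From:" test would then be run against the body as well.
Matching is case-insensitive unless the D flag is given, so "Ad-Placer"
should also catch "ad-placer" in the URL.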

I would like to be able to eventually have a recipe that looks like this:

:0:
* ^From: anonymous(_at_)immuneweb(_dot_)org
and a text match on any of the following: a b c d e
Go to dev/null
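
One way that outline might be written as a real recipe is to put the
keywords into a single alternation and match it against the body with a
"B ??" prefix.  This is a hedged sketch, not a tested rule: "Ad-Placer" is
the keyword mentioned above, and the other two phrases are placeholders
lifted from the sample ad that stand in for "a b c d e".

```procmail
# Sketch only: the keyword list is a placeholder and should be replaced
# with the real terms.  The address is kept as this archive displays it.
# No lockfile is requested because the destination is /dev/null.
:0
* ^From: anonymous(_at_)immuneweb(_dot_)org
* B ?? (Ad-Placer|work @ home|independent typist)
/dev/null
```

While the keyword list is still being tuned, the last line could point at
a folder such as $HOME/spambouncer/blocked/classifieds (with ":0:" to get a
lockfile) so that nothing is discarded on a false match.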

If anyone has the time to give me a hand writing this, I'd appreciate it.

Thanks,
Cyndi

-- 
_______________________________________________________________________________
"There's nothing wrong with me.  Maybe there's                     Cyndi Norman
something wrong with the universe." (ST:TNG)        cyndi(_at_)tikvah(_dot_)com
                                                         http://www.tikvah.com/
_________________ Owner of the Immune Website & Lists http://www.immuneweb.org/

_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail