At 12:14 2002-06-05 -0400, Monah Baki wrote:
> :0
> * ^Subject:
> .*(*fun|*bills|*weight|*credit|*casino|*paid|*FREE|*save|*congrats|\
> *free|*CREDIT|*Free|*save|*PAMELA|*win|*rent|*online|*sex|*money|*income)
> /dev/null
> I was wondering if this will work???
To do what?
The regex syntax is hosed: * must follow a character or the wildcard
'.'; it cannot start an alternative. Since you already have .* at the
beginning, putting * inside each or'd condition is excessive in any event.
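For reference, a syntactically valid form of the quoted condition might look like the sketch below. It only repairs the syntax: the keywords are copied from the question, the upper/lower-case duplicates are folded together because procmail ignores case by default, and every keyword still matches as a substring of longer words.

```procmail
# Sketch only: same keywords as the quoted recipe, with the stray '*'
# removed from each alternative. The single leading '.*' already allows
# any text between "Subject:" and the keyword.
:0
* ^Subject:.*(fun|bills|weight|credit|casino|paid|free|save|congrats|\
pamela|win|rent|online|sex|money|income)
/dev/null
```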
I hope you don't get any messages like "I have a problem with X windows",
"what's the function", "saving cycles on complex regexp", etc. Consider
how broadly your keywords will apply since they'll match as substrings in
other words.
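One way to limit the substring problem is to anchor keywords with the \< and \> word-boundary operators. The following is a minimal sketch with a shortened, purely illustrative keyword list:

```procmail
# Sketch: \< and \> require a word boundary on each side, so "win" no
# longer matches inside "windows" and "fun" no longer matches inside
# "function". The keyword list is shortened for illustration.
:0
* ^Subject:.*\<(fun|win|save|free|rent)\>
/dev/null
```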
See my disclaimer for a link to information on a sandbox configuration: a
basic procmail wrapper into which you can include test recipes, then throw
messages at them and examine a log, all while leaving your REAL email
safely alone until you're confident that a recipe will work properly. A
sandbox will let you answer many of these questions yourself.
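As a rough idea of what such a wrapper can look like, here is a hypothetical sketch. It is NOT the configuration from the disclaimer page; every path and file name in it is a placeholder chosen for illustration.

```procmail
# sandbox.rc: hypothetical test wrapper (all names are placeholders).
#
# One way to feed it a saved mailbox, one message at a time:
#   formail -s procmail ./sandbox.rc < test.mbox

# Work in a scratch directory, which must already exist.
MAILDIR=$HOME/procmail-sandbox

# Anything the test recipes do not catch lands here, not in the real inbox.
DEFAULT=$MAILDIR/unmatched

# Examine this log after each run; VERBOSE adds per-condition detail.
LOGFILE=$MAILDIR/sandbox.log
VERBOSE=yes

# Pull in the recipes under test.
INCLUDERC=$MAILDIR/test-recipes.rc
```

Because DEFAULT and LOGFILE point into the scratch directory, a broken test recipe should only affect copies of messages, never the live mailbox.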
You should consider using scoring ('man procmailsc'): you can count up
points for each time a certain word appears in the subject (or anywhere
else for that matter), and if the total exceeds some threshold, you ditch
the message. You can carry over scoring from one recipe to another, so you
could add up the score for subject keywords, stuff it into a variable,
then run other tests and add the subject score to THOSE tests, and if the
combined total exceeds your threshold, you toss the message.
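A minimal sketch of carrying a score from one recipe to the next is shown below. The keywords, weights, threshold, variable name SUBJ_SCORE and folder name are all made up for illustration; $= is procmail's expansion for the score of the last recipe.

```procmail
# Sketch with made-up keywords, weights and threshold.
SUBJ_SCORE=0

# Pass 1: score the subject only. The action block runs when the score
# is positive, and $= then holds the score of this recipe.
:0
* 20^0 ^Subject:.*casino
* 20^0 ^Subject:.*lottery
{
  SUBJ_SCORE=$=
}

# Pass 2: start 45 points in the hole, add the carried-over subject
# score, then add one further test. The message is filed only if the
# total is still positive.
:0:
* -45^0
* $ ${SUBJ_SCORE}^0
* 30^0 ^Content-Type:.*text/html
spam
```

With these numbers, one subject keyword plus an HTML body gives 20 + 30 - 45 = 5, which is enough to file the message, while either test on its own is not.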
A simple example follows (note that for various reasons elsewhere in my
procmail config, I extract the subject into a discrete variable, so unless
you're doing the same, this won't work as-is for you). Additional material
has been excluded from this, and it is just one of _many_ spam tests I
run. Specific scoring can be adjusted to preference: see the sandbox
config I present, use formail to split an existing mailbox and feed it
through this recipe in the sandbox, and tweak as necessary.
Some keywords and phrases you might normally ditch right away aren't
high-scored here merely because on a forum in which I participate, some of
the members discuss junkmail, and in doing so, they not infrequently
utilize some of the keywords in the subject.
Note particularly the relatively low scoring value for a slew of common
keywords at the beginning - these individually won't trigger as spam, but
they'll _ADD_ to the total, and they start to add up quickly if some of
those words appear multiple times.
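To make the arithmetic behind that concrete, here is a small sketch of how a w^x weight accumulates, following the description in procmailsc(5). The SUBJECT extraction and the three keywords are assumptions made for illustration, not part of the configuration discussed here.

```procmail
# Assumed setup: copy the subject text into SUBJECT using the \/ match
# operator and the MATCH variable.
:0
* ^Subject:[ ]*\/.*
{
  SUBJECT=$MATCH
}

# With a weight of 25^1.7, successive matches are worth:
#   1st match:  25
#   2nd match:  25 * 1.7       = 42.5
#   3rd match:  25 * 1.7 * 1.7 = 72.25
#   total after three matches  = 139.75
#
# Against a starting credit of -135, one or two matches leave the score
# negative (-110, then -67.5); a third pushes it to +4.75.
:0
* -135^0
* 25^1.7 SUBJECT ?? (insurance|perfect|deadline)
{
  LOG="low-value keywords alone scored $="
}
```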
# Start with a negative credit.
:0
* -135^0
* -10000^0 SUBJECT ?? Some-specific-string-used-in-spam-reporting.
* 120^0 SUBJECT ?? [ ][ ]+(\[|\(|<)?[0-9][0-9][0-9]+(\]|\)|>)?[ ]*$
* 25^1.7 SUBJECT ?? (hardcore|affordable|better|insurance|adult|picture|\
gallery|\<new\>|\<works\>|looking|database|report|search engine|\
library|internet|web(\-|\ |)hosting|domains|\<need\>|\<quick\>|\
forget|remind|unique|\<play\>|\<pay\>|dominate|\<beat\>|\<heat\>|\
announcement|\<deep\>|\<call\>|\<pressure\>|worldwide|\<only\>|\
\<help\>|\<big\>|solution|\<attend\>|invitation|invited|perfect|\
system|package|consumer|effective|affordable|extension|deadline|\
anything|anyone|results|potential|traffic|travel|welcome|attract|\
material|forensic)
* 45^2 SUBJECT ?? (for\ more|b2b|too\ much|for\ the\ price|\
your\ interest|credibility|your\ homepage|health(\ |\-|)care|\
is\ now\ live|PR\ package|learn\ how|order\ online|girls)
* 250^2 SUBJECT ?? (\<sex|\<xxx\>|porn|\<gay\>|erotic|orgy|\<hiv\>|\
\<aids\>|viagra|sperm|\<jiz|\<jism|\<cum\>|orgasm|lesbian|\
cum\ *shot|get\ it\ up|sex\ drive|lingerie|\
(adult|nude|live)\ (streaming\ |)(video|feed)|\
live\ *(show|chat|sex)|get\ off|(adult|date)\ (line|site)|\
over\ *21|adults\ *only|phone\ *sex)
* 90^2 SUBJECT ?? (hottest|bigger|harder|subliminal|everyday)
* 50^2 SUBJECT ?? (tattoo|ugliest|babe|wiggle|jiggle|\<tight\>|\
\<ass(hole)?\>|\<huge\>|\<tits\>|\<cock\>|\<wet\>|lust|\<farm\>|\
suck|swallow|choke|nuts)
* 200^2 SUBJECT ?? (aphrodisiac|pheromone|androstendione|\
androstenedione|dhea|sexual power|steroid|enlargement|impotency|\
instant sex appeal)
* 200^2 SUBJECT ?? ((barely\ *legal|nude|wet|young|live|hot|shaved|\
hairless)*\ *(teen|pussy|cunt|slut|whore))
* 200^2 SUBJECT ?? ((attract\ (and|\&)\ seduce|pick(\-|\ )?up)\ women)
* 200^2 SUBJECT ?? (stronger\ ((and|\&)\ multiple\ )?orgasms)
* 200^2 SUBJECT ?? (dressing\ room|(hidden|voyeur)\ cam(|s|era)|\
grandmother.*fuck)
* 100^2 SUBJECT ?? (abduction|forced|unwilling|kidnapped|abused|rape|\
incest|violated)
* 50^2 SUBJECT ?? (toilet|beastiality|(animal|zoo)\ sex|\
golden\ showers|urine|\<pee\>)
* 50^2 SUBJECT ?? (mischievous|forbidden|outlawed|illegal|havoc|steal|\
drug|fraud)
* 45^1 SUBJECT ?? (zaprosz|zaproszen|oferta)
* 90^2 SUBJECT ?? (\<free\>|wholesale|today|plus|discount|clearance|gift)
* 60^1 SUBJECT ?? (priority|portal|compete|placement|immigration|\
attention|alert)
* 60^3 SUBJECT ?? (affiliate|referal|program)
* 100^2 SUBJECT ?? (\<win\>|offshore|\<prize\>)
* 200^2 SUBJECT ?? (You\ Won\ \$)
* 200^2 SUBJECT ?? (casino|lotto|lottery|gambling|betting|playoff|\
beat\ the\ slots|slot\ *machine)
* 500^2 SUBJECT ?? (casino\ *software)
* 250^2 SUBJECT ?? (\<AD(V|)\>)
* 45^2 SUBJECT ?? (psychic)
* 90^2 SUBJECT ?? (advert|delete|bulk(\-|\ )*email|promotion|call\ now)
* 90^2 SUBJECT ?? (stock|investigat(or|ion|e)|secret|confidential|\
weapon|Internet\ Spy|background|password)
* 90^2 SUBJECT ?? (stock\ (tip|market|offer))
* 75^1 SUBJECT ?? (\<hot\>|pricing|expand|offer|exciting|revolutionary|\
important|information|unlimited|limited\ time|easiest|fantastic|\
ultimate|unlock|affordable|flat\ rate|universal)
* 50^3 SUBJECT ?? (\<save\>|\<slash\>|\%)
* 60^0 SUBJECT ?? (E\ N\ O\ U\ G\ H)
* 45^3 SUBJECT ?? (\<cell(ular|)\>|reception|range)
* 100^1 SUBJECT ?? (india|china|taiwan)
* 50^3 SUBJECT ?? (online|advertising)
* 75^1 SUBJECT ?? (buy(ing|)\ on(-|\ |)line|pre-registration|lowest\ price)
* 500^2 SUBJECT ?? (homebiz|ca\$h|zero\ down|\
Home(\-|\ )Based\ (Biz|business)|work\ (at|from)\ home|\
financial\ freedom|downline|mlm)
* 500^2 SUBJECT ?? (You\ Have\ Won|You\ Have\ Been\ Chosen|\
would\ you\ like\ to|don't\ want\ you\ to\ know|open\ this\ letter|\
change\ your\ life|easy\ way)
* 200^2 SUBJECT ?? ((toner|printer)\ (supplies|cartridges))
* 100^2 SUBJECT ?? (accept(ing|)\ (credit|checks)|merchant\ account|\
toner|credit|get\ paid|pay\ you|mortgage)
* 75^3 SUBJECT ?? (targeted\ e(\-)*mail|campaign)
* 100^2 SUBJECT ?? (revenue|lifetime|guaranteed(\ (results|return))|\
growth\ potential)
* 90^2 SUBJECT ?? (special|sponsor|supplies|cash|improve|cost|increase|\
reciprocal|UNBELIEVABLE|savings)
* 90^4 SUBJECT ?? (\<invest(ment|or|ing|)\>|business|income|opportunity)
* 90^4 SUBJECT ?? (biz\ op|venture)
* 45^0 SUBJECT ?? (wealth|virtual|value)
* 75^2 SUBJECT ?? (Wait\ Is\ Over|voted \#1)
* 75^2 SUBJECT ?? ((easy|free|earn|extra|)\ *money|need\ cash)
* 75^2 SUBJECT ?? (revenue|expense|profit|\<earn\>|purchas(e|ing)|\
prospects|expert|powerful|recruiting|contact|survey|partner|positive)
* 200^1 SUBJECT ?? (residual|please\ read|can't\ lose|expand\ your|\
don\'t\ delete|free\ info|(truly|really)\ works|money\ making|\
traffic\ builder|(as\ |)seen\ on|Advertising\ that\ works)
* 100^1 SUBJECT ?? (The\ Contrarian|congratulation)
* 120^1 SUBJECT ?? (dollars|million|thousand|\.INFO|\.NAME|\<4\ *U\>)
* 200^1 SUBJECT ?? (make\ (lots\ of\ )*money|debt\ free|out\ of\ debt|\
great\ credit|credit\ history|pre\ paid\ legal|\
private\ (and|\&)\ confidential)
* 100^1 SUBJECT ?? (no\ cost|making\ money|email\ addresses|\
get\ yours|internet\ marketing)
* 250^1 SUBJECT ?? (Weight\ *Loss|lose\ *weight|non(\-|\ |)smoker|\
homeopathic|all\ natural)
* 150^1 SUBJECT ?? (global\ friends|domain\ extensions|\
keyword\ analysis|magazine\ subscription)
* 200^1 SUBJECT ?? ((urgent\ *(\&|and)|very)\ *confidential|\
web\ portal|within\ our\ portal|find\ out|one\ of\ a\ kind|\
JUST\ RELEASED)
* 100^1 SUBJECT ?? (link\ exchange)
* 100^0.75 SUBJECT ?? (Platinum|eMarketing|e(-|)biz|\
sales\ (\&|and)\ marketing|easy\ (\&|and)\ safe|for\ your\ clients)
* 120^1 SUBJECT ?? (\,000|0000|\.00|\.95|\.99|\%\ off|mega|on\ *cd(-|)rom)
* 75^1 SUBJECT ?? (monthly|weekly|/mo|/yr|/wk|\
(per|every|each|paid) (month|year|week|quarter))
* 250^1 SUBJECT ?? ([0-9]+\ *(cpm|(c/|cent(s|)\ *(/|per))\ *(min|mn)))
* 25^1 SUBJECT ?? (cent)
* 120^1 SUBJECT ?? (long\ distance|cable\ TV|satellite|\<dss\>|descrambler)
{
LOG="SPAM: Subject Scoring match $=$SPAMVER"
:0:
|gzip -9fc>>$MAILDIR/spam.gz
}
> Is procmail case sensitive?
No. FTR, this is made quite clear in the manpages if you take the time to
read them.
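If you do want case to matter for a particular recipe, the D flag described in 'man procmailrc' turns on case-sensitive matching. A small sketch, with hypothetical folder names:

```procmail
# With the D flag the pattern is case sensitive, so this catches only
# the all-caps spelling. Note that "Subject:" must then match exactly
# as written as well.
:0 D:
* ^Subject:.*\<FREE\>
shouting-spam

# Default behaviour: case is ignored, so this matches "free", "Free"
# and any other capitalisation that got past the recipe above.
:0:
* ^Subject:.*\<free\>
maybe-spam
```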
---
Sean B. Straw / Professional Software Engineering
Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
Please DO NOT carbon me on list replies. I'll get my copy from the list.