procmail

Re: any good patterns for catching 419 spam?

2003-06-07 02:46:18
On Fri, Jun 06, 2003 at 07:17:39PM -0700, Andrew Edelstein wrote:

On Fri, Jun 06, 2003 at 04:53:04PM -0500, David W. Tamkin wrote:

The volume of 419 spam I'm getting has suddenly skyrocketed, and
I'm sick of deleting it manually.  Does anyone have a good way for
procmail to spot it?  Wordings vary, and some versions are rife with
misspellings.

Any of the decent anti-spam programs out there will catch
it.  SpamAssassin does a nice job, and as I recall SpamBouncer handled
them fine when I was still using it; I'm sure she's improved that recipe
since then.

Yes, I catch almost all of it with my headers-only traps that catch all
other spam, too.  I have a one-in-a-thousand false-negative rate (spam
hitting my inbox), so I must be doing something right.

I do have a small fallback body check for odd cases when the header
checks don't catch something but the smell-test $TRUST factor is still
low.  That small percentage of mail also gets shunted through SpamAssassin.
Occasionally, the 419s get that far; but no further.  I see from my logs
that SA ran three times in the last week.  I got and caught about 1,750
spams in the same period.  I actually had two false negs this week, my
worst showing in that category in a couple of months.  (And there were
two false pozzes.)  The FNs breached a heretofore-undiscerned logic
problem in my read-receipts accept-engine, now fixed.
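
A rough sketch of that kind of fallback, for anyone who wants to try it.
This is not my actual recipe: it assumes a hypothetical TRUST variable that
earlier header checks set to the string "low", and a hypothetical Maildir
folder named spam/.  The "| spamassassin" filter and the X-Spam-Status
header are the stock SpamAssassin ones.

```
# Hypothetical: TRUST is assumed to be set to "low" by earlier
# header checks that could not classify the message.
:0 fw
* TRUST ?? ^^low^^
| spamassassin

# File whatever SpamAssassin tagged; "spam/" is a made-up folder name.
:0
* ^X-Spam-Status: Yes
spam/
```

The point of the condition on the filter recipe is that SA only runs on
the small residue the cheap header checks leave undecided.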

My body-check fallback for 419s is not very well-developed, because I've
needed it so rarely.  But I'm of the view that less is better than more.
A computer shouldn't have to check every conceivable thing I as a human
can think of when I see such mail.  It should only have to check what works.
That said, I'm almost embarrassed by how ugly these are.  I just haven't
had the time or need to rework them.

 :0 B $E  # 030508 () Nigeria scam
  * $       MYTAG  ??  $FALSE
  *    6^0  ()\<US ?(D(ollars?)?|\>).*\$?[0-9]+((,000)+(\.00)?|.[0-9]+)(M|\>).* ()
  *   -6^0  (budget|military)
  *    6^0  ()\<sum of US\$[0-9]+\>?M
  *    3^0  ()\$[0-9][0-9] ?M(.*$)+.*\<encumb
  {
     SPAMISH = $=
     RX = "${RX:+$RX, }UBE.B.SPAMISH:B=$SPAMISH"
  }

 :0 B $E D  # 030508 () another swipe at Nigeria scam
  * 6^0  ()\(MR([S. ]|$)
  {
     SPAMISH = $=
     RX = "${RX:+$RX, }UBE.B.SPAMISH:B=$SPAMISH"
  }


As I said, not highly developed, as mostly not needed.  The spam gets
caught by the header checks in most every case.

Note that I discount "budget|military"; I once had a false pozz
forwarding myself a Usenet post about millions of dollars of
military spending.  :-)  The 419s don't seem ever to use those
two quoted words, though.
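
For anyone unfamiliar with procmail scoring, here is a stripped-down
illustration of how the positive and negative weights in the recipes above
play against each other.  It is only a sketch with a simplified pattern,
and SCORE is a made-up variable name.

```
# Sketch only.  Each "w^0" condition adds w at most once (exponent 0),
# and the recipe succeeds only if the final total is positive.
:0 B
*  6^0  \<sum of US\$[0-9]+
* -6^0  (budget|military)
{
     # $= expands to the score of the most recent recipe
     SCORE = $=
     LOG = "419 body score: $SCORE
"
}
```

With only the first condition matching, the total is 6 and the block runs.
If "budget" or "military" also appears, the total drops to 0, which is not
positive, so the message is left alone.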

-- 
dman

_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
