procmail

Re: Need recipe to match ALL-NUMERIC addresses @whatever

1997-08-25 21:59:22
At 08:06 PM 8/25/97 -0700, The Doctor {Who?} wrote:

So, what I need to match is stuff like:

143671@msn.com
03985932865@mci.com
38957272@compuserve.com

[not]
25463.254@compuserve.com
652-2546@mcimail.com

The following is from my anti-spam rules.  I filter out ONLY 8-digit
numerics (popular in the current crop of junkmails).

# Okay, if the From contains an 8-digit numeric-only address, ditch it
# as spam (this seems to be a new popular spammage technique - an 8-digit
# random number).
:0: spam$LOCKEXT
* ^From:[  ][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]@.*
|/bin/gzip -9fc>>$MAILDIR/spam.gz

(the [  ] is space+tab)


This filter and the others like it are placed above my
egrep-against-massive-domain-file recipes, so that if the message matches
this simple (fast) regexp, I don't incur the overhead of the egrep
(slow).

To match a variable number of digits, modify the match line like so:

* ^From:[  ]*[0-9]+@.*

It should then match a numeric address of any length.
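
As a rough sanity check of the variable-length expression, here is a minimal
sketch that runs it against the five sample addresses from the question.  It
assumes Python's `re` module as a stand-in for procmail's own matcher (the two
engines are not identical, although this pattern uses only constructs both
support), and it assumes a bare "From: address" header with no display name.
The helper name is_all_numeric is made up for illustration.

```python
import re

# Python's `re` stands in for procmail's matcher here (an assumption);
# [ \t] spells out the space+tab bracket from the recipe.
ALL_NUMERIC = re.compile(r"^From:[ \t]*[0-9]+@.*")

# Addresses the original question wants caught.
should_match = [
    "143671@msn.com",
    "03985932865@mci.com",
    "38957272@compuserve.com",
]

# Addresses the original question wants left alone.
should_not_match = [
    "25463.254@compuserve.com",
    "652-2546@mcimail.com",
]


def is_all_numeric(address):
    """Return True if a bare 'From: address' header matches the pattern."""
    return ALL_NUMERIC.match("From: " + address) is not None


for addr in should_match + should_not_match:
    print(addr, is_all_numeric(addr))
```

The dotted and hyphenated addresses fail because the digit run must be
followed directly by the @.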

The following might work if passed to a newer egrep (but won't work
directly from procmail).  I offer this only because I'm not personally keen
on the idea of whacking all numeric addresses:

* ^From:[  ][0-9]{3,10}@.*

This should match a run of 3 to 10 digits.

The following should do about the same thing (and IS compatible with
procmail), but obviously requires a bit more typing on your part:

* ^From:[  ][0-9][0-9][0-9][0-9]?[0-9]?[0-9]?[0-9]?[0-9]?[0-9]?[0-9]?@.*
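
For anyone wanting to compare the interval form with the spelled-out form,
the following sketch checks which digit-run lengths each one accepts.  Again
this uses Python's `re` as a stand-in, so it says nothing about whether any
particular egrep or procmail build handles {3,10}; it only checks that the two
expressions describe the same set of lengths.  The address at example.com and
the helper name matched_lengths are invented for the test.

```python
import re

# Interval form (the one suggested for a newer egrep).
INTERVAL = re.compile(r"^From:[ \t][0-9]{3,10}@.*")

# Expanded form: three required digits followed by seven optional ones.
EXPANDED = re.compile(r"^From:[ \t]" + "[0-9]" * 3 + "[0-9]?" * 7 + r"@.*")


def matched_lengths(pattern, max_len=12):
    """Return the digit-run lengths (1..max_len) that the pattern accepts."""
    return [
        n
        for n in range(1, max_len + 1)
        if pattern.match("From: " + "7" * n + "@example.com")
    ]


print(matched_lengths(INTERVAL))
print(matched_lengths(EXPANDED))
```

Both lines should list the lengths 3 through 10 and nothing else.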

Anyone is welcome to chime in with improvements or corrections to these
expressions, or even a confirmation of whether the interval regex feature
works.

Me - I'll be sticking to exactly-8-digit-uid = spam, as I have yet to
catch any spam from a numeric address that is other than 8 digits.  Along with
X-UIDL, Message-ID, spam mailer, to/from, and domain rulesets (among other
things), this works acceptably well for me.

I should also point out that with the other rules I have in place, this one
rule hasn't specifically caught anything not already found by the various
other rules (I can test this by modifying the output file for that rule,
which appears after other non-address specific anti-spam rules, and piping
the gzipped spam mailbox into a formail->procmail to check it out).

---
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.

 Sean B. Straw / Professional Software Engineering
 Post Box 2395 / San Rafael, CA  94912-2395