procmail

Re: Trapping non-standard URL's in spam

1999-03-21 04:06:06
On Sun, 21 Mar 1999 10:15:24 GMT, waltdnes(_at_)interlog(_dot_)com (Walter Dnes)
wrote:
 NONSTANDARD="(0x[0-9a-f]+|0[0-7]+)"
 :0fb
 *  1^0 http:(//|//.*@)[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]
 *  1^0 http:(//|//.*@)0x[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]
 *$ 1^0 http:(//|//.*@)${NONSTANDARD}\..*\..*\..*
 *$ 1^0 http:(//|//.*@).*\.${NONSTANDARD}\..*\..*
 *$ 1^0 http:(//|//.*@).*\..*\.${NONSTANDARD}\..*
 *$ 1^0 http:(//|//.*@).*\..*\..*\.${NONSTANDARD}
 | formail -A "X-Reject: Non-standard URL format; often used by spammers"

If you are using scoring merely as a convenient way to do OR, you can
get "shortcut" behavior by using a very large weight instead of 1:
once the score reaches procmail's maximum, the remaining weighted
conditions are skipped. The canonical large number in this context
seems to be 9876543210.
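
As an untested sketch of that idea, here is the quoted recipe with
9876543210 substituted for each weight of 1. Only the weights differ;
the flags, patterns and action are copied unchanged from the recipe
above.

```procmail
NONSTANDARD="(0x[0-9a-f]+|0[0-7]+)"

# Each condition carries the "large number" weight, so the first one
# that matches should push the score to the maximum and let procmail
# skip the rest.
:0fb
*  9876543210^0 http:(//|//.*@)[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]
*  9876543210^0 http:(//|//.*@)0x[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]
*$ 9876543210^0 http:(//|//.*@)${NONSTANDARD}\..*\..*\..*
*$ 9876543210^0 http:(//|//.*@).*\.${NONSTANDARD}\..*\..*
*$ 9876543210^0 http:(//|//.*@).*\..*\.${NONSTANDARD}\..*
*$ 9876543210^0 http:(//|//.*@).*\..*\..*\.${NONSTANDARD}
| formail -A "X-Reject: Non-standard URL format; often used by spammers"
```

The outcome should be the same as with weights of 1; the difference
is only that later conditions need not be evaluated once one has
matched.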

3) through 6) Check for non-standard (non-base-10) notation
   in each of the quads separately, just in case spammers
   try to mix-n-match.

The octal check is somewhat likely to accidentally trigger on
uninteresting leading zeros. I've seen instances of IP addresses where
.0. is written .00. for some obscure reason. (I didn't even realize
this was an "octal zero" ... perhaps the app which does this doesn't
know either, in which case you could expect interesting failures for
the numbers 08 and 09 :-)
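
One possible way to cut down on those accidental hits, as an untested
sketch: require at least one non-zero digit in the octal alternative,
so that a padded zero such as .00. no longer matches. The variable
name NONSTANDARD comes from the quoted recipe; the tightened pattern
is an assumption, not part of the original.

```procmail
# Octal alternative now needs a non-zero digit after the leading
# zero(s): 00 is ignored, while 0377 and 010 still match.
NONSTANDARD="(0x[0-9a-f]+|0+[1-7][0-7]*)"
```

This does not help with zero-padded decimal quads such as .010.,
which still look like octal to the pattern.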

/* era */

-- 
.obBotBait: It shouldn't even matter whether     <http://www.iki.fi/era/>
I am a resident of the state of Washington. <http://members.xoom.com/procmail/>
 * Sign the European spam petition! <http://www.politik-digital.de/spam/en/> *
