procmail

Re: One-Word-Spam on nearly all Mailinglists

2007-04-30 04:36:52
On Mon, Apr 30, 2007 at 12:41:18PM +0200, Ruud H.G. van Tol wrote:

Dallman Ross wrote:

                :0 B:
                * $! [^$WS][$WS]+[^$WS]
                ONEWORD

How
about
a 
funky
body 
that
looks 
like 
this 
?

--<<<--==+X[2266df7a330c921e0d416be1cc667191]
Content-Type:text/plain;charset="iso-8859-1";format=flowed
Content-Transfer-Encoding:7bit
[snip similar]

I already explicitly acknowledged those would be caught, Ruud.
It's not a surprise vis-a-vis my coding intentions.  Michelle
doesn't want spam.  I would argue that 99.999% of any such body
content will be spam.  It's a lagniappe, in other words.

Last night I wrote (when I added in the missing "NOT"):

 I didn't bother to add line breaks, because if
 a message is just one word (or none) per line
 over more than one line and then ends, it's also
 highly suspect.

That is exactly what you just showed.

And this morning I added in followup:

 One other thing to realize about this, which I thought of last
 night but didn't mention, is that it would also capture some
 non-Western-charset messages that have only one part (i.e., not
 multipart messages).  If they don't have any spaces in the body,
 they will be caught.  Running the recipe against a couple hundred
 of my most recent spam messages, I catch three Japanese-charset   
 messages.  Depending on what you want, that might or might not be
 considered a boon or lagniappe.  To avoid that, well, put an
 appropriate limiting condition in the recipe.
 

So I anticipated your "objection" already, in other words.  If the
person using the algorithm knows what it will do and wants just
that result, then it's not a problem.  Of course, you are right
to show that one needs to think about other consequences to an
algorithm, whether intended or unintended. :-)
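
The "appropriate limiting condition" mentioned in the follow-up quoted
above could be sketched roughly as below.  This is only an illustration,
not part of the original recipe: the charset list is an assumed example
of what one might want to exempt, and $WS is assumed to be set earlier
in the rcfile (conventionally to a space and a tab).  The outer recipe
tests the headers, so the body test runs only for messages that do not
declare one of the listed charsets.

```procmail
# Sketch only: the charset list is an assumed example, adjust to taste.
# $WS must already be set earlier in the rcfile (space and tab).
:0
* ! ^Content-Type:.*charset="?(iso-2022-jp|shift_jis|euc-jp)
{
        :0 B:
        * $! [^$WS][$WS]+[^$WS]
        ONEWORD
}
```

Messages that declare no charset at all still reach the body test, so
this narrows the side effect rather than removing it.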

Dallman

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail