ietf-asrg

Re: [Asrg] Filtering spam by detecting 'anti-Bayesian' elements?

2004-09-17 18:32:07
On Sep 17 2004, Jim Witte wrote:
   Has anyone tried making a partial spam filter by scanning messages 
for the nonsense words they put in to try to confuse the Bayesian 
filters?

Those nonsense words backfire with Bayesian filters. Since nonsense
words don't occur in legitimate messages (how is the spammer going to
force people to add such words?), all such words tip the balance
towards spam. When a filter looks up the words to see if they exist in
its database, then either the word is completely new or it has already
occurred in spam.

It's very hard for a spammer to keep creating new nonsense words all
the time.  After a while, some spammer (not necessarily the same) has
already used that nonsense word, and the filter knows about it.
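
To make the lookup argument concrete, here is a minimal sketch of a
token-based Bayesian filter. It is not taken from any particular open
source filter: the class name, the smoothing constants, and the choice
to treat a never-seen token as carrying no evidence are all assumptions
made for illustration.

```python
import math
from collections import Counter


class TinyBayesFilter:
    """Hypothetical toy filter: per-token counts over spam and ham messages."""

    def __init__(self):
        self.spam_counts = Counter()  # token -> number of spam messages containing it
        self.ham_counts = Counter()   # token -> number of ham messages containing it
        self.n_spam = 0
        self.n_ham = 0

    def train(self, text, is_spam):
        # Count each distinct token once per message.
        tokens = set(text.lower().split())
        if is_spam:
            self.spam_counts.update(tokens)
            self.n_spam += 1
        else:
            self.ham_counts.update(tokens)
            self.n_ham += 1

    def token_spam_prob(self, token):
        s = self.spam_counts[token]
        h = self.ham_counts[token]
        if s + h == 0:
            return None  # completely new word: nothing in the database (assumed neutral)
        # Smoothed per-class frequencies, so the result is never exactly 0 or 1.
        p_spam = (s + 1) / (self.n_spam + 2)
        p_ham = (h + 1) / (self.n_ham + 2)
        return p_spam / (p_spam + p_ham)

    def score(self, text):
        # Sum log-odds over the known tokens, then map back to a probability.
        log_odds = 0.0
        for token in set(text.lower().split()):
            p = self.token_spam_prob(token)
            if p is None:
                continue
            log_odds += math.log(p) - math.log(1.0 - p)
        log_odds = max(-50.0, min(50.0, log_odds))  # keep exp() in range
        return 1.0 / (1.0 + math.exp(-log_odds))
```

Once a nonsense word such as "xqzvt" has been trained on in a single
spam message, token_spam_prob("xqzvt") rises above 0.5, because the word
has a spam count and no ham count. Reusing it in a later message then
raises that message's score instead of lowering it.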

I don't know of any open source statistical filter that has serious
trouble with nonsense words over time. They stick out like a sore
thumb. It's likely that the real purpose for including them is to
evade hash/signature filters.
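
As a rough illustration of why random words would help against a
hash/signature filter, the sketch below fingerprints a message body with
SHA-256 after light normalization. The exact-match digest and the helper
names are assumptions for illustration only; deployed signature systems
may use fuzzier checksums that tolerate small changes.

```python
import hashlib
import random
import string


def body_signature(body):
    """Hypothetical exact-match signature: SHA-256 of the normalized body."""
    normalized = " ".join(body.lower().split())  # fold case and whitespace
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def add_nonsense(body, n_words=3, rng=None):
    """Append n_words random 8-letter lowercase 'words' to the body."""
    rng = rng or random.Random()
    words = [
        "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
        for _ in range(n_words)
    ]
    return body + "\n" + " ".join(words)
```

Two copies of the same spam with different appended words produce
different signatures, so a blocklist of exact digests misses the second
copy. The sales pitch itself is unchanged, and so are the tokens in it
that a statistical filter has already learned.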


-- 
Laird Breyer.

_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg