[ietf-dkim] Bayesian filters are the pits

2006-08-22 13:00:42
I have been looking through some of the responses from the spamming community 
to SPF. I conclude that the real problem here is that people using naive 
Bayesian filtering don't have a clue. 
 
The problem with the Bayesian approach is that it is very vulnerable to 
counter-programming by spammers. So when SPF started to gain traction the 
spammers realized that they could deter adoption of SPF by simply introducing 
SPF data into their systems. Once most of the mail carrying SPF data in the 
training corpus was spam, the naive Bayesian schemes would generate a large 
negative score for having SPF data present. 
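
To illustrate the effect, here is a minimal sketch of how a naive Bayesian 
per-feature score flips sign when spammers start publishing SPF. The function 
name, the add-one smoothing and all of the message counts are assumptions made 
up for illustration; they are not measurements and not any particular filter's 
implementation. Positive scores favour legitimate mail, negative scores favour 
spam.

```python
import math

def feature_log_odds(ham_with, ham_total, spam_with, spam_total):
    """Log-odds contribution of a feature being present.

    Positive favours legitimate mail, negative favours spam.
    Add-one smoothing keeps the ratio finite for unseen features.
    """
    p_given_ham = (ham_with + 1) / (ham_total + 2)
    p_given_spam = (spam_with + 1) / (spam_total + 2)
    return math.log(p_given_ham / p_given_spam)

# Hypothetical corpus before spammers react: SPF is rare in spam.
before = feature_log_odds(ham_with=300, ham_total=1000,
                          spam_with=10, spam_total=1000)

# Same ham traffic, but spammers now publish SPF for most of their mail.
after = feature_log_odds(ham_with=300, ham_total=1000,
                         spam_with=900, spam_total=1000)

print(f"SPF-present score before jamming: {before:+.2f}")
print(f"SPF-present score after jamming:  {after:+.2f}")
```

Nothing about the legitimate senders changed between the two runs; only the 
spammers' behaviour did, and that alone is enough to turn the feature into a 
penalty.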
 
The solution to this problem is mostly a matter of marketing communications 
rather than technology.
 
 
First we need to get across the fact that spam filtering companies do not in 
general use the naive Bayesian filtering approaches popularized by Paul Graham 
and promoted by the conference at MIT. 
 
Second we need a simple fix to deter the 'jamming' attack by spammers: a rule 
that says that certain features can never result in negative scores. 
SPF/Sender-ID and DKIM should be among them. 
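
A rule of that kind could be sketched roughly as follows. The feature names, 
the PROTECTED_FEATURES set and the additive scoring are hypothetical, chosen 
only to show the clamp; positive scores favour legitimate mail and negative 
scores favour spam.

```python
# Hypothetical names for the authentication features that must never be
# allowed to count against a message.
PROTECTED_FEATURES = {"spf_present", "senderid_present", "dkim_present"}

def combine_scores(feature_scores):
    """Sum per-feature scores, flooring protected features at zero.

    feature_scores maps a feature name to its learned score
    (positive favours legitimate mail, negative favours spam).
    """
    total = 0.0
    for name, score in feature_scores.items():
        if name in PROTECTED_FEATURES:
            # A jammed feature can lose its benefit but never become a penalty.
            score = max(score, 0.0)
        total += score
    return total

# The learned SPF score has been driven negative by spammers; the clamp
# neutralises it while ordinary content features still apply in full.
jammed = combine_scores({"spf_present": -1.1, "suspicious_phrase": -2.0})
clean = combine_scores({"dkim_present": 1.5, "suspicious_phrase": -2.0})
```

With the clamp in place the worst a spammer can do by adopting SPF or DKIM is 
to make the feature worthless, which removes the incentive for receivers to 
treat adoption as a bad sign.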
 
Third we need to promote the idea that the existence, or even the validity, of 
a DKIM header matters far less than the domain that is claiming 
responsibility. If you can't correlate the domain to some form of additional 
information you should ignore the record entirely.
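
In code, the idea might look something like the sketch below. The 
KNOWN_DOMAINS table, its values and the function name are invented for 
illustration; where the additional information about a domain comes from 
(a local whitelist, an accreditation service, past history) is left open 
here, as it is in the text above.

```python
# Hypothetical reputation data keyed by the signing domain.
# Positive favours legitimate mail, negative favours spam.
KNOWN_DOMAINS = {
    "example.com": 2.0,
    "bulk.example.net": -3.0,
}

def dkim_contribution(signature_valid, signing_domain, known=KNOWN_DOMAINS):
    """Score a DKIM signature by who signed, not by the fact of signing.

    An invalid signature, or a valid one from a domain we know nothing
    about, contributes nothing at all.
    """
    if not signature_valid:
        return 0.0
    # Unknown domains fall through to a neutral score of 0.0.
    return known.get(signing_domain.lower(), 0.0)

trusted = dkim_contribution(True, "example.com")
unknown = dkim_contribution(True, "throwaway.example.org")
broken = dkim_contribution(False, "example.com")
```

Because an unknown domain always scores zero, a spammer gains nothing by 
signing with a freshly registered domain, and a learning filter never sees 
"has a DKIM header" as a feature it could be trained against.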
 
 
 
_______________________________________________
NOTE WELL: This list operates according to 
http://mipassoc.org/dkim/ietf-list-rules.html