ietf-asrg

Re: [Asrg] Filtering spam by detecting 'anti-Bayesian' elements?

2004-09-21 08:19:47
On Tue, Sep 21, 2004 at 07:08:37PM +1000, Laird Breyer wrote:
> Heh, you're right of course. Moreover, Markus, to whom I replied, lives
> in Germany according to his sig. Double whammy ;-)

:-)
For a lot of our customers who have <5% of their email communication in
English, SpamAssassin (after some training) works like a charm, with very
high success rates (and, from my feeling, an overly high false positive
rate whenever a legitimate message in English comes through).
However, we're seeing a rise in German-language spam lately, so those
rates may change rather soon.

> In the short term yes, but in the long term (ie with training), the
> footer is recognized. No miracles. As a general rule, tokens which
> occur commonly in both ham and spam, have little effect on a filtering
> decision (Bayesian algorithms can vary). The decisions depend much
> more on the presence of extreme tokens which (statistically) only
> occur in spam, or only occur in ham (that's what I mean by extreme
> here). It's very hard for spammers to discover which tokens are
> extreme for any given individual.
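
To illustrate the point about common versus extreme tokens, here is a minimal sketch assuming a Graham-style naive Bayes combination of per-token spam probabilities. The function name `combine` and all token probabilities are made up for illustration; real filters (SpamAssassin, dbacl, and others) combine evidence differently in detail.

```python
from math import prod  # Python 3.8+

def combine(probs):
    """Combine per-token spam probabilities, naive Bayes style (assumed scheme)."""
    spam = prod(probs)                  # evidence for spam
    ham = prod(1.0 - p for p in probs)  # evidence for ham
    return spam / (spam + ham)

# Hypothetical values: footer tokens seen equally often in ham and spam,
# plus two "extreme" tokens seen almost only in spam.
footer = [0.5] * 20
extreme = [0.99, 0.98]

without_footer = combine(extreme)
with_footer = combine(extreme + footer)

# The neutral footer tokens cancel out of the ratio, so both scores match.
print(without_footer, with_footer)
```

Under these assumptions the 20 footer tokens leave the score unchanged; only the two extreme tokens drive the decision.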

Yes, but with all the zillions of variations, the spammers are IMHO also
attacking the size of the databases.
I wasn't too good at statistics, but IMHO it gets harder for statistical
filters as the set of data the decisions are based on grows, even more so
if the tokens in a tested message include a lot of "no decision" tokens
and only a few decision-making ones. Those few still help to classify,
but instead of 70:30 decisions we get a *lot* more decisions close to
50.1:49.9, which has the effect of growing false positive rates.
So I think the main drawback is not that more spam is coming through
but that the false positive rate is growing.
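
The concern about near-50:50 decisions can be sketched as follows. This is only an illustration, assuming the filter sums per-token log-odds; the name `spam_score` and every probability below are hypothetical, not taken from any real filter.

```python
import math

def spam_score(probs):
    """Sum per-token log-odds and map the total back to a probability."""
    log_odds = sum(math.log(p / (1.0 - p)) for p in probs)
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical messages: one with two decision-making tokens,
# one where almost every token is a "no decision" token.
clear = [0.95, 0.90, 0.5, 0.5]
diluted = [0.51, 0.5, 0.5, 0.5]

print(spam_score(clear))    # far from 0.5, a confident decision
print(spam_score(diluted))  # about 0.51, barely on the spam side

# A tiny shift in the one weak token flips the diluted decision.
nudged = [0.49, 0.5, 0.5, 0.5]
print(spam_score(nudged) < 0.5)
```

With so little margin, small estimation errors in the token statistics are enough to push a legitimate message over the threshold, which is how false positives would grow in this scenario.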

        \Maex

-- 
SpaceNet AG            | Joseph-Dollinger-Bogen 14 | Fon: +49 (89) 32356-0
Research & Development |       D-80807 Muenchen    | Fax: +49 (89) 32356-299
"The security, stability and reliability of a computer system is reciprocally
 proportional to the amount of vacuity between the ears of the admin"

_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg

