ietf-asrg

Re: [Asrg] Comments on draft-church-dnsbl-harmful-01.txt

2006-03-30 12:03:23
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Michael Thomas wrote:

> Does there not exist something like, oh say, BLreports that
> judges you on false positive/false negative, coverage, timeliness,
> etc?

In a general sense, large enough to be statistically significant with
the sort of accuracy required for traditional research rigor?

No.  Unfortunately.  There are too many variables (e.g., what actually
_is_ an FP in a given environment), and correlating what is _REALLY_ HAM
vs. SPAM in a large enough environment to be significant is impractical.

["Spam" and "Ham" corpora more than a day or two old are entirely
useless, no matter how big they are.  Yesterday's post-facto analysis
(e.g., training on spam) says _nothing_ about how well a particular
technique is going to work in real-time on tomorrow's spam.

Vendor: "In tests performed yesterday, our uber-fantastic spam filter
caught 99% of spam [from a data set it was trained on several years
ago]".

Me: "Something must be broken if it couldn't catch 100% of the spam
you've already trained it on".]

The best you can do is make approximations that rely on various assumptions.

For example, in our environment with approximately a million emails per
day, approximately 75% of all spam is caught by one DNSBL, with a 0.01%
"would be FP if we didn't whitelist" rate.

But that carries with it various assumptions.  E.g., what the volume of
spam really is (which we get by inference from another complex metric),
and that our FP handling process catches most of the FPs.
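
To make the dependence on those assumptions concrete, here is a rough
Python sketch of the arithmetic.  Only the one million emails per day,
the 75% catch rate and the 0.01% rate come from the figures above.  The
spam fraction is not stated here, so the values tried below are purely
hypothetical, and the sketch assumes the 0.01% rate is measured against
all mail, which may not match how it is actually computed.

```python
def estimate_dnsbl_impact(total_mail, spam_fraction, catch_rate, fp_rate):
    """Rough daily counts for a single DNSBL.

    spam_fraction is an ASSUMED input (the real value is inferred from
    another metric).  fp_rate is ASSUMED to be a fraction of all mail.
    """
    spam = round(total_mail * spam_fraction)
    spam_caught = round(spam * catch_rate)
    return {
        "spam": spam,
        "spam_caught": spam_caught,
        "spam_missed": spam - spam_caught,
        # Messages that would be false positives without whitelisting.
        "would_be_fps": round(total_mail * fp_rate),
    }


if __name__ == "__main__":
    # Hypothetical spam fractions; none of these is a measured value.
    for assumed_fraction in (0.5, 0.7, 0.9):
        counts = estimate_dnsbl_impact(1_000_000, assumed_fraction, 0.75, 0.0001)
        print(assumed_fraction, counts)
```

The would-be FP count does not move with the assumed spam fraction, but
the caught and missed counts do, so any "percent of spam caught" figure
is only as good as the spam volume estimate behind it.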
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3-nr1 (Windows 2000)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQCVAwUBRCwnFJ3FmCyJjHfhAQL8bgP9FWU1f8Ppwj3k4U8zll2LYcMrEiNVlfQT
1+BWdG2N0OaTEd58BhaBPve2fDVRL0D9SmdIlZ7havdyBKf2hXWE47uGk2GowySs
xrb7TT4p0h4m2ox1UQH6a5xkK46nrfVrfQItol3PVZPdov0QZiaeJDHS8545/XYn
8Fw4zsXx2Z0=
=Qnf7
-----END PGP SIGNATURE-----

_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg
