ietf-asrg

Re: [Asrg] Comments on draft-church-dnsbl-harmful-01.txt

2006-04-03 08:51:30

On Mar 30, 2006, at 2:27 PM, Daniel Feenberg wrote:

A better study of false positives would require a large corpus of known good mail for a diverse set of destinations, with connecting MTA IP addresses. One could query the DNSBLs for those IP addresses, and calculate the probability that a legitimate message would be blocked. But I haven't found a corpus of known good mail. One source would be email confirmations of mailing-list signups, if anyone would like to share that with me. The saved mail file of an individual isn't very representative even if it is large.

Looking only at mailing-list signups won't necessarily give a representative sample, because mailing lists tend to be clustered and may run on different servers from general email.

Tracking replies based on References headers and linking those to the record of the received mail should give a better overall picture of your users' good mail sources. You would of course need to filter out replies to abuse desks, vacation auto-responders and forwarding accounts. There may be other anomalies in your mail pattern, so some random sampling should be done to look for anything that might skew the results.
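
As a rough illustration of the reply tracking described above, here is a minimal Python sketch. The record types, the field names, and the "abuse" local-part filter are all hypothetical; a real log would need its own parsing, and the auto_generated flag stands in for however you detect vacation responders and forwarders.

```python
from dataclasses import dataclass, field

@dataclass
class Received:
    message_id: str   # Message-ID of a delivered message
    source_ip: str    # connecting MTA IP recorded at delivery time

@dataclass
class Sent:
    recipient: str                                   # envelope recipient of the reply
    references: list = field(default_factory=list)   # IDs from In-Reply-To / References
    auto_generated: bool = False                     # assumed flag: vacation reply or forward

def good_sources(received, sent, excluded_localparts=("abuse",)):
    """Return the source IPs of received messages that a user replied to."""
    ip_by_id = {m.message_id: m.source_ip for m in received}
    good = set()
    for s in sent:
        if s.auto_generated:
            continue  # skip vacation auto-responders and forwarding accounts
        local = s.recipient.split("@", 1)[0].lower()
        if local in excluded_localparts:
            continue  # skip replies sent to abuse desks
        for ref in s.references:
            ip = ip_by_id.get(ref)
            if ip is not None:
                good.add(ip)
    return good
```

Note that this counts every referenced message found in the received log, not only the direct parent of the reply; restricting it to In-Reply-To would be a stricter variant.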

The overlap between the good mail sources and the blacklisted sources is where most of the false positives are going to be. The size of this overlap relative to the set of good sources should be good enough for a first-order estimate of the false-positive potential.
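
The overlap estimate could be computed along these lines. This is only a sketch: the function name is made up, and it assumes you already have the good sources and the blacklisted sources as sets of IP address strings. It measures the fraction of good sources that are listed, not the fraction of good messages, so a weighting by message volume would be a separate step.

```python
def false_positive_estimate(good_ips, blacklisted_ips):
    """Return (overlap, fraction of good sources that are blacklisted)."""
    good = set(good_ips)
    overlap = good & set(blacklisted_ips)
    # Guard against an empty good set rather than dividing by zero.
    fraction = len(overlap) / len(good) if good else 0.0
    return overlap, fraction
```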

The set of mail sources that are not identified as good in the above tracking and not identified as bad in the blacklists would be something to investigate further.
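
To pull out that remaining set, one could partition all observed sources as below. Again a sketch with hypothetical names, assuming each input is a collection of IP address strings; the "unclassified" bucket is the one to investigate further.

```python
def partition_sources(all_ips, good_ips, blacklisted_ips):
    """Split observed sources by reply-tracking and blacklist status."""
    seen, good, bad = set(all_ips), set(good_ips), set(blacklisted_ips)
    return {
        "good_only": (seen & good) - bad,      # replied to, not listed
        "listed_only": (seen & bad) - good,    # listed, never replied to
        "overlap": seen & good & bad,          # likely false positives
        "unclassified": seen - good - bad,     # neither: investigate further
    }
```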



_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg