ietf-asrg

RE: [Asrg] 7. Best Practices - DNSBLs - Article

2003-08-13 08:22:36


-----Original Message-----
From: Brad Knowles [mailto:brad(_dot_)knowles(_at_)skynet(_dot_)be] 


      I am curious -- is there a reason why you tested with a much 
larger spam archive than your ham archive?


In general, it is difficult to gather a large public ham archive. You
personally have a large ham archive, but chances are that you're not willing
to post it on a public FTP site.

This is a common problem. The question is how to perform the analysis
under such conditions. There are a few routes:

1) Continue to try to persuade people to provide their ham. You end up
with only a subset of the available ham, because people hand-pick what they
submit. However, with a wide enough campaign, you may be able to collect
enough of it to be useful.
2) Provide the analysis tools to the users to run directly on their
mailboxes, eliminating the need to collect their ham.
3) Create tools that sufficiently anonymize the ham so that people are
comfortable submitting it. There is similar work being done in other areas
of networking. One of the difficulties here is anonymizing the data while
preserving the relevant relationships within the data.
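
To make the difficulty in route 3 concrete, here is a minimal sketch of one
possible approach: replacing each email address with a keyed hash, so that the
same address (and the same domain) always maps to the same pseudonym. This is
an illustration only, not an existing tool. The names SECRET_KEY, pseudonym,
and anonymize_text are hypothetical, and the sketch handles addresses only,
not names or other identifying text in the message body.

```python
import hashlib
import hmac
import re

# Hypothetical secret, known only to the person submitting the ham.
# A keyed hash (HMAC) is used so pseudonyms cannot be reversed by
# simply hashing a list of guessed addresses.
SECRET_KEY = b"replace-with-a-private-random-key"

# Deliberately simple address pattern; real-world addresses are messier.
ADDRESS_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pseudonym(value: str, key: bytes = SECRET_KEY) -> str:
    """Map a string to a short, stable pseudonym (case-insensitive)."""
    digest = hmac.new(key, value.lower().encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


def anonymize_address(match: re.Match) -> str:
    """Replace one address, keeping 'same user' and 'same domain' links."""
    address = match.group(0)
    _, _, domain = address.partition("@")
    # The user part is derived from the full address, the domain part from
    # the domain alone, so two users at one domain still share a domain.
    return f"u{pseudonym(address)}@d{pseudonym(domain)}.invalid"


def anonymize_text(text: str) -> str:
    """Replace every address found in a header block or message body."""
    return ADDRESS_RE.sub(anonymize_address, text)


if __name__ == "__main__":
    sample = "From: alice@example.com\nTo: bob@example.com\n"
    print(anonymize_text(sample))
```

Because the mapping is deterministic for a given key, relationships such as
"these two messages came from the same sender" or "these recipients share a
domain" survive, while the addresses themselves do not appear in the output.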

Each of these paths is worth pursuing. Anyone out there currently looking
into these or interested in doing so?

_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg