At 2:34 PM -0500 4/1/03, bukys(_at_)cs(_dot_)rochester(_dot_)edu wrote:
> Need HAM:
> After we settle on definitions, the main missing ingredient is a good
> HAM corpus attached to similarly-sampled SPAM. Multi-language ham is
> especially needed (I know SpamAssassin team has issued a call for it,
> don't know if it will arrive.)

Just a point. This process works for testing content classification
systems. It's lousy for header analysis. In the first place,
archives almost always elide the recipient's address--which we need
to examine the email's path through the network. And secondly, the
older the message, the more inaccurate any of the IP address
information is.
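
To make the two problems concrete, here is a rough sketch (not anyone's
actual filter) of what header analysis needs from a corpus message. It
uses only Python's standard email module. The function name
header_evidence, the regular expressions, and the sample message with
its example.org addresses are all made up for illustration, and real
Received lines vary far more than these patterns allow for.

```python
import re
from datetime import datetime, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical patterns: real Received headers are much less regular.
IP_RE = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")
FOR_RE = re.compile(r"\bfor\s+<?([^\s<>;]+@[^\s<>;]+)>?", re.IGNORECASE)


def header_evidence(raw, now=None):
    """Report what a stored message still offers for header analysis."""
    msg = message_from_string(raw)
    hop_ips = []      # relay IPs, in the order the headers appear
    recipients = []   # envelope recipients named in "for <addr>" clauses
    for value in msg.get_all("Received") or []:
        flat = " ".join(value.split())  # undo header folding
        hop_ips.extend(IP_RE.findall(flat))
        match = FOR_RE.search(flat)
        if match:
            recipients.append(match.group(1))

    # Age matters because the IP data was only accurate near send time.
    age_days = None
    date_value = msg.get("Date")
    if date_value:
        try:
            sent = parsedate_to_datetime(date_value)
        except (TypeError, ValueError):
            sent = None
        if sent is not None:
            if sent.tzinfo is None:
                sent = sent.replace(tzinfo=timezone.utc)  # assume UTC
            now = now or datetime.now(timezone.utc)
            age_days = (now - sent).days

    return {
        "hop_ips": hop_ips,
        "recipients": recipients,
        "recipient_elided": not recipients,
        "age_days": age_days,
    }


# Made-up sample using documentation-only addresses.
SAMPLE = """\
Received: from mail.example.net (mail.example.net [192.0.2.10])
\tby mx.example.org with ESMTP id abc123
\tfor <user@example.org>; Tue, 1 Apr 2003 14:30:00 -0500
Received: from client.example.com (unknown [198.51.100.7])
\tby mail.example.net with SMTP; Tue, 1 Apr 2003 14:29:50 -0500
Date: Tue, 1 Apr 2003 14:29:45 -0500
From: sender@example.com
Subject: test

body
"""

if __name__ == "__main__":
    print(header_evidence(SAMPLE))
```

A message pulled from a public archive will usually come back with
recipient_elided set, and a large age_days is the hint that the hop
addresses may no longer belong to whoever held them when the mail was
sent.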
--
Kee Hinckley
http://www.messagefire.com/ Junk-Free Email Filtering
http://commons.somewhere.com/buzz/ Writings on Technology and Society
I'm not sure which upsets me more: that people are so unwilling to accept
responsibility for their own actions, or that they are so eager to regulate
everyone else's.
_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg