3) Proposals must have been tested on real world data
Like 2), there is *still* zero published data on how effective the PRA
algorithm is on real world data. I know that it would create a false
positive on the only email account that I have forwarded to my main
email address. I know that it will break on many mailing lists. I
don't know much more than that.
How about trying the information I was lent by the IMS Users mailing list?
There's over a year and a half of real world data on real and fake sender
domains, HELO identities, message sizes, IP addresses and so on. I'll bet
others can come up with similar logs.
Sure, there isn't any DOR/Submitter information or headers or whatever. Some
of this could be synthesized for the purposes of testing, like setting some
bogus rule like, "if [host] has [domain] in part of its name, treat it like
[something]." It would be enough to do initial testing before putting it on a
live server.
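
As a rough illustration of the kind of bogus rule meant above, here is a
minimal Python sketch. Everything in it is an assumption made up for testing:
the function names, the "pass"/"fail"/"unknown" labels, and the idea that each
log record can be reduced to a (HELO host, sender domain) pair. It is not the
PRA or MARID algorithm, only a stand-in that lets the logs be run through
something.

```python
def synthesize_result(helo_host: str, sender_domain: str) -> str:
    """Bogus stand-in rule for test runs only; NOT the PRA/MARID check.

    Reads "if [host] has [domain] in part of its name" as: the host is the
    domain itself or a name underneath it. That reading is an assumption.
    """
    host = helo_host.strip().lower().rstrip(".")
    domain = sender_domain.strip().lower().rstrip(".")
    if not host or not domain:
        # Missing data in the log line: no synthetic verdict possible.
        return "unknown"
    if host == domain or host.endswith("." + domain):
        return "pass"
    return "fail"


def tally(records):
    """Count synthetic verdicts over (helo_host, sender_domain) pairs."""
    counts = {}
    for helo_host, sender_domain in records:
        result = synthesize_result(helo_host, sender_domain)
        counts[result] = counts.get(result, 0) + 1
    return counts
```

Feeding the parsed log lines to tally() would give a first count of how often
such a rule would flag mail, which is the sort of number nobody has published
yet.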
I'll yank out the Access database and leave the raw plain text log in a
separate archive. I can provide separate domain and host information in a
different archive if it's difficult to separate it all out.
http://www.pan-am.ca/smtp.zip
In addition, the Win2K event sink shell I've asked to have coded up is being
worked on. The event sink itself will call an external library, which
actually implements MARID.
--
PGP key (0x0AFA039E):
<http://www.pan-am.ca/consulting(_at_)pan-am(_dot_)ca(_dot_)asc>
Sometimes it's hard to tell where the game ends and where reality bites,
er, begins. <http://vmyths.com/resource.cfm?id=50&page=1>