spf-discuss

Re: Turning raw data into useful stats

2005-06-27 22:32:10

On Mon, 27 Jun 2005, Greg Connor wrote:

Here is the scenario. My company receives about 3.5M email transactions per day. The majority of these are blocked by RBLs and other methods, and only about 7% are allowed past the first mailer (roughly 200K/day). But I have other data suggesting that the real, non-abusive email is closer to 20K/day, so I would really like to get our current 7% number down to less than 1%. Not an easy task.

1% is kind of low even in the current email situation, with > 50% of email being spam (the average share of good email seems to be between 5% and 30% nowadays, depending on who you ask). Are you sure your numbers are right, and if so, do you have any idea why your amounts of spam are so high?

My question to you fine folks is:

1. What sort of data would be most useful? For privacy reasons I cannot release the raw data showing who is emailing whom, but whatever calculations I can perform on the raw data to get summary numbers, I want to report if I can.

General statistics per source IP block for email that did not pass SPF might be useful, if that is not considered to be a privacy issue.
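
As a rough illustration of such per-block statistics, here is a minimal Python sketch. The /24 block size, the record layout (source IP, SPF result) and the function name are assumptions made for the example, not anything specified in this thread.

```python
import ipaddress
from collections import Counter

def non_pass_by_block(records, prefix_len=24):
    """Count messages whose SPF result was not 'pass', keyed by source IPv4 block.

    records: iterable of (source_ip, spf_result) pairs (hypothetical layout).
    prefix_len: block size; /24 is an arbitrary choice for this sketch.
    """
    counts = Counter()
    for ip, spf_result in records:
        if spf_result == "pass":
            continue
        # strict=False masks the host bits, so 192.0.2.10/24 becomes 192.0.2.0/24
        block = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[str(block)] += 1
    return counts

# Made-up sample data using documentation address ranges
sample = [
    ("192.0.2.10", "hardfail"),
    ("192.0.2.77", "softfail"),
    ("198.51.100.5", "pass"),
    ("203.0.113.9", "neutral"),
]

for block, n in non_pass_by_block(sample).most_common():
    print(block, n)
```

Only block-level counts leave the function, so no individual sender or recipient appears in the output.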

Other than that, try to collect data on what percentage of the emails that
passed your other filters would have passed SPF and what percentage would not.
Do the same for emails that were rejected by filters for other reasons (a list
of domains rejected by other filters but that passed SPF would be interesting,
but I suspect you may not be able to release that).

For all statistics, if possible, break the results down not just into
pass/fail but into pass/neutral/softfail/hardfail.
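
One way to produce that breakdown is a small cross-tabulation of filter outcome against SPF result. The sketch below is only an illustration: the record layout, the outcome names "accepted" and "rejected", and the function name are assumptions. The labels follow the naming used in this message ("hardfail" is what the SPF result set calls "fail").

```python
from collections import Counter

# Labels as named in this message
SPF_LABELS = ("pass", "neutral", "softfail", "hardfail")

def spf_breakdown(records):
    """Return {filter_outcome: {spf_label: percent}} for the four labels above.

    records: iterable of (filter_outcome, spf_result) pairs (hypothetical layout).
    Results with any other label still count toward each outcome's total.
    """
    counts = {}
    for outcome, spf_result in records:
        counts.setdefault(outcome, Counter())[spf_result] += 1
    table = {}
    for outcome, c in counts.items():
        total = sum(c.values())
        table[outcome] = {
            label: round(100.0 * c[label] / total, 1) for label in SPF_LABELS
        }
    return table

# Made-up sample: what the other filters did, and what SPF would have said
sample = [
    ("accepted", "pass"), ("accepted", "pass"),
    ("accepted", "neutral"), ("accepted", "softfail"),
    ("rejected", "hardfail"), ("rejected", "hardfail"),
    ("rejected", "pass"), ("rejected", "softfail"),
]

for outcome, row in spf_breakdown(sample).items():
    print(outcome, row)
```

The "rejected" row with a non-zero "pass" share is the interesting case mentioned above: mail that other filters refused even though SPF passed.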

Also, if you can run tests on HELO and report the percentage of incoming mail
where the client mail server's HELO name could be verified, that would be good too.
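
The message does not say what "verified" should mean. The sketch below assumes one possible reading, a forward lookup: the HELO name counts as verified if it resolves to the connecting IP address. The function names and record layout are hypothetical, and the resolver is passed in so the statistic can be computed from logged data without live DNS queries.

```python
import socket

def resolve(name):
    """Return the set of addresses a hostname resolves to (empty set on failure)."""
    try:
        return {info[4][0] for info in socket.getaddrinfo(name, None)}
    except (socket.gaierror, UnicodeError):
        return set()

def helo_verified_percent(connections, resolver=resolve):
    """Percentage of connections whose HELO name resolves to the client IP.

    connections: list of (helo_name, client_ip) pairs (hypothetical layout).
    """
    if not connections:
        return 0.0
    verified = sum(1 for helo, ip in connections if ip in resolver(helo))
    return round(100.0 * verified / len(connections), 1)

# Made-up data with a stand-in resolver, so the example runs offline
fake_dns = {"mail.example.com": {"192.0.2.1"}}
sample = [
    ("mail.example.com", "192.0.2.1"),   # name resolves to the client IP
    ("mail.example.com", "203.0.113.7"), # name resolves elsewhere
    ("bogus.invalid", "192.0.2.2"),      # name does not resolve
    ("localhost.localdomain", "198.51.100.4"),
]
print(helo_verified_percent(sample, resolver=lambda n: fake_dns.get(n, set())))
```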

--
William Leibzon
Elan Networks
william(_at_)elan(_dot_)net

