spf-discuss

Re: Turning raw data into useful stats

2005-06-28 08:19:41
At 10:15 PM 6/27/2005 -0700, Greg Connor wrote:

I am in possession of a large amount of raw data, and I would like to turn it into something useful.

My goals are:

1. Gather some statistics about how spam is currently being handled.
2. Evaluate whether using SPF would help. I would like to start using SPF to reject incoming email, but first I need to show management that we have a reasonable idea of what will happen, and we have identified forwarding sources that should be whitelisted.
3. Provide real, useful data back to other interested parties regarding how well SPF works (or might work, if applied to our incoming mail).

Here is the scenario. My company receives about 3.5M email transactions per day. The majority of these are blocked by RBL and other methods, and only about 7% are allowed past the first mailer (roughly 200K/day). But I have other data that suggests the real, non-abusive email is closer to 20K/day, so I would really like to get our current 7% number down to less than 1%. Not an easy task.
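For readers who want to check the figures above, here is a minimal sketch of the arithmetic. The function name and dictionary keys are made up for illustration; the only inputs are the three numbers quoted in the post (3.5M/day total, 7% passed, about 20K/day legitimate).

```python
def daily_volumes(total, pass_percent, legit):
    """Derive summary figures from the three numbers quoted in the post.

    total        -- transactions per day seen by the edge mailers
    pass_percent -- whole-number percentage allowed past the first mailer
    legit        -- estimated real, non-abusive messages per day
    """
    passed = total * pass_percent // 100   # integer math avoids float noise
    return {
        "passed": passed,                        # allowed past the first mailer
        "blocked": total - passed,               # stopped by RBL and other methods
        "unwanted_passed": passed - legit,       # passed but probably not legitimate
        "legit_share_pct": legit * 100 / total,  # legitimate mail as % of all traffic
    }


stats = daily_volumes(3_500_000, 7, 20_000)
for key, value in stats.items():
    print(key, value)
```

Taken literally, 7% of 3.5M is 245K/day, which the post rounds to roughly 200K. Legitimate mail at 20K/day is about 0.57% of total traffic, so the "less than 1%" target is consistent with passing little beyond the legitimate mail.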

The edge mailers are not smart enough to process SPF yet. (Actually, an SPF switch exists, but their implementation is known to have some problems and can't be adjusted, whitelisted, etc. This is an appliance box.) Most important, their implementation of SPF doesn't allow for logging only; the only choice is to reject.

This raises a serious question: if many domains use these "appliance boxes" as their border MTAs, how can we expect *any* IP authentication method to work? Are we expecting these appliances to be replaced by general-purpose MTAs? I assume there is no chance of modifying their proprietary software.

...

1. What sort of data would be most useful? For privacy reasons I cannot release the raw data showing who is emailing whom, but whatever calculations I can perform on the raw data to get summary numbers, I want to report if I can.

I would like to see a rough estimate of the "market share" of various MTAs (sendmail, postfix, qmail, exim, exchange, etc.). That should give us a good idea of how many MTAs need to be updated in a rollout, if you want to cover X% of the population.
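As a rough sketch of how such a market-share table could be used once it exists: sort the MTAs by share and add them up until the target X% is reached. The function name is hypothetical, and the shares below are placeholders with generic names, not measurements of any real MTA.

```python
def mtas_needed_for_coverage(shares, target_pct):
    """Pick MTAs, largest share first, until the cumulative share reaches target_pct.

    shares     -- mapping of MTA name to market share in whole percent
    target_pct -- the X% of the population the rollout should cover
    Returns (list of chosen MTA names, cumulative share actually covered).
    """
    chosen = []
    covered = 0
    ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    for name, share in ranked:
        if covered >= target_pct:
            break
        chosen.append(name)
        covered += share
    return chosen, covered


# Placeholder shares for illustration only; real numbers would come from the survey.
example_shares = {"mta_a": 40, "mta_b": 25, "mta_c": 15, "mta_d": 12, "mta_e": 8}
chosen, covered = mtas_needed_for_coverage(example_shares, 75)
print(chosen, covered)
```

With these placeholder numbers, covering 75% of the population means updating the three largest MTAs. If the shares do not add up to the target, the function simply returns every MTA along with whatever coverage was reached.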

--
Dave
************************************************************     *
* David MacQuigg, PhD     email: david_macquigg at yahoo.com     *  *
* IC Design Engineer            phone:  USA 520-721-4583      *  *  *
* Analog Design Methodologies                                 *  *  *
*                                 9320 East Mikelyn Lane       * * *
* VRS Consulting, P.C.            Tucson, Arizona 85710          *
************************************************************     *
