spf-discuss

Re: How to use SPF to reject spam

2005-04-06 07:06:12
Commerco WebMaster wrote:
Mr. Hociung,

Very nice work - a fairly long but entirely worthy read.

Thank you.

At some point it would be interesting to form an aggregated list of scores from various individual reputation scoring data, in a way similar to the way the DShield.Org system works aggregating data on IP problems.

Absolutely. It will be very interesting to compare the various databases, and especially to analyze the mismatched stats (i.e., why does "service A" think that "opportunities.com" is a good guy when "service B" thinks otherwise?).

As regards your remark "the main differentiator of these services will be who they use as rating agents. I'd be perfectly happy with a service that uses cisco.com's feedback, but I don't care for one where Dick and Harry can have a say in." - I have no qualms with any data source, as long as the data source is disclosed and weighted by appropriate metrics.

If a source is unreliable, a data aggregator should adjust that source's individual metrics to reflect this and bring its net influence on the aggregated trust score down. Done properly, Dick and Harry could (and should) have a say if they are legitimate, but would have a near-zero say if they are not. This is certainly not to say that I don't trust Cisco (I worked with one of their senior VPs and the wife of another before there ever was a Cisco; they are great professional people and we love their products around here), but I think that any trust system should be open to all parties (not just the biggest ones), weighting each the same initially and letting the data from each source determine the weighting it is given in future.
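The weighting scheme described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Source` class, the `aggregate` and `update_weight` helpers, the 0.0 (ham) to 1.0 (spam) score scale, and the idea that a report can later be checked against some verified outcome are all hypothetical and not part of any existing service.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """A reporting agent. Every source starts with the same weight."""
    name: str
    weight: float = 1.0


def aggregate(reports, sources):
    """Weighted mean of per-source spam scores (0.0 = ham, 1.0 = spam).

    `reports` maps a source name to the score that source gave a domain.
    Returns None when no reporting source carries any weight.
    """
    total = sum(sources[name].weight for name in reports)
    if total == 0:
        return None
    weighted = sum(sources[name].weight * score for name, score in reports.items())
    return weighted / total


def update_weight(source, reported, verified, rate=0.5):
    """Move a source's weight toward its agreement with a verified outcome.

    Assumption: `verified` is a later-confirmed score for the same domain.
    A source that keeps being wrong decays toward zero weight; a source
    that keeps being right stays near 1.0.
    """
    error = abs(reported - verified)  # 0.0 (exact) .. 1.0 (opposite)
    source.weight = (1 - rate) * source.weight + rate * (1 - error)
```

With three equally weighted sources reporting 0.9, 0.9 and 0.0, the aggregate starts at their plain mean. If the third source is repeatedly shown to be wrong, its weight shrinks and the aggregate drifts toward the other two, which is the "near-zero say" behaviour described above.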

Well, I thought that having a short list of trusted agents would be appropriate for three reasons:

1. It is the easiest way (and therefore preferred when possible).

2. Agents with a well-known, reputable brand name are a worthwhile feature for the service. If a service lists small shops, potential clients are likely to think "shady" and prefer another service with an obviously more serious list.

3. It will be desirable for the list of agents to be very stable over time. If it changes on a weekly basis (as it would when smaller players are used), it will become questionable. Also, when a small player is removed from the list, the database can be considered tainted, on the assumption that the small player was removed for falsifying info. (Technically you can avoid this by keeping separate databases for each reporting agent and only publishing the aggregate results to the world, but the PR issues may be a lot more difficult to overcome.)
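The technical workaround mentioned in point 3 (separate databases per reporting agent, with only the aggregate published) might look something like the sketch below. The `ReputationStore` class and its method names are made up for illustration, and the agent and domain names in the usage note are placeholders.

```python
from collections import defaultdict


class ReputationStore:
    """Keeps ham/spam counts separately for each reporting agent."""

    def __init__(self):
        # agent -> domain -> [ham_count, spam_count]
        self._by_agent = defaultdict(lambda: defaultdict(lambda: [0, 0]))

    def report(self, agent, domain, ham, spam):
        """Add one agent's counts for a sending domain."""
        counts = self._by_agent[agent][domain]
        counts[0] += ham
        counts[1] += spam

    def drop_agent(self, agent):
        """Remove an agent and all of its data, leaving the others intact."""
        self._by_agent.pop(agent, None)

    def published(self):
        """Aggregate spam ratio per domain; per-agent detail is not exposed."""
        totals = defaultdict(lambda: [0, 0])
        for domains in self._by_agent.values():
            for domain, (ham, spam) in domains.items():
                totals[domain][0] += ham
                totals[domain][1] += spam
        return {
            domain: spam / (ham + spam)
            for domain, (ham, spam) in totals.items()
            if ham + spam > 0
        }
```

Because the aggregate is recomputed from the remaining agents, calling `drop_agent("smallshop")` removes that agent's influence completely rather than leaving a tainted total behind.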

Also, I think you want the feedback of a mail operator that handles tons of email, so that it has a significant sample. A small shop does not have the opportunity to gather one. For instance, suppose that in a company of 10, one employee signs up for a small joke mailing list run by someone at Hotmail. Later he decides that the jokes are too offensive and clicks the 'this is spam' button. Eventually he will find all the jokes offensive and filter the Hotmail stuff altogether. The problem is that suddenly this outfit's opinion of Hotmail is 100% spam, purely because of an insignificant sample size. At Cisco, on the other hand, there are probably at least 3000 employees with Hotmail friends. If 300 of those think of their Hotmail friends as idiots, the figure that Cisco will report is 90% ham, 10% spam, a much more significant report given the much larger sample size.
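One way to put a number on "significant sampling" is a confidence interval around the reported spam share. The sketch below uses the Wilson score interval, which is my choice for illustration and not something proposed in this thread; the function name `wilson_interval` and the 95% level (`z=1.96`) are assumptions. The counts in the usage note come from the example above: one spam report out of one sender at the small shop, and 300 out of 3000 at the large operator, which is a 10% spam share.

```python
import math


def wilson_interval(spam, total, z=1.96):
    """Wilson score interval for the true spam share, given observed counts.

    Returns (low, high). With no observations nothing is known, so the
    interval covers the whole range from 0.0 to 1.0.
    """
    if total == 0:
        return (0.0, 1.0)
    p = spam / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


small_shop = wilson_interval(spam=1, total=1)          # "100% spam" from one sample
large_operator = wilson_interval(spam=300, total=3000)  # 10% spam from 3000 samples
```

The small shop's interval runs from roughly 0.21 up to 1.0, so its "100% spam" verdict says very little, while the large operator's interval stays within about a percentage point of 10%.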

Also note two important side-effects:

1. If Cisco were used as a trusted reporting agent, all existing email lists that professional spammers use would be instantly cleaned to remove Cisco destinations. This is a great incentive for a company like Cisco to participate, as it will save them lots of money that they currently spend on spam-related infrastructure. They will still have to maintain some, but the volume of spam they have to handle will be noticeably smaller.

2. Due to the effect above, the reputation database will automatically be skewed, as it will no longer contain stats on new spammer domain names.

Ideally, #2 can be avoided if the service requires the reporting agent (Cisco in this case) to operate a honeypot and include its results in the reports. In light of side effect #1, this means "spend a little to save a lot". The names of the honeypotted results need not be published, but their operation should not be delegated except to a mail operator with a reputation equal to Cisco's (or higher, if that is possible).

Radu.

