spf-discuss

Re: Distributed reputation system; GOSSIP

2004-06-23 16:20:17
On Thu, Jun 24, 2004 at 12:09:01AM +0100, Shevek wrote:

The notes at http://sufficiently-advanced.net/reputation-system.html are 
good, but at least the first three criteria for a good host:
  # relatively low volume
  # low total recipients per mail
  # relatively small mail size
are highly contentious.

In fairness, I don't view those as valid anymore.  As stated on the main
page, those three documents are more than a year old, and the thinking's
evolved quite a bit since then.  I'd also meant those to be examples of
what one might have defined as good criteria; I'd originally meant for
the individual admin to be able to set their own criteria for assessing
identities.  That was also when I was thinking that the main reputation
would be assigned to the MX for the domain.


In response to freeside's comment: I agree, with the caveat that the
computation (which must be performed by every MTA over the raw data) is
not expensive. These computations _are_ frequently expensive (consider
sa-learn), which suggests that some of the computation, at least, should
be performed by the reputation service and maintained as persistent
derived data.

If you are to run your own GOSSIP server for a cluster of MTAs (not
generally true for smaller users, but certainly for organisations), then
that server can handle much of the computation and feed raw figures to the
MTA. This permits per-organisation customisation of the computation while 
reducing MTA load. I have to wonder how many organisations are really 
going to care, though. Configuring reputation servers isn't their core 
business.

True, but by the same token, configuring spam filters isn't their core
business either, yet many find themselves doing exactly that.  It's why
many businesses have dedicated IT staff.
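
As a rough illustration of the split discussed above (not anything
GOSSiP specifies), here is a minimal Python sketch in which a local
reputation server folds each raw report into a stored, derived score, so
that an MTA lookup is a plain dictionary read.  The names
ReputationServer, Tally, report, lookup and default_score are all
hypothetical, and the smoothed good/bad ratio is only a stand-in for
whatever computation an organisation would actually choose.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Tally:
    """Raw counts of reports received about one identity."""
    good: int = 0
    bad: int = 0


def default_score(t: Tally) -> float:
    # Placeholder scoring rule (an assumption, not from the thread):
    # smoothed fraction of good reports, 0.5 for an unknown identity.
    return (t.good + 1) / (t.good + t.bad + 2)


@dataclass
class ReputationServer:
    # Each organisation can plug in its own scoring function.
    score_fn: Callable[[Tally], float] = default_score
    tallies: Dict[str, Tally] = field(default_factory=dict)
    scores: Dict[str, float] = field(default_factory=dict)

    def report(self, identity: str, is_spam: bool) -> None:
        """Record one raw observation and refresh the derived score."""
        t = self.tallies.setdefault(identity, Tally())
        if is_spam:
            t.bad += 1
        else:
            t.good += 1
        # The computation happens here, once per report, not per lookup.
        self.scores[identity] = self.score_fn(t)

    def lookup(self, identity: str) -> float:
        """Cheap read path for the MTA: no computation over raw data."""
        return self.scores.get(identity, self.score_fn(Tally()))


if __name__ == "__main__":
    server = ReputationServer()
    for spam in (False, False, False, True):
        server.report("example.org", spam)
    print(server.lookup("example.org"))
    print(server.lookup("unknown.example"))
```

Swapping score_fn is where per-organisation customisation would go; the
MTA never needs to see the raw tallies.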


It may also not be to the advantage of the reputation service to expose
data which might be used for gaming that service. Compare this to the
search engine ranking algorithms, which are frequently gamed, even though
there is perhaps less value from gaming them. This is a hard problem.

Agreed.  I was toying with (and still am) the idea of having
certificate-based connections among peers, such that you can't just
leech/feed data from a GOSSiP server without first dealing with a 
certificate exchange.  Though ultimately, I'd imagine both types of
peers existing:  those that authenticate their peerwise connections, and
those that don't.  As one might expect, the quality of information from
the authenticated peers will be higher than that from the
unauthenticated peers.
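
A minimal sketch of the two kinds of peer, using Python's standard ssl
module.  This is one assumed way to get a "certificate exchange" (TLS
with client certificates required), not a description of any actual
GOSSiP protocol; the function name make_peer_context and its parameters
are hypothetical.

```python
import ssl
from typing import Optional


def make_peer_context(authenticated: bool,
                      cert_file: Optional[str] = None,
                      key_file: Optional[str] = None,
                      ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a server-side TLS context for one class of peer."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    if cert_file:
        # The server's own certificate; needed before accepting real
        # connections, but omitted here so the sketch runs without files.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if authenticated:
        # Peers must present a certificate that chains to a trusted CA,
        # otherwise the handshake fails before any data is exchanged.
        ctx.verify_mode = ssl.CERT_REQUIRED
        if ca_file:
            ctx.load_verify_locations(cafile=ca_file)
    else:
        # Open peers: the channel is encrypted but the peer is anonymous.
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


if __name__ == "__main__":
    print(make_peer_context(authenticated=True).verify_mode)
    print(make_peer_context(authenticated=False).verify_mode)
```

A server could then listen with one context per class of peer and treat
data arriving over the authenticated one as the more trustworthy.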


-- 
Mark C. Langston                                    Sr. Unix SysAdmin
mark(_at_)bitshift(_dot_)org                                       
mark(_at_)seti(_dot_)org
Systems & Network Admin                                SETI Institute
http://bitshift.org                               http://www.seti.org