spf-discuss

Re: Distributed reputation system; GOSSIP

2004-06-23 15:36:33
On Wed, Jun 23, 2004 at 02:02:03PM -0400, Meng Weng Wong wrote:
On Tue, Jun 22, 2004 at 02:48:54PM -0700, Mark C. Langston wrote:
| Given the recent discussion of reputation systems, I've decided to
| finally start working on my distributed reputation system in earnest.
| 
| The working title for the project is: Gossip Optimized for
| Selective Spam Prevention (GOSSiP).
| 
| There's a website that details the current concepts at
| http://sufficiently-advanced.net 

Congrats and thanks for taking the initiative to work on this.

What does the GOSSIP response to the SMTP server look like?

Is it a single number, or does it contain richer
information?

As a client, when I query GOSSIP with the parameters

  (address=XXX(_at_)YYY, domain=YYY,
   context=(helo|ptr|return-path|pra|from|other))

I would prefer to get back from the GOSSIP black-box a set
of key/value pairs indicating

 - the total number of messages observed by GOSSIP,
 - the total number of spam complaints received by GOSSIP,
 - the age of the domain in question
 - the observation profile of the domain
 - overall "risk score" or "credit rating"

and a bunch of other stuff that may depend on the identity
context.


That's a good question.  Right now, the answer's open for debate.
Initially, I was imagining the response being a single number, but that
was mainly due to initial implementation constraints (i.e., I wanted to
get something done quickly).  There's no reason I can see (other than
complexity) that would prevent something along the lines of what you
describe.
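
To make the richer-response idea concrete, here is a rough sketch of what such a set of key/value pairs might look like on the client side. Nothing here is a defined GOSSiP format: the class name, every field name, and the complaint_rate helper are assumptions made up for illustration, loosely following the list of items quoted above.

```python
from dataclasses import dataclass, asdict
from typing import Optional

# Identity contexts taken from the query parameters quoted above.
CONTEXTS = {"helo", "ptr", "return-path", "pra", "from", "other"}


@dataclass
class GossipResponse:
    """Hypothetical response record; all field names are illustrative."""
    identity: str
    context: str
    messages_observed: int
    spam_complaints: int
    domain_age_days: Optional[int]  # None when the age is unknown
    risk_score: float               # the single-number answer, kept alongside the detail

    def __post_init__(self) -> None:
        if self.context not in CONTEXTS:
            raise ValueError(f"unknown identity context: {self.context!r}")

    def complaint_rate(self) -> float:
        """Complaints per observed message; 0.0 if nothing was observed."""
        if self.messages_observed == 0:
            return 0.0
        return self.spam_complaints / self.messages_observed


resp = GossipResponse(
    identity="user@example.org",
    context="return-path",
    messages_observed=200,
    spam_complaints=5,
    domain_age_days=None,
    risk_score=0.1,
)
pairs = asdict(resp)  # plain key/value pairs, e.g. for serialization
```

A client that only wants the single number can read risk_score and ignore the rest, so the richer form does not rule out the simpler one.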

My initial conceptualization had the number of messages and complaints
being a hidden component of the "risk score" as you call it, computed by
the standalone GOSSiP server, the idea being that you'd offload that sort
of computation to a system dedicated to doing it (unless you ran the 
GOSSiP server on your MX).  You'd consult multiple GOSSiP servers for
each identity, and you'd modulate their responses according to how much
you trust each of the other GOSSiP servers.

So, you've got two distinct trust relationships:

1)  An individual GOSSiP server's level of trust of a given identity,
and

2) An individual GOSSiP server's trust of another GOSSiP server.

It's the interplay of GOSSiP servers, and the credibility/trust
relationships between them (i.e., the "gossip" and how much you believe
it) that makes the approach unique and gives GOSSiP its power.  Rather
than trusting a single reputation server, you have your own estimation
of reputation, several others' estimations of the same identity's
reputation, and your level of trust in each of the other systems'
opinions.
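
As a minimal sketch of how those pieces might fit together, assume each server reports one score per identity and that "modulating by trust" means a trust-weighted average. That combining rule is my assumption for illustration only; the post does not say how responses are actually combined. The function name and parameters are likewise hypothetical.

```python
from typing import Iterable, Optional, Tuple


def combine_opinions(
    own_score: float,
    own_weight: float,
    peer_opinions: Iterable[Tuple[float, float]],
) -> Optional[float]:
    """Blend the local estimate with peers' estimates of one identity.

    peer_opinions holds (peer_score, trust_in_peer) pairs, where
    trust_in_peer is in [0, 1]. Assumed rule: trust-weighted average.
    Returns None when no opinion carries any weight.
    """
    weighted_sum = own_score * own_weight
    total_weight = own_weight
    for score, trust in peer_opinions:
        if not 0.0 <= trust <= 1.0:
            raise ValueError(f"trust must be in [0, 1], got {trust}")
        weighted_sum += score * trust
        total_weight += trust
    if total_weight == 0:
        return None
    return weighted_sum / total_weight


# Local estimate 0.2 at full weight; one half-trusted peer says 0.8,
# and one untrusted peer (trust 0.0) is effectively ignored.
combined = combine_opinions(0.2, 1.0, [(0.8, 0.5), (0.4, 0.0)])
```

The two trust relationships stay separate here: the scores are each server's trust in the identity, and the weights are the local server's trust in each peer.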


By the way, you asked an interesting question.  Would you mind if I
forwarded your post along to the GOSSiP mailing list? (see
https://secure.roadtoad.net/mailman/listinfo/gossip if you're interested
in subscribing)

Thanks!

-- 
Mark C. Langston                                    Sr. Unix SysAdmin
mark(_at_)bitshift(_dot_)org                 mark(_at_)seti(_dot_)org
Systems & Network Admin                                SETI Institute
http://bitshift.org                               http://www.seti.org