Re: Distributed reputation system; GOSSIP

On Wed, 23 Jun 2004, Meng Weng Wong wrote:

On Wed, Jun 23, 2004 at 03:36:33PM -0700, Mark C. Langston wrote:
| 
| My initial conceptualization had the number of messages and complaints
| being a hidden component of the "risk score" as you call it, computed by
| the standalone GOSSiP server, the idea being that you'd offload that sort
| of computation to a system dedicated to doing it (unless you ran the 
| GOSSiP server on your MX).  You'd consult multiple GOSSiP servers for
| each identity, and you'd modulate their responses according to how much
| you trust each of the other GOSSiP servers.

Hiding things inside a score leads to guessing games.
Where possible, please provide all the input data and let
each receiver make up their own mind.  The more data the
merrier.  Dumbing things down into a single variable is
useful, but so is providing the entire data set to allow
special case logic.


Surely the essence of the problem is how to generate a reputation value
from some raw data which is available to the reputation server.

My initial reaction was that GOSSIP is an interesting proposal for the
communication or data sharing problem, but does not address the essential
problem of reputation management in any significant way.

The notes at http://sufficiently-advanced.net/reputation-system.html are 
good, but at least the first three criteria for a good host:
  # relatively low volume
  # low total recipients per mail
  # relatively small mail size
are highly contentious. Nor is it made clear how these criteria should be
combined to arrive at a reputation value.
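
To make the gap concrete: one naive way to combine the three criteria would
be a weighted sum of per-criterion scores, as in the sketch below. Everything
in it (the limits, the weights, the names HostStats and naive_score) is my
own assumption for illustration; the notes page specifies none of it, and the
choice of weights is exactly where the contention lies.

```python
from dataclasses import dataclass


@dataclass
class HostStats:
    # Hypothetical per-host figures matching the three criteria above.
    msgs_per_day: float
    avg_recipients: float
    avg_size_bytes: float


# Assumed limits: a value at or beyond its limit scores 0.0 on that criterion.
LIMITS = {
    "msgs_per_day": 10000.0,
    "avg_recipients": 50.0,
    "avg_size_bytes": 1000000.0,
}

# Assumed weights; nothing in the notes justifies these particular numbers.
WEIGHTS = {
    "msgs_per_day": 0.5,
    "avg_recipients": 0.3,
    "avg_size_bytes": 0.2,
}


def _below(value, limit):
    """Return 1.0 at zero, falling linearly to 0.0 at `limit` and beyond."""
    return max(0.0, 1.0 - value / limit)


def naive_score(stats):
    """Weighted average of the per-criterion scores, in the range 0.0 to 1.0."""
    total = sum(WEIGHTS.values())
    weighted = sum(
        WEIGHTS[name] * _below(getattr(stats, name), LIMITS[name])
        for name in WEIGHTS
    )
    return weighted / total
```

A legitimate mailing list host sending large volumes would score badly here,
which is the sort of special case a single combined figure hides.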

In response to freeside's comment: I agree, provided that the computation
(which every MTA must perform over the raw data) is not expensive. Such
computations _are_ frequently expensive (consider sa-learn), which suggests
that at least some of the computation should be performed by the reputation
service and maintained as persistent derived data.
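
As a rough sketch of what "persistent derived data" might look like: the
service keeps running totals per identity, updates them as reports arrive,
and answers each query from the totals rather than rescanning raw logs. The
class name, table layout and the complaint-rate figure are my assumptions,
not part of the GOSSiP proposal. Note that it returns the raw counts
alongside the derived rate, so the receiver can still apply its own logic.

```python
import sqlite3


class DerivedStore:
    """Hypothetical store of per-identity totals kept by a reputation service."""

    def __init__(self, path=":memory:"):
        # Pass a file path instead of ":memory:" to keep totals across restarts.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS totals ("
            "identity TEXT PRIMARY KEY, "
            "messages INTEGER NOT NULL, "
            "complaints INTEGER NOT NULL)"
        )

    def record(self, identity, messages=0, complaints=0):
        """Add newly observed messages and complaints to the running totals."""
        self.db.execute(
            "INSERT OR IGNORE INTO totals VALUES (?, 0, 0)", (identity,)
        )
        self.db.execute(
            "UPDATE totals SET messages = messages + ?, "
            "complaints = complaints + ? WHERE identity = ?",
            (messages, complaints, identity),
        )
        self.db.commit()

    def lookup(self, identity):
        """Return raw counts plus a derived complaint rate, or None if unknown."""
        row = self.db.execute(
            "SELECT messages, complaints FROM totals WHERE identity = ?",
            (identity,),
        ).fetchone()
        if row is None:
            return None
        messages, complaints = row
        rate = complaints / messages if messages else 0.0
        return {
            "messages": messages,
            "complaints": complaints,
            "complaint_rate": rate,
        }
```

The per-query cost for the MTA is then a single keyed lookup, whatever the
volume of raw data behind it.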

If you run your own GOSSIP server for a cluster of MTAs (not generally the
case for smaller users, but certainly for organisations), then that server
can handle much of the computation and feed raw figures to the MTA. This
permits per-organisation customisation of the computation while reducing
MTA load. I have to wonder how many organisations are really going to care,
though. Configuring reputation servers isn't their core business.
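
For a sense of scale, the per-organisation part need not be large. The
trust weighting Mark describes above (consult several GOSSiP servers, then
modulate each response by how much you trust that server) could be as little
as a weighted average. This is only a sketch: I am assuming each server
returns a single score between 0.0 and 1.0, and the function name and the
weights are mine.

```python
def combine(responses, trust):
    """Trust-weighted average of the scores returned by several servers.

    responses: server name -> score for one identity (assumed 0.0 to 1.0)
    trust:     server name -> non-negative local trust weight
    Servers missing from `trust` are ignored. Returns None if no trusted
    server answered, so the caller can tell "no data" from a low score.
    """
    weighted = 0.0
    total = 0.0
    for server, score in responses.items():
        weight = trust.get(server, 0.0)
        weighted += weight * score
        total += weight
    if total <= 0.0:
        return None
    return weighted / total
```

For example, combine({"a": 1.0, "b": 0.0}, {"a": 3.0, "b": 1.0}) gives 0.75:
server "a" is trusted three times as much as "b", so its opinion dominates.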

It may also not be to the advantage of the reputation service to expose
data which might be used for gaming that service. Compare this to the
search engine ranking algorithms, which are frequently gamed, even though
there is perhaps less value from gaming them. This is a hard problem.

S.

-- 
Shevek                                    http://www.anarres.org/
SRS for the next generation               http://www.libsrs2.org/