On Mon, 26 Feb 2007 Dan_Mitton(_at_)Notes(_dot_)YMP(_dot_)GOV wrote:
> How are you generating your reputation numbers?

With pygossip, which implements the GOSSiP protocol. It tracks the last N
(currently 1024) messages from each ID (domain-spec:qualifier) and records
each one as ham, spam, or NULL.
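
As a rough illustration of that bookkeeping (not pygossip's actual storage code), a bounded window of verdicts can be kept in a deque. The class name Observations, its record() method, and the HAM/SPAM constants are hypothetical; only the counter names bcnt, hcnt and ncnt come from the reputation code, and reading bcnt as "total observations in the window" is an assumption based on how that code uses it.

```python
from collections import deque

HAM, SPAM = "ham", "spam"   # None stands for the NULL status

class Observations:
    """Hypothetical sliding window over the last maxobs message verdicts."""

    def __init__(self, maxobs=1024):
        self.maxobs = maxobs
        # A deque with maxlen drops the oldest verdict automatically.
        self.window = deque(maxlen=maxobs)

    def record(self, status):
        if status not in (HAM, SPAM, None):
            raise ValueError("status must be HAM, SPAM or None")
        self.window.append(status)

    @property
    def bcnt(self):
        # Assumed meaning: number of observations currently held.
        return len(self.window)

    @property
    def hcnt(self):
        return sum(1 for s in self.window if s == HAM)

    @property
    def ncnt(self):
        return sum(1 for s in self.window if s is None)
```

Recounting on every access keeps the sketch short; a real implementation would more likely keep running counters.
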
Here is the reputation score code:

    def reputation(self):
        """Compute reputation score."""
        n = self.bcnt
        if not n: return 0.0
        #N = float(self.maxobs)
        N = float(n)
        k = 2
        ham = self.hcnt
        spam = n - ham - self.ncnt
        log.info("ham: %d, spam: %d"%(ham, spam))
        ph = ham / N
        ps = spam / N
        log.debug("P(h) = %f P(s) = %f"%(ph, ps))
        num = math.exp(k * (ph - ps))
        denom = 1 + math.exp(k * (ph - ps))
        return 200 * ((num / denom) - 0.5)

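Algebraically, num / denom is the logistic function of k * (ph - ps), so the score equals 100 * tanh(k * (ph - ps) / 2). With k = 2 that is 100 * tanh(ph - ps), which stays within about +/-76.16 rather than +/-100. A standalone sketch to check this follows; the free function reputation_score and its arguments are illustrative and not part of pygossip.

```python
import math

def reputation_score(ham, spam, null=0, k=2):
    """Same formula as reputation(), taking the counts as arguments."""
    n = ham + spam + null
    if not n:
        return 0.0
    x = k * (ham - spam) / float(n)              # k * (ph - ps)
    # e**x / (1 + e**x) rewritten as 1 / (1 + e**-x); equals 100 * tanh(x / 2)
    return 200 * (1 / (1 + math.exp(-x)) - 0.5)

print(reputation_score(10, 0))   # all ham: 100 * tanh(1), about 76.16
print(reputation_score(0, 10))   # all spam: about -76.16
print(reputation_score(5, 5))    # balanced: 0.0
```

NULL observations count toward N, so they pull the score toward zero without favouring either side.
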
When gossiping with peers about the reputation of an ID, peer scores
are aggregated as follows:

    def aggregate(agg,offset=0):
        """Aggregate reputation and confidence scores.
        >>> [round(x,1) for x in aggregate([(-76.159416,0.219053),(0,0)])]
        [-76.1, 0.2]
        """
        n = len(agg)
        if n < 1: return (0.0,0.0)
        if n == 1: return agg[0]
        wavg,wcfi,wvar = weighted_stats(agg,offset)
        if wvar <= 0: # only one non-zero cfi
            return weighted_average([(rep,cfi) for rep,cfi in agg if cfi > 0])
        stddev = math.sqrt(wvar * n / (n - 1)) # sample standard deviation
        # remove outliers (more than 3 * stddev from mean) and return means
        return weighted_average([(rep,cfi) for rep,cfi in agg
            if abs(rep - wavg) <= 3*stddev],offset)

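The helpers weighted_stats and weighted_average are not shown in this post. The sketch below uses simplified stand-ins with the same names to illustrate the outlier-rejection step only: the offset argument is left out because its meaning is not given here, returning the mean confidence as the second value is a guess, and drop_outliers is a hypothetical name. It is not expected to reproduce the doctest above.

```python
import math

def weighted_stats(agg):
    """Return (mean, total weight, variance) of rep, weighted by cfi."""
    wsum = sum(cfi for _, cfi in agg)
    if wsum <= 0:
        return 0.0, 0.0, 0.0
    wavg = sum(rep * cfi for rep, cfi in agg) / wsum
    wvar = sum(cfi * (rep - wavg) ** 2 for rep, cfi in agg) / wsum
    return wavg, wsum, wvar

def weighted_average(agg):
    """Return (cfi-weighted mean rep, mean cfi); the second value is a guess."""
    wavg, wsum, _ = weighted_stats(agg)
    return wavg, (wsum / len(agg) if agg else 0.0)

def drop_outliers(agg):
    """Keep peers whose rep lies within 3 sample standard deviations."""
    n = len(agg)
    wavg, _, wvar = weighted_stats(agg)
    if n < 2 or wvar <= 0:
        return list(agg)
    stddev = math.sqrt(wvar * n / (n - 1))   # same n/(n-1) correction as above
    return [(rep, cfi) for rep, cfi in agg if abs(rep - wavg) <= 3 * stddev]

# Eleven peers agree on 50, one reports -70, all with equal confidence.
peers = [(50.0, 1.0)] * 11 + [(-70.0, 1.0)]
print(weighted_average(peers))                  # mean pulled toward the outlier
print(weighted_average(drop_outliers(peers)))   # outlier removed first
```

With equal confidences, no single score can lie more than (n - 1) / sqrt(n) sample standard deviations from the mean, so under these stand-ins the 3 * stddev test cannot remove anything until at least 11 peers report.
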
--
Stuart D. Gathman <stuart(_at_)bmsi(_dot_)com>
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flammis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.