On 05/12/2008 02:11, "Chris Lewis" <clewis(_at_)nortel(_dot_)com> wrote:
> I could replace $prod with $traffic in all the denominators, but you
> can't get content or complaint hits on the traps. So that just dilutes
> the score for very high trap hits.
>
> Remember, I'm not going for pedantic purity. I'm going for a function
> that has lots of head room and catches the worst offenders.

Yep, and that's all anyone can do, given the ever-changing nature of spam.
Other reputation systems I've worked on or learned about also mix in other
factors, such as:
- "not spam" votes (which aren't weighted 1:1 with spam votes)
- reporter reputation (not all users' votes get the same weight)
- past behavior of the IP (change thresholds based on recent behavior)
- type of mailstream (different thresholds for Gmail vs. Sears)
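
As a rough illustration of how those four factors might fit together, here is a minimal Python sketch. It is not taken from any particular reputation system: the names (Report, complaint_score, is_blocked), the 0.5 weight on "not spam" votes, the mailstream labels and every threshold value are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Report:
    is_spam: bool           # True for a "spam" vote, False for a "not spam" vote
    reporter_weight: float  # hypothetical reporter reputation, 0.0 to 1.0


# Hypothetical per-mailstream thresholds; the values are arbitrary.
THRESHOLDS = {"freemail": 0.30, "retail": 0.10}

# Assumption: a "not spam" vote offsets only half of a "spam" vote.
NOT_SPAM_WEIGHT = 0.5


def complaint_score(reports, messages, not_spam_weight=NOT_SPAM_WEIGHT):
    """Net weighted complaints per delivered message, never below zero."""
    if messages <= 0:
        return 0.0
    spam = sum(r.reporter_weight for r in reports if r.is_spam)
    not_spam = sum(r.reporter_weight for r in reports if not r.is_spam)
    net = max(0.0, spam - not_spam_weight * not_spam)
    return net / messages


def is_blocked(reports, messages, stream, recently_bad=False):
    """Compare the score against a threshold chosen by mailstream type.

    Assumption: an IP that misbehaved recently gets half the usual slack.
    """
    threshold = THRESHOLDS[stream]
    if recently_bad:
        threshold *= 0.5
    return complaint_score(reports, messages) > threshold
```

For example, two spam votes (reporter weights 1.0 and 0.5) and one "not spam" vote (weight 1.0) against 10 delivered messages give a net score of about 0.1. With these made-up numbers that is not enough to block a "retail" stream at its normal threshold, but it is once the IP has a recent bad history.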
Asrg mailing list