On Dec 4, 2008, at 8:31 AM, Dave CROCKER wrote:
Chris Lewis wrote:
Do we block an IP on one TIS hit? No. We compute good/bad ratios and
have heuristics on when it's high enough to do something about.
The "bad" number is affirmative. People hit TIS. As a measure, the
bad number therefore has a 100% confidence level of accuracy (as
long as we are careful about defining badness.)
But where do you get the 'good' number from and is it really equally
If the basis for obtaining each number is conceptually different --
i.e., if the two different numbers really do warrant differential
faith in their accuracy, then how do you balance between them?
In case I'm not being clear:
The fact that someone hit TIS means that -- independent of
whether it is actually something that might be called spam -- the
message irritated the user. Every single TIS click has a 100%
confidence factor, in terms of being a valid count of being
problematic to the end-user. (I'll quickly acknowledge that we have
a derivative issue from the fact that a given user is inconsistent
and what is irritating to me this morning might not be irritating
this afternoon; but we have plenty to consider by just looking at
A common approach is to select lots of messages, then hit the TIS
button. The recipient may not have read, or even seen, any given mail
in that group.
In contrast, perhaps you take the 'good' number from something
like "no one complained". There can be lots of reasons no one
complained, only some of which are due to a message's being "good".
So our confidence in the aggregate measure of goodness needs to be
much less than 100%.
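
To make the question concrete, here is a minimal sketch of one way to discount the 'good' count before forming a ratio. Everything in it is an assumption for illustration: the function names, the confidence weights, the threshold, and the minimum-report floor are hypothetical and do not describe what any ISP actually runs.

```python
def weighted_bad_ratio(bad_count, good_count,
                       bad_confidence=1.0, good_confidence=0.5):
    """Share of confidence-weighted evidence that is 'bad'.

    bad_confidence and good_confidence are hypothetical weights in [0, 1]:
    a TIS click is taken at face value, while "no one complained" is
    discounted because silence has many causes besides the mail being good.
    """
    weighted_bad = bad_count * bad_confidence
    weighted_good = good_count * good_confidence
    total = weighted_bad + weighted_good
    if total == 0:
        return 0.0  # no evidence either way
    return weighted_bad / total


def should_act(bad_count, good_count, threshold=0.2, min_reports=10):
    """Hypothetical heuristic: never act on a single hit, only on a
    sufficiently high weighted ratio backed by enough reports."""
    if bad_count < min_reports:
        return False
    return weighted_bad_ratio(bad_count, good_count) >= threshold
```

With these made-up weights, 10 TIS hits against 90 uncomplained-about messages gives 10 / (10 + 45), or about 0.18, rather than the unweighted 0.10. The point of the sketch is only that lowering confidence in the 'good' number raises the effective bad ratio.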
If an email is moved from the junk folder to the inbox that's usually
treated as a significant good. There are UI approaches in place that
discourage using the junk folder as just another mailbox in order to
encourage that behaviour.
There are also a number of other measures in use, which I'm not going
to go into on a public mailing list.
So, how do we factor in differential confidence levels in the final
We don't. The ISPs running the particular system do so, in an adaptive
manner, in order to optimize the comfort of their customers.