Dave CROCKER wrote:
> The fact that someone hit TIS means that -- independent of whether
> it is actually something that might be called spam -- the message
> irritated the user. Every single TIS click has a 100% confidence
> factor, in terms of being a valid count of being problematic to the
> end-user. (I'll quickly acknowledge that we have a derivative issue
> from the fact that a given user is inconsistent and what is irritating
> to me this morning might not be irritating this afternoon; but we have
> plenty to consider by just looking at first-order issues.)

As I think was already pointed out, 100% is not something that
humans do well in any context. That is, mistakes happen. And it's
probably worth pointing out again that TIS and actual spam are
only loosely correlated. People who complain about the
human evaluators (not you) being imperfect spam judges are
missing the larger point, which is what _they_ see that button as.

> In contrast, perhaps you take the 'good' number from something like
> "no one complained". There can be lots of reasons no one complained,
> only some of which are due to a message's being "good". So our
> confidence in the aggregate measure of goodness needs to be much less
> than 100%.
>
> So, how do we factor in differential confidence levels in the final
> assessment?

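
To make the question concrete, here is one rough way to picture it, not a
proposal for how anyone actually scores mail. Each TIS click counts at
full weight and each uncomplained-about delivery counts at some reduced
weight. The function name, the parameter names, and especially the 0.3
default are all made up for illustration; nothing in this thread says
what the right discount for "no one complained" is.

```python
def weighted_badness(tis_clicks, silent_deliveries,
                     tis_confidence=1.0, silence_confidence=0.3):
    """Share of the confidence-weighted evidence that says 'bad'.

    tis_confidence and silence_confidence are hypothetical knobs:
    1.0 means "take every count at face value", lower values discount it.
    """
    bad = tis_confidence * tis_clicks
    good = silence_confidence * silent_deliveries
    total = bad + good
    # With no evidence either way, report 0.0 rather than dividing by zero.
    return bad / total if total else 0.0


# 10 complaints against 90 silent deliveries.
naive = weighted_badness(10, 90, silence_confidence=1.0)
discounted = weighted_badness(10, 90)
```

With silence taken at face value the sender looks 10% bad; once silence
is discounted, the same counts look noticeably worse, which is the whole
effect of treating the two signals with different confidence.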
I've often wondered whether you could do something with measuring
how long things live in people's inboxes as a first-order
approximation of "value". For example: if the time between me
seeing a piece of mail and me hitting the delete button is
consistently short, it's probably an indication that I'm not
very interested in it. Maybe not enough to killfile the sender, but
it probably would yield some clues as to how *I* prioritize some
traffic over other traffic.
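
A minimal sketch of that idea, assuming a mail client could log when a
message was first seen and when it was deleted. The event tuples, the
function names, and the 5-second / 3-message cutoffs are all invented
here for illustration; no real client API is implied.

```python
from collections import defaultdict
from statistics import median


def median_seconds_to_delete(events):
    """Map each sender to the median delay between 'seen' and 'deleted'.

    events: iterable of (sender, seen_at, deleted_at), times in seconds.
    """
    delays = defaultdict(list)
    for sender, seen_at, deleted_at in events:
        delays[sender].append(deleted_at - seen_at)
    return {sender: median(d) for sender, d in delays.items()}


def low_interest_senders(events, threshold_seconds=5.0, min_messages=3):
    """Senders whose mail is consistently deleted almost immediately."""
    counts = defaultdict(int)
    for sender, _, _ in events:
        counts[sender] += 1
    medians = median_seconds_to_delete(events)
    return sorted(
        sender for sender, m in medians.items()
        # Require a few messages so one quick delete proves nothing.
        if counts[sender] >= min_messages and m <= threshold_seconds
    )


# Hypothetical log: (sender, seen_at, deleted_at)
log = [
    ("newsletter@example.com", 0, 2),
    ("newsletter@example.com", 100, 103),
    ("newsletter@example.com", 200, 201),
    ("colleague@example.com", 0, 120),
    ("colleague@example.com", 500, 800),
    ("colleague@example.com", 900, 945),
]
quick_deletes = low_interest_senders(log)
```

The median is used rather than the mean so that one message left sitting
over a weekend does not hide an otherwise consistent pattern of quick
deletes.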
As you allude to, this is clearly a dynamic system too: my interest
in some topics is situational, and clearly changes over time. It
is also context-based: even actors that I rarely read may be
contributing to a subject that I'm very interested in, etc.
What this really points to, IMO, is that the entire way that mail
-- and the many other emerging or established media -- is presented,
prioritized, alerted, etc. is pretty well borked. Spam is just one
small -- but important -- part of that problem. But even if spam were
solved through a divine act tomorrow, it would not address the
ever-increasing fire hose that we're expected to drink from.
Mike
_______________________________________________
Asrg mailing list
Asrg(_at_)irtf(_dot_)org
https://www.irtf.org/mailman/listinfo/asrg