Re: The Value of Reputation (was Re: [ietf-dkim] Re: WG Review: Domain Keys Identified Mail (dkim))

Nathaniel Borenstein <nsb(_at_)guppylake(_dot_)com> wrote:

On Dec 24, 2005, at 4:09 PM, Douglas Otis wrote:

Reputation remains the only solution able to abate the bulk of abuse.


... I think most of us pretty much agree about the critical role of
reputation.


   I've noticed a lot of what I call "lip service" about the critical
role of reputation. To say this differently, many folks seem to think
you can choose a "reputation system" almost at random, and it's sure
to improve your signal/noise ratio, "unless you've chosen the wrong one"
(which, I suppose, is a tautology...).

   But, in my view, we have no basis to choose the "right" one unless
we have a good understanding of what it measures and a workable idea
of how to "end run" when it falsely rejects good messages.

I see the cycle as going like this:  We need at least one
standardized, moderately-useful system for weakly authenticating
the sources of messages.  Once we have that, we have the minimal
data that a reputation system will require to be able to start
doing something at least mildly useful.


   A lot depends on what we mean by "weakly authenticating".
   
   People who take security seriously always call the authentication
inherent in an established TCP connection "weak authentication"; but
in fact it represents a pretty-darn-good correlation. Thus blacklists
based on IP address alone have an excellent correlation to sending
SMTP clients which have, at some time, sent abusive email. (Their
problems lie elsewhere.)
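
   As a minimal sketch of what an IP-based blacklist check amounts to:
the only identifier it needs is the peer address of the established TCP
connection. The listed ranges below are hypothetical (they are
documentation-only address ranges), and the names LISTED_NETWORKS and
is_listed are illustrative, not taken from any real blacklist.

```python
import ipaddress

# Hypothetical listing: documentation-only ranges standing in for
# addresses that have, at some time, sent abusive email.
LISTED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_listed(peer_ip: str) -> bool:
    """Return True if the SMTP client's peer address is in a listed range."""
    addr = ipaddress.ip_address(peer_ip)
    # Membership is False when the address and network versions differ.
    return any(addr in net for net in LISTED_NETWORKS)


if __name__ == "__main__":
    for ip in ("192.0.2.25", "198.51.100.7", "2001:db8::1"):
        print(ip, is_listed(ip))
```

The check says nothing about who sent the message, only that it arrived
from an address with a history.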

   OTOH we have schemes running which don't claim correlation much
above 60%, and offer no assurance the correlation will remain that
high. These, IMHO, don't qualify as "useful authentication", but
it's hard to argue they fail to be "weak authentication".

Once we have *that*, we will have (in our reputation systems) a
built in "market" for additional systems for (perhaps less weakly)
authenticating the desirability (not necessarily solely due to the
source) of incoming messages.


   I don't agree with Nat here.

   As a practical matter, _many_ folks will prefer sorting through
100 spams to losing one good email. I see darn little "market" for
anything which can't get it 99% right. What I think we're seeing is
folks who design a system for their own use, achieve an accuracy
of sorting sufficient for their needs, and offer it to others
because they see their marginal cost (per new customer) as
essentially zero.
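
   The "100 spams versus one good email" preference can be read as a
cost ratio, and a rough calculation shows why 99% is a plausible bar.
The sketch below is only an illustration: the 50% spam share, the
catch rates, and the function name cost_per_message are all
assumptions, not measurements.

```python
def cost_per_message(spam_share, catch_rate, false_reject_rate,
                     lost_mail_cost=100.0, spam_cost=1.0):
    """Expected cost per incoming message.

    spam_share:        assumed fraction of incoming mail that is spam
    catch_rate:        fraction of spam the filter rejects
    false_reject_rate: fraction of good mail the filter wrongly rejects
    lost_mail_cost:    cost of one lost good message, in units of
                       "one spam the recipient has to sort through"
    """
    missed_spam = spam_share * (1.0 - catch_rate)
    lost_good = (1.0 - spam_share) * false_reject_rate
    return missed_spam * spam_cost + lost_good * lost_mail_cost


# Assumed scenario: half of all incoming mail is spam.
no_filter = cost_per_message(0.5, catch_rate=0.0, false_reject_rate=0.0)
sloppy = cost_per_message(0.5, catch_rate=0.9, false_reject_rate=0.01)
careful = cost_per_message(0.5, catch_rate=0.9, false_reject_rate=0.001)

if __name__ == "__main__":
    print("no filter:", no_filter)
    print("99% right on good mail:", sloppy)
    print("99.9% right on good mail:", careful)
```

Under these assumed numbers, a filter that catches 90% of spam but
wrongly rejects 1% of good mail costs the recipient slightly more than
running no filter at all; it pays off only once the false-reject rate
drops well below 1%.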

   Further, until the "customer" abandons an existing method, the
barrier to adopting an additional method is pretty close to 99%
correct identification of email which passed the existing method(s).
This is _not_ an encouraging situation for entrepreneurs.

To some extent, there's a chicken-and-egg problem with
authentication and reputation technologies.  My hope for DKIM
is that it will give us one good enough egg to produce a chicken,
which can then (in much the manner that Cain and Abel found their
wives, I guess) facilitate a whole new generation of authentication
technology eggs.


   I find the challenge of designing a reputation system based on
DKIM a bit overwhelming. DKIM offers assurance that a domain has
"taken part in the transmission of an email message" containing
certain headers (which the recipient probably never sees), but no
assurance that anything else hasn't been changed since then.
There's no assurance that the message isn't a replay attack, nor
is there assurance that the original hasn't been lost. This is _not_ 
an attractive base upon which to build reputation.

When reputation is applied against an "authorization" as an
identifier, innocent email-address domain owners will be
seriously harmed. Abusers will find acceptance methods for an  
authorization scheme.


   Doug is complaining about the difficulty of designing a useful
reputation system on such a base. I entirely agree with him there.

   But I wish it to be clear I am not complaining about reputation  
services being out-of-scope in the DKIM charter. I prefer it that   
way: otherwise I'd be facing a serious challenge trying to cobble
on a useful reputation heuristic, with really no hope of meeting
the charter deadlines.
   
   I'm really not complaining at all: I'm just trying to bring
some sense of reality to what we should expect of DKIM-based
reputation systems.

Yes, every one of these schemes will be flawed.  That is why we
need to understand
A) the role of "weak authentication" (weeding out some but not
   all of the bad guys at any point in time, and using multiple
   sources of information to judge the desirability of a message)


   Expressed this vaguely, there's nothing to "understand". We'd
need useful estimates of what fraction of "bad guys" will be weeded
out and what fraction of "good guys" are (wrongly) weeded out. I'm
not sure any useful estimates of that can be found.
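
   To make concrete which two estimates would be needed: given the
fraction of "bad guys" weeded out and the fraction of "good guys"
wrongly weeded out, plus an assumed spam share, one can work out what
the recipient actually experiences. Every number below is a
placeholder chosen for illustration; the point is that nothing useful
can be computed until real estimates replace them.

```python
def filter_outcome(spam_share, bad_weeded_out, good_weeded_out):
    """Summarize a filter from the two fractions discussed above.

    spam_share:      assumed fraction of incoming mail that is unwanted
    bad_weeded_out:  fraction of unwanted mail that gets rejected
    good_weeded_out: fraction of wanted mail that gets rejected
    Returns (share of rejected mail that was wanted,
             share of delivered mail that is still unwanted).
    """
    rejected_bad = spam_share * bad_weeded_out
    rejected_good = (1.0 - spam_share) * good_weeded_out
    delivered_bad = spam_share * (1.0 - bad_weeded_out)
    delivered_good = (1.0 - spam_share) * (1.0 - good_weeded_out)

    rejected = rejected_bad + rejected_good
    delivered = delivered_bad + delivered_good
    wanted_among_rejected = rejected_good / rejected if rejected else 0.0
    unwanted_among_delivered = delivered_bad / delivered if delivered else 0.0
    return wanted_among_rejected, unwanted_among_delivered


if __name__ == "__main__":
    # Placeholder inputs: 50% spam, 60% of it weeded out, 5% of good mail lost.
    lost, leaked = filter_outcome(0.5, bad_weeded_out=0.6, good_weeded_out=0.05)
    print(f"wanted mail among rejections: {lost:.1%}")
    print(f"unwanted mail among deliveries: {leaked:.1%}")
```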

   and
B) the need for a continually evolving set of (ever-stronger,
   we hope) mechanisms for proving that a message is desirable
   to the recipient.

 
   This sounds like a research project. As a research project, I 
heartily favor it.

--
John Leslie <john(_at_)jlc(_dot_)net>

_______________________________________________
Ietf mailing list
Ietf(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/ietf
