[ietf-clear] more on no callbacks, please

From: John Levine
Sent: Monday, October 04, 2004 11:43 PM


<...>

We don't understand the scaling issues of any callback scheme, because
nobody's ever tried one on a large scale.  Having done my PhD thesis
on databases, I have a reasonably good idea what's involved in
building a database that has the very high update rate that a callback
database needs, and it's a hard problem, since update performance
scales much worse than linearly and is hard to parallelize.  The size
of each datum isn't important here, it's the number of updates and the
number of data sources (like the hundreds or thousands of outbound
MTAs that AOL or Yahoo have.)


I understand that there has been plenty of overstatement on both sides of
this, and you have a number of valid concerns that we will have to address.
The database issue is one that is very serious.  Though I think we had some
approaches that might have worked, I did _not_ do a PhD thesis on databases
so I wouldn't bet my life on it.  What we have done instead is to sidestep
this problem by dropping unique message IDs and instead including a SHA1
digest of the canonicalized message in the signature.  The message digest is
signed by an HMAC and the signature is verified by callback.  The recipient
defeats replay by verifying the digest in the signed return-path (or header,
if you prefer the data phase version).  Replay of a return-path with
anything but the original message will not validate because the digest won't
match, so the sender has no need to track the number of validations of any
particular return-path.

This style of signed return-path might look like:

S=HHHHHHHHHHHT.DDDDDDDDDDDd=local-part@domain-part

where

   HHHHHHHHHHH = HMAC-SHA1( T.DDDDDDDDDDDd=local-part@domain-part )

             T = timestamp

   DDDDDDDDDDD = SHA1 digest of canonicalized message

             d = callback descriptor determining callback method

To save space, d and T are only a few bits each, and HHHHHHHHHHHT and
DDDDDDDDDDDd are encoded as packed bit-strings.
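
As a rough illustration of the layout above, here is a minimal Python sketch,
not the actual SES encoding.  Everything concrete in it is an assumption made
for the example: base32 text instead of packed bit-strings, a two-hex-digit
timestamp, a one-character descriptor, the toy canonicalize() routine, and
all function names.

```python
import base64
import hashlib
import hmac

MAC_LEN = 32     # base32 length of a 20-byte HMAC-SHA1 (assumed encoding)
T_LEN = 2        # assumed: timestamp as two hex digits
DIGEST_LEN = 32  # base32 length of a 20-byte SHA1 digest

def b32(data: bytes) -> str:
    """Encode bytes as lowercase, unpadded base32 (stand-in for bit packing)."""
    return base64.b32encode(data).decode("ascii").rstrip("=").lower()

def canonicalize(message: bytes) -> bytes:
    """Toy canonicalization: CRLF line endings, no trailing blank lines."""
    lines = message.replace(b"\r\n", b"\n").split(b"\n")
    while lines and lines[-1] == b"":
        lines.pop()
    return b"\r\n".join(lines) + b"\r\n"

def sign_return_path(key, message, local_part, domain, timestamp, descriptor="d"):
    """Build S=<HMAC><T>.<digest><d>=local-part@domain-part."""
    digest = b32(hashlib.sha1(canonicalize(message)).digest())
    signed = f"{timestamp % 256:02x}.{digest}{descriptor}={local_part}@{domain}"
    mac = b32(hmac.new(key, signed.encode("ascii"), hashlib.sha1).digest())
    return f"S={mac}{signed}"

def parse(return_path):
    """Split a signed return-path into (mac, signed_part, digest)."""
    if not return_path.startswith("S="):
        raise ValueError("not a signed return-path")
    body = return_path[2:]
    mac, signed = body[:MAC_LEN], body[MAC_LEN:]
    if len(signed) <= T_LEN or signed[T_LEN] != ".":
        raise ValueError("malformed signed return-path")
    digest = signed[T_LEN + 1:T_LEN + 1 + DIGEST_LEN]
    return mac, signed, digest

def sender_verify(key, return_path):
    """Callback side: recompute the HMAC.  No per-message state is needed."""
    mac, signed, _ = parse(return_path)
    expected = b32(hmac.new(key, signed.encode("ascii"), hashlib.sha1).digest())
    return hmac.compare_digest(mac, expected)

def recipient_check(return_path, message):
    """Recipient side: the digest in the return-path must match the message."""
    _, _, digest = parse(return_path)
    actual = b32(hashlib.sha1(canonicalize(message)).digest())
    return hmac.compare_digest(digest, actual)
```

A return-path replayed with a different message still passes sender_verify(),
but fails recipient_check(), which is why the sender does not have to count
validations.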

The canonicalization details are in a header and we plan to use the same
canonicalization types as DK, since that has presumably been tested.  If
you're willing to give up before-data validation, you can shorten the
return-path and put the rest in a DK-like header where the SHA1 digest in
the header also covers the return-path.  This ties the signature in the
header to both the return-path and message, preventing replay.  The header
signature is then validated by callback, which validates both signatures if
the digest matches.  This header could be very similar to the DK signature
header and be attached to any desired identity:

MAIL FROM:<S=HHHHHHHHHHHT=local-part@domain-part>

SES-Signature: s=HHHHHHHHHHHHHHHHHH,d=DNS,D=DDDDDDDDDDDDDDDDDD,c=simple, ...
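
To make the header variant a little more concrete, here is a hedged Python
sketch of how the digest could tie the header to both the return-path and
the message.  The tag names follow the example above; the exact inputs to
the digest and the HMAC, the base32 encoding, and every function name are
assumptions for illustration only.

```python
import base64
import hashlib
import hmac

def b32(data: bytes) -> str:
    """Lowercase, unpadded base32 (assumed encoding for the example)."""
    return base64.b32encode(data).decode("ascii").rstrip("=").lower()

def short_return_path(key, local_part, domain, timestamp):
    """Build the shortened S=<HMAC><T>=local-part@domain-part."""
    signed = f"{timestamp % 256:02x}={local_part}@{domain}"
    mac = b32(hmac.new(key, signed.encode("ascii"), hashlib.sha1).digest())
    return f"S={mac}{signed}"

def combined_digest(return_path, canonical_message):
    """Assumed: SHA1 over the return-path followed by the canonical message."""
    data = return_path.encode("ascii") + b"\r\n" + canonical_message
    return b32(hashlib.sha1(data).digest())

def ses_signature_header(key, return_path, canonical_message, method="DNS"):
    """Build a DK-like header whose digest covers return-path and message."""
    digest = combined_digest(return_path, canonical_message)
    sig = b32(hmac.new(key, digest.encode("ascii"), hashlib.sha1).digest())
    return f"SES-Signature: s={sig},d={method},D={digest},c=simple"

def parse_tags(header):
    """Parse 'Name: a=1,b=2' into a dict of tag -> value."""
    _, _, value = header.partition(":")
    return dict(tag.strip().split("=", 1) for tag in value.split(","))

def recipient_digest_matches(header, return_path, canonical_message):
    """Recipient side: recompute the digest before bothering with a callback."""
    tags = parse_tags(header)
    expected = combined_digest(return_path, canonical_message)
    return hmac.compare_digest(tags["D"], expected)

def sender_validates(key, header):
    """Callback side: check the HMAC over the digest carried in the header."""
    tags = parse_tags(header)
    expected = b32(hmac.new(key, tags["D"].encode("ascii"), hashlib.sha1).digest())
    return hmac.compare_digest(tags["s"], expected)
```

In this sketch the recipient only issues the callback when
recipient_digest_matches() succeeds, so a header lifted onto a different
message or a different return-path is rejected locally.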

I know, I know, we need an I-D instead of all this hand-waving.  It's in the
works.

We also don't understand the attacks
that bad guys will try and the side effects that they'll cause.


What do we really know about the kind of attacks the bad guys will do on DK
or any of the other proposals?  None of them have been widely deployed so
they have not been real targets in the wild.  We're hypothesizing the
threats for all of them.


Maybe SES et al have solved all these problems, but maybe they're
going to run into the same problem that everyone else who's tried to
build such things has.  I would be surprised if other people with
database experience were any more sanguine about this problem.


I agree, and that's why we were happy to come up with this alternative that
has no need for such a database.


That's why C/R systems really truly need their own working group where
they can try their ideas out and get enough experience to make them
plausible.  I know that the proponents of SES think that they've
solved all the problems, but it's just not persuasive yet in view of
all of the history that says it's butting up against hard problems.


I am not as familiar with them as you, but I think most C/R systems are very
different from this.  The ones I've seen use TCP mechanisms and amount to
automated whitelisting systems.  Those are a real PITA.  I don't see what
SES has in common with those designs.


So go prove me wrong, but you're going to have to build stuff to do it.


I agree that the burden of proof is on us and it is reasonable to be skeptical.
We have built a number of systems with the old style signatures (unique ID
for every message) but none yet with the message digest.  That is being
built as we speak, and we would like to try out some attacks on it,
hopefully with some suggestions from you.

--

Seth Goodman