On Sat, 30 Sep 2006 22:47:03 +0200, Iljitsch van Beijnum
<iljitsch(_at_)muada(_dot_)com> wrote:
Since my earlier comments to the author were apparently ignored:
"Ignored" is not the right word; I just don't happen to agree with you...
I replied to you, both privately and on the NANOG list. I said:
There are lots of good solutions if you're willing to change or
introduce protocols. That takes a lot longer, both procedurally
and technically. This scheme is simple and single-ended, and can be
implemented without co-ordination.
We should indeed try for a better solution. Until then, I'm
suggesting this -- I'm aiming at Informational -- to tide us
over. The need for some such solution was quite clear during
Bonica's talk in San Jose.
...
The problems start when BOTH sides implement the new mechanism. In
that case, new keys will remain unused for some time, and then become
active at some hard-to-determine time in the future. (Neither side
knows for sure when the other side will switch to the new key.) This
means that there will be a problem in the case where the new key
isn't present on both sides, for instance because one side wasn't
configured with the new key in a timely fashion despite out-of-band
agreement to do so, or because the keys configured on both sides
don't match.
In this case, one side will start using its new key at some point in
time. If the other side doesn't have the same key, it can't validate
the TCP segment so the segment is dropped. In theory, it's possible
to recover from this condition by adding logic that observes the TCP
state, but I don't see how this can be made fully reliable,
especially given the wide variety of TCP implementations and other
environmental factors such as BGP (in)activity, packet loss and
reduced response times because of high CPU loads.
There is explicit text in 2.2 discussing the need for integration with the
TCP timeout mechanism.
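
To make the failure mode discussed above concrete, here is a minimal sketch of a receiver that checks an incoming segment against every key it has configured. It is an illustration under stated assumptions, not the draft's specification: the function names are hypothetical, and the digest is a simplified stand-in (MD5 over segment bytes plus key) rather than the full RFC 2385 computation over the pseudo-header, TCP header, and data.

```python
import hashlib
import hmac
from typing import Optional, Sequence


def segment_digest(segment: bytes, key: bytes) -> bytes:
    # Simplified stand-in for the TCP-MD5 digest (assumption): the real
    # computation also covers the pseudo-header and the TCP header.
    return hashlib.md5(segment + key).digest()


def validate(segment: bytes, digest: bytes, keys: Sequence[bytes]) -> Optional[int]:
    """Return the index of the first configured key that validates the
    segment, or None if no key matches (the segment would be dropped)."""
    for index, key in enumerate(keys):
        if hmac.compare_digest(segment_digest(segment, key), digest):
            return index
    return None


old_key, new_key = b"old-secret", b"new-secret"
segment = b"example segment"
signed_with_new = segment_digest(segment, new_key)

# A receiver holding both keys accepts the segment and learns that the
# peer has moved to the new key.
both_keys_result = validate(segment, signed_with_new, [old_key, new_key])

# A receiver that was never given the new key finds no match; the segment
# is dropped, and nothing tells the operator why.
missing_key_result = validate(segment, signed_with_new, [old_key])
```

The second call shows the case raised in the objection: the mismatch only surfaces once the sender starts using the new key, which may be long after the configuration change.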
So in a good number of cases, TCP segments remain unvalidated for too
long and the BGP session breaks. The really bad part is that this
happens at some unpredictable interval AFTER operator action, so
operator error doesn't create any usable feedback. Today, feedback is
immediate and conclusive. So the new situation is vastly inferior
from an operational robustness perspective.
Text I added after our exchange of messages in June discusses the need for
a MIB entry, precisely to address this issue.
This problem can easily be fixed by adding a BGP capability code and
a new BGP message. The capability code would indicate support for the
new message, and the new message would be used by each BGP speaker to
communicate the availability of a new key, along with a hash over the
key so the BGP speakers know at which point the other side has the
new key available, and that the new key is indeed the same as the
locally configured one. These types of additions to the BGP protocol
are well-understood and shouldn't lead to significant additional
implementation difficulty.
Again, for that to have any utility it requires double-ended changes.
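
For illustration only, the comparison step of the proposed announcement could look roughly like the sketch below. Everything here is an assumption: the proposal does not name a hash function or a message layout, so SHA-256 and the function names are hypothetical choices, and the BGP capability negotiation and message encoding are left out entirely.

```python
import hashlib
import hmac


def key_fingerprint(key: bytes) -> str:
    # Hash over the key, as the proposal describes. SHA-256 is an assumed
    # choice; the proposal does not specify an algorithm.
    return hashlib.sha256(key).hexdigest()


def peer_has_same_key(local_key: bytes, announced_fingerprint: str) -> bool:
    # Compare the peer's announced hash with the hash of the locally
    # configured key, using a constant-time comparison.
    return hmac.compare_digest(key_fingerprint(local_key), announced_fingerprint)


# The peer announces a hash of its new key; the local speaker checks it
# against its own configuration before either side switches.
announced = key_fingerprint(b"new-secret")
match = peer_has_same_key(b"new-secret", announced)
mismatch = peer_has_same_key(b"typo-secret", announced)
```

One caveat on the design: a bare hash of a low-entropy password sent in the clear would allow offline guessing, so a real encoding would likely need something stronger than this sketch.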
(What follows isn't specific to the draft under consideration and
shouldn't be taken as input on how to change this particular draft.)
As long as I'm taking up bandwidth, let me address a more fundamental
problem with this draft and several others addressing the same or
similar issues. (It would be nice, by the way, to have a single venue
where all of this is discussed; in Montreal the discussion moved from
working group to working group and was therefore extremely hard to
follow for everyone who didn't make an express effort to do so.)
The real problem is agreeing to a key with people from another AS.
It's not uncommon for network operations staff for two ASes to reside
in different timezones, to speak different languages and to have
wildly dissimilar operational mores. This makes seemingly trivial
tasks such as finding a person who can agree to a key and finding a
secure channel to communicate the key very hard. The particular issue
that this draft addresses, which is agreeing on a time when the keys
are changed, is indeed also an issue but in my experience, it's not
the most problematic one in practice. The reason for this is that in
practice, keys are rarely changed after they've been set up
initially. I estimate that I've done some 200 inter-AS-session-years
worth of BGP operational management, and I can't remember ever having
been asked to change an existing BGP TCP MD5 password. The assumption
that these keys are so sensitive that they must be changed regularly
simply doesn't hold in practice.
But suppose that the keys must indeed be changed often. The problems
unrelated to the actual time of the change remain unaddressed here.
This is also true of the other proposals that I'm aware of, which
address other problems such as the weakness of the MD5 hash and the
way in which it's used here. In order for network operations to be
able to change the actual session keys often, it's necessary to derive
the actual session keys (and preferably, the keying information
configured on a router) without the need to agree to any specific
keying information out-of-band. This probably involves some kind of
public key encryption, where a session is not configured with an
actual secret key, but with a fingerprint for a certificate held by
the remote router, or, better yet, the remote AS.
I'm afraid I do believe you about the rarity of password changes -- that's
precisely what I'm trying to fix! The need for such changes is discussed
in RFC 3562.
I agree with you completely about the need for better key management. At
the request of one of the Security ADs, I added Section 4 to this
document, stating this explicitly and giving some guidance on how to
proceed. I'm also working with people from the Transport Area on this
precise question.
--Steven M. Bellovin, http://www.cs.columbia.edu/~smb