I'm rather reluctant to add real technical discussion to the issue of list
mismanagement.
On Tue, 27 Sep 2005, Bill Sommerfeld wrote:
On Tue, 2005-09-27 at 10:06, Robert Elz wrote:
Date: Mon, 26 Sep 2005 15:41:56 -0400 (EDT)
From: Dean Anderson <dean(_at_)av8(_dot_)com>
Message-ID: <Pine(_dot_)LNX(_dot_)4(_dot_)44(_dot_)0509261531270(_dot_)32513-100000(_at_)cirrus(_dot_)av8(_dot_)net>
| It is not DNSSEC that is broken.
I have not been following dnsop discussions, but from this summary, there
is nothing broken beyond your understanding of what is happening.
It's worse. The reasoning is broken on other points, as well.
In these arguments, RFC 1812 has been cited repeatedly as a
specification for load-splitting. By my reading, 1812 is extremely
vague about the topic, and does not require a specific spreading
algorithm.
Yes. It gives the implementor tremendous latitude. But plainly, it is
appropriate to do per-packet load balancing (as Cisco did), where successive
packets can be expected to take different paths.
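
To make the distinction concrete, here is a minimal sketch contrasting
per-packet selection with per-flow selection over a set of equal-cost paths.
It is an illustration only, not Cisco's implementation: the path names, the
packet fields, and both function names are hypothetical.

```python
import hashlib
from itertools import cycle

PATHS = ["path-a", "path-b", "path-c"]  # hypothetical equal-cost next hops

def per_packet_selector(paths):
    """Return a picker that moves to the next path for every packet (round-robin)."""
    rr = cycle(paths)
    return lambda packet: next(rr)

def per_flow_select(packet, paths):
    """Pick a path from a hash of the flow 5-tuple, so one flow stays on one path."""
    key = "|".join(str(packet[f]) for f in ("src", "dst", "proto", "sport", "dport"))
    digest = hashlib.sha256(key.encode()).digest()
    return paths[int.from_bytes(digest[:4], "big") % len(paths)]

# Three successive packets of the same flow (addresses are documentation ranges).
packet = {"src": "192.0.2.1", "dst": "198.51.100.53",
          "proto": 17, "sport": 40000, "dport": 53}

pick = per_packet_selector(PATHS)
per_packet_paths = [pick(packet) for _ in range(3)]               # a different path each time
per_flow_paths = [per_flow_select(packet, PATHS) for _ in range(3)]  # the same path each time
```

With per-packet selection the three packets leave on three different paths;
with per-flow hashing they all leave on one.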
Its strongest recommendation is that there be a way to turn
it off if it doesn't work for you, which should by itself be a clue that
load-spreading should be used with caution; it also cautions that
load-splitting was an area of active research at the time 1812 was
published.
And now there are implementations and users that use it.
But to make anycast work with TCP, or with large UDP datagrams and fragments,
one needs to guarantee that two successive packets (actually an entire session)
use exactly the same path. That requires no load balancing (or only very
coarse-grained load balancing). The prescription given in RFC1546 needs to be
changed:
RFC1546 page 5:
---------------------------------------------------
How UDP and TCP Use Anycasting
It is important to remember that anycasting is a stateless service.
An internetwork has no obligation to deliver two successive packets
sent to the same anycast address to the same host.
---------------------------------------------------
RFC1546 also gives a prescription for alterations to TCP so that TCP can work
with Anycast and with the condition on successive packets above. So far as I
know, no one has implemented this prescription in a TCP stack.
Moreover, load-splitting which results in the sort of flow-shredding
which would disrupt multi-packet anycast exchanges also causes
significant difficulties for unicast. To quote from rfc2991 section 2:
RFC2991 is an Informational RFC, and is wrong in some of its assertions. This was
discussed on the GROW list.
Variable Path MTU
Since each of the redundant paths may have a different MTU,
this means that the overall path MTU can change on a packet-
by-packet basis, negating the usefulness of path MTU discovery.
This is not a real problem. The MTU is reduced to the smallest MTU of any path.
If PMTUD is turned off (an option rarely used) the DF bit is also turned off and
so packets will be fragmented. While the smaller packet size might be
sub-optimal on the larger MTU paths, this is just a (tiny) performance
consideration.
It is not the case that the usefulness of path MTU is negated.
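
A rough sketch of that convergence, assuming per-packet round-robin over the
paths and a sender that lowers its estimate whenever a packet with DF set is
too large for the path it happens to take. The MTU values and the function
name are made up for illustration, and the sketch ignores the timer that lets
a sender periodically probe for a larger MTU.

```python
from itertools import cycle

def converge_pmtu(initial_mtu, path_mtus, packets):
    """Return the sender's path MTU estimate after sending `packets` packets
    that are spread round-robin over paths with the given MTUs."""
    estimate = initial_mtu
    paths = cycle(path_mtus)
    for _ in range(packets):
        path_mtu = next(paths)
        if estimate > path_mtu:
            # DF is set and the packet does not fit: the router drops it and
            # reports its MTU, so the sender lowers its estimate.
            estimate = path_mtu
    return estimate

# Hypothetical MTUs for three redundant paths.
final_estimate = converge_pmtu(1500, [1500, 1492, 1400], packets=6)
```

Once every path has been tried, the estimate sits at the smallest MTU (1400
here) and packets of that size fit on all of the paths.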
Variable Latencies
Since each of the redundant paths may have a different latency
involved, having packets take separate paths can cause packets
to always arrive out of order, increasing delivery latency and
buffering requirements.
Packet reordering causes TCP to believe that loss has taken
place when packets with higher sequence numbers arrive before
an earlier one. When three or more packets are received before
a "late" packet, TCP enters a mode called "fast-retransmit" [6]
which consumes extra bandwidth (which could potentially cause
more loss, decreasing throughput) as it attempts to
unnecessarily retransmit the delayed packet(s). Hence,
reordering can be detrimental to network performance.
RFC2991 also mis-states the TCP issue. RFC2581 describes the Fast retransmit
behavior as follows:
"The TCP sender SHOULD use the "fast retransmit" algorithm to detect
and repair loss, based on incoming duplicate ACKs. The fast
retransmit algorithm uses the arrival of 3 duplicate ACKs (4
identical ACKs without the arrival of any other intervening packets)
as an indication that a segment has been lost. After receiving 3
duplicate ACKs, TCP performs a retransmission of what appears to be
the missing segment, without waiting for the retransmission timer to
expire."
RFC2991 mis-states this as follows:
When three or more packets are received before
a "late" packet, TCP enters a mode called "fast-retransmit"
This is not the case. [However, if it were the case, it would still only affect
6% of the packets.] A fast retransmit is made after 4 identical ACK packets are
received, which means that 4 packets have to be received before the late packet.
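
The counting can be checked with a small simulation. This is a sketch under
simplifying assumptions: the receiver ACKs every arriving segment immediately
(no delayed ACKs), segments are numbered 1, 2, 3, ... rather than by byte
offset, and the function names are my own. Note that the first of the four
identical ACKs is the ordinary ACK for the last in-order segment; each segment
that then arrives ahead of the late one adds one duplicate.

```python
def receiver_acks(arrival_order, first_expected=1):
    """ACK each arriving segment with the next in-order segment number expected."""
    received = set()
    expected = first_expected
    acks = []
    for seg in arrival_order:
        received.add(seg)
        while expected in received:
            expected += 1
        acks.append(expected)
    return acks

def fast_retransmit_triggered(acks, threshold=3):
    """True once `threshold` duplicate ACKs follow an original ACK."""
    last, dups = None, 0
    for ack in acks:
        if ack == last:
            dups += 1
            if dups >= threshold:
                return True
        else:
            last, dups = ack, 0
    return False

# Segment 2 is the late one; segment 1 arrives in order before the reordering.
acks_two_ahead = receiver_acks([1, 3, 4, 2])       # two segments overtake segment 2
acks_three_ahead = receiver_acks([1, 3, 4, 5, 2])  # three segments overtake segment 2

triggered_two = fast_retransmit_triggered(acks_two_ahead)
triggered_three = fast_retransmit_triggered(acks_three_ahead)
```

In the first case the sender sees three identical ACKs (one original plus two
duplicates) and does nothing; in the second it sees four identical ACKs and
retransmits.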
A more thorough reading of RFC2581 reveals when an ACK should be sent:
A TCP receiver SHOULD send an immediate duplicate ACK when an out-
of-order segment arrives. The purpose of this ACK is to inform the
sender that a segment was received out-of-order and which sequence
number is expected. From the sender's perspective, duplicate ACKs
can be caused by a number of network problems. First, they can be
caused by dropped segments. In this case, all segments after the
dropped segment will trigger duplicate ACKs. Second, duplicate ACKs
can be caused by the re-ordering of data segments by the network (not
a rare event along some network paths [Pax97]).
While out-of-order packets could trigger a fast retransmit, this occurs
just 3% of the time. So just 3% of packets are unnecessarily
retransmitted. Not a great performance impact.
But again, at worst, this is merely a performance issue that may be more than
compensated for by the additional performance and availability of multiple
diverse links.
But let's not forget the benefits of load balancing over diverse paths.
For example, when a path fails, it can be immediately removed from the router's
FIB, and another path can be used immediately without waiting for routing
processes to select the next best route and add it to the FIB [no more
blackholes until the next BGP scan after a link failure]. While of little
benefit to SMTP, this greatly benefits VOIP and streaming audio and video.
VOIP RTP buffers have no such performance issues with multipath. As long
as each packet arrives before it is to be consumed, it does not matter
what order they arrive in. PPLB would greatly improve VOIP performance
characteristics.
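
A minimal sketch of why arrival order does not matter to a playout buffer. It
is not any particular RTP stack: the function name, the timings, and the fixed
playout delay are assumptions for illustration, with all times in
milliseconds.

```python
def played_packets(arrivals, playout_delay_ms, packet_interval_ms):
    """Return the sequence numbers that can be played, in playout order.

    `arrivals` is a list of (seq, arrival_time_ms) in arrival order. Packet
    `seq` is consumed at playout_delay_ms + seq * packet_interval_ms, so it is
    usable if it arrived by then, whatever arrived before or after it.
    """
    on_time = [
        seq for seq, arrived_ms in arrivals
        if arrived_ms <= playout_delay_ms + seq * packet_interval_ms
    ]
    return sorted(on_time)

# Packets 1 and 2 arrive swapped; packet 3 misses its deadline entirely.
arrivals = [(0, 30), (2, 55), (1, 60), (3, 200)]
played = played_packets(arrivals, playout_delay_ms=60, packet_interval_ms=20)
```

Packets 0, 1 and 2 are all played in sequence even though 2 arrived before 1;
only packet 3, which arrived after its playout time, is lost.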
And folks I know who build gear which does load-splitting seem to be
scrupulously careful to avoid these sorts of problems.
The equipment cannot do anything to avoid these problems, except turn off load
balancing if necessary.
--Dean
--
Av8 Internet Prepared to pay a premium for better service?
www.av8.net faster, more reliable, better service
617 344 9000