Hi,
CC'ing tsvwg, which would be a better venue for this discussion.
On 2015-2-4, at 20:22, Phillip Hallam-Baker <phill(_at_)hallambaker(_dot_)com>
wrote:
Today most Web browsers attempt to optimize download of images etc. by
opening multiple TCP/IP streams at the same time. This is actually done for
two reasons, first to reduce load times and second to allow the browser to
optimize page layout by getting image sizes etc up front.
This approach first appeared round about 1994. I am not sure whether anyone
actually did a study to see if multiple TCP/IP streams are faster than one
but the approach has certainly stuck.
There have been many studies; for example,
http://www.aqualab.cs.northwestern.edu/publications/106-modeling-and-taming-parallel-tcp-on-the-wide-area-network
But looking at the problem from the perspective of the network it is really
hard to see why setting up five TCP/IP streams between the same endpoints
should provide any more bandwidth than one. If the narrow waist is observed,
then the only parts of the Internet that are taking note of the TCP part of
the packet are the end points. So having five streams should not provide any
more bandwidth than one unless the bandwidth bottleneck was at one or other
endpoint.
You don't get more bandwidth in steady state (well, with old Reno stacks, you
got a little more, but not much). The real win is in getting more bandwidth
during the first few RTTs of TCP slow-start, which is the crucial phase when
transmitting short web objects.
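To make the slow-start point concrete, here is a rough sketch of how many round trips an idealized slow start needs to deliver a short object over one connection versus several. The function name, the initial window of 3 segments, and the 90-segment object are illustrative assumptions, not measurements; the model ignores loss, handshake cost, and any bottleneck limit, and assumes the object is split evenly across connections.

```python
import math

def rtts_to_deliver(total_segments, connections, initial_window=3):
    """Count RTT rounds needed under an idealized slow start.

    Assumptions (hypothetical model, not a real stack): each connection
    starts with `initial_window` segments, doubles its window every RTT,
    never sees loss, and carries an equal share of the object.
    """
    per_conn = math.ceil(total_segments / connections)
    cwnd = initial_window
    sent = 0
    rounds = 0
    while sent < per_conn:
        sent += cwnd   # send one full window in this round
        cwnd *= 2      # slow start: window doubles per RTT
        rounds += 1
    return rounds

if __name__ == "__main__":
    for n in (1, 2, 6):
        print(n, "connection(s):", rtts_to_deliver(90, n), "RTTs")
```

Under these assumptions a 90-segment object takes 5 RTTs over one connection and 3 RTTs over six, even though the steady-state rate would be the same. The gain comes purely from the larger aggregate initial window.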
Now there are some parts of the deployed Internet that do actually perform
stateful inspection. But I would expect increasing the number of channels to
degrade performance at a firewall or any other middlebox.
So we have a set of behaviors that seems at odds with the theory. Has anyone
done any experiments recently that would show which is right?
I haven't seen any performance study, but another concern is that middleboxes
obviously need to maintain state per connection, and multiple parallel
connections eat that binding space up more quickly. (And for a NAT, reduce the
number of clients it can serve.)
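As a back-of-the-envelope illustration of the NAT point, the arithmetic can be sketched as below. The figure of 64512 usable ports (65536 minus the 1024 well-known ports) and the assumption that every connection consumes its own binding on a single public address are simplifications chosen for illustration; real NATs may reuse ports across different destinations.

```python
def max_clients(usable_ports, connections_per_client):
    """Upper bound on simultaneous clients behind one public address.

    Hypothetical worst-case model: every connection holds exactly one
    port binding, and bindings are never shared.
    """
    return usable_ports // connections_per_client

if __name__ == "__main__":
    USABLE_PORTS = 64512  # assumed: 65536 ports minus the well-known range
    for conns in (1, 6):
        print(conns, "connection(s) per client:",
              max_clients(USABLE_PORTS, conns), "clients")
```

In this model, going from one to six connections per client cuts the number of clients the NAT can serve from 64512 to 10752.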
The reason it makes a difference is that it is becoming clear that modern
applications are not best served by an application API that is limited to one
bi-directional stream. There are two possible ways to fix this situation. The
first is to build something on top of TCP/IP; the second is to replace single
stream TCP with multi-stream.
SCTP has what you call multiple streams in your second option, and is designed
the same way.
My preference and gut instinct is that the first is the proper architectural
way to go regardless of the performance benefits. When Thompson and co were
arguing that all files are flat sequences of bits, they were saying that was
the right O/S abstraction because you could build anything you like on top.
But then I started to ask what the performance benefits to a multi-stream TCP
might be and I am pretty sure there should not be any. But the actual
Internet does not always behave like it appears it should.
See above.
Also, one motivation for SPDY/HTTP2.0 is to reduce the number of parallel
connections, since web people have noticed that more is not always better here.
I suspect that the preference for multiple streams probably comes from the
threading strategies it permits. But that is an argument about where the
boundary between the kernel and application is placed in the network stack
rather than where multiplexing should live in the stack. Microsoft already
provides a network stack for .NET where the boundary is in the HTTP layer
after all.
So anyone got hard data they could share?
The TSVWG folks may have.
Lars