Re: [aqm] Last Call: <draft-ietf-aqm-fq-codel-05.txt> (FlowQueue-Codel)

Toke,

Sorry for not yet sending the follow-up. Straight after that email, I got roped into becoming a makeshift ambulance driver and then ... long story...

Thanks for taking my comments constructively, as intended. Responses embedded.


On 18/03/16 12:47, Toke Høiland-Jørgensen wrote:

Hi Bob

Thank you for your timely and constructive comments. Please see the
inline responses below.

My main concern is with applicability. In particular, the sentence in
section 7 on Deployment Status: "We believe it to be a safe default
and encourage people running Linux to turn it on: ...". and a similar
sentiment repeated in the conclusions. "and we believe it to be safe
to turn on by default, as has already happened in a number of Linux
distributions."

Can one of the authors explain why a solution with the limitations in
section 6 can still be described as "safe"?

"We believe it to be a safe default" means that we have not seen any of
the theoretical limitations we have documented in section 6 be a concern
*in practice* in any of the extensive number of deployments FQ-CoDel has
seen already. And that the benefits of turning on FQ-CoDel are
sufficient that nudging people in that direction is a good idea.

This is perhaps because "we" (ie the people looking) tend to have significantly more bandwidth than the majority of Internet users (those in the developing world). When you have less bandwidth, long-running flows last longer, so they tend to overlap more. Given bloat problems are only seen intermittently in the first place [Hohlfeld14], the average person isn't going to see these limitations very often. But if you are a homeworker using a VPN (for instance), you will be dogged by these problems all the time.

So the main problem here is with the assumption that the test has to be "whether we observe these limitations in practice".

Few people observed problems with NATs at the time they were introduced (otherwise they wouldn't have sold successfully). So those arguing against them tended to be ignored by mainstream comms engineers. But then the "theoretical" limitations started to bite. And we ended up having to make do with a subset of the potential of the Internet. Those sounding the warning bells could see the potential of the Internet, and they could see how NATs would close that off. Those ignoring the warning bells believed they were right to only be concerned with the here and now.

My concern is about precluding future desirable developments in application behaviour. It will be rare to observe such cases by random inspection; they may not appear while using existing applications on existing high speed links. But they will occur very frequently in scenarios prone to them. That's often the nature of side-effects.

My concern is particularly about fq technology in the network precluding improvements in the quality of regular best efforts service that we can expect through changes in applications and transports alone.

When I was arguing against FQ_CoDel (back in 2013 at the latency workshop - you were there too), numerous people were saying that FQ_CoDel is much more subtle than regular FQ. At which point I quietened down, because I trusted enough of those people. However, in the recent tests with HAS (criticised at length elsewhere), one thing that can be said with certainty was that FQ_CoDel just becomes a regular fq scheduler when you have two or more long-running flows that can always keep their queues from emptying. Whatever instantaneous rate the application tries to run at, FQ overrides it and runs at 1/N of the capacity. That is not good for a video coming off a camera at a variable information rate. FQ skims off all the peaks, so the VBR codec adapts down to the worst-case peak rate, not the worst-case average rate.
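
To make the peak-versus-average point concrete, here is a minimal sketch of a toy model, not a model of the real FQ_CoDel code. It assumes the other N-1 flows are always backlogged, so the video flow can never get more than capacity/N in any interval. The function names, the 12 Mb/s link, the three flows and the demand trace are all made up for illustration.

```python
# Toy model (an assumption for illustration, not FQ_CoDel itself):
# one VBR flow shares a link with N-1 flows that are always backlogged,
# so the scheduler serves the VBR flow at no more than capacity / N
# in every interval.

def fair_share(capacity, n_flows):
    """Per-flow rate when all n_flows are continuously backlogged."""
    return capacity / n_flows


def fq_backlog(demand, share):
    """Queue left at the end of each interval when service is capped at share."""
    backlog = 0.0
    history = []
    for offered in demand:
        # Anything offered above the fair share waits in the flow's queue.
        backlog = max(0.0, backlog + offered - share)
        history.append(backlog)
    return history


def scale_to_fit_peak(demand, share):
    """Factor the codec must scale by so that its peak fits the share."""
    return min(1.0, share / max(demand))


def scale_to_fit_average(demand, share):
    """Factor that would suffice if only the average had to fit the share."""
    return min(1.0, share / (sum(demand) / len(demand)))


if __name__ == "__main__":
    share = fair_share(12.0, 3)      # hypothetical 12 Mb/s link, 3 flows
    demand = [2, 2, 8, 2, 2, 8]      # hypothetical VBR trace, Mb per interval
    print("fair share:", share)
    print("backlog per interval:", fq_backlog(demand, share))
    print("scale to fit peak:", scale_to_fit_peak(demand, share))
    print("scale to fit average:", scale_to_fit_average(demand, share))
```

In this made-up trace the average demand equals the fair share, yet every peak leaves a standing queue. A real-time codec that cannot tolerate that delay has to scale down until its peak fits the share, rather than just its average.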

Indeed, these sentences seem rather Orwellian.

I can assure you that we are not attempting to exert "draconian control
by propaganda, surveillance, misinformation, denial of truth, and
manipulation of the past" (quoting
https://en.wikipedia.org/wiki/Orwellian here). But thank you for
implying it :)

Well, stating the limitations in the draft, then denying their truth in the conclusions by using the word safe to describe them is classic Orwellian Newspeak <https://en.wikipedia.org/wiki/Newspeak#To_control_thought>.

Would it not be correct instead to say that FQ_CoDel has been made the
default in a number of Linux distributions despite not being safe in
some circumstances?

At the time it was made the default in OpenWrt (several years ago now,
if memory serves me right), there was not a whole lot of real-world
deployment experience, due to the chicken-and-egg problem of not wanting
to change the default before we have gathered more experience. However,
today the situation is quite different, thanks in part to the boldness
of the OpenWrt devs. So no, I do not believe that to be the case any
longer.

The experience that led me to understand this problem was when a bunch of colleagues tried to set up a start-up (a few years ago now) to sell a range of "equitable quality" video codecs (ie constant quality variable bit-rate instead of constant bit-rate variable quality). Then, the first ISP they tried to sell to had WFQ in its Broadband remote access servers. Even though this was between users, not flows, when video was the dominant traffic, this overrode the benefits of their cool codecs (which would have delivered twice as many videos with the same quality over the same capacity).

Now, by your test, you will never see the limitations these videos suffered. Because they never got developed. Because the developers gave up. You can think of FQ_CoDel as nice well-meaning people (the Linux community) creating a new middlebox problem.

2. Default?

If a draft saying "We believe it to be a safe default..." is published as an
RFC, it means "The IETF/IESG/etc believes..."
Only one solution can be default, so if the IETF says that FQ_CoDel is a safe
default, and no other AQM RFC makes any claim to being a safe default (which
they do not at the moment), it could be read as the IETF recommending FQ_CoDel
for default status and, by implication, other AQMs (like PIE, say) are not
recommended for default status.

This is certainly not my reading. This is an experimental RFC saying "we
believe it to be safe as a default" not a standards track RFC saying
"this should be the default". This is an important difference; we are
not mandating anything, but rather expressing our honest opinion on
the applicability of FQ-CoDel as a default, should anyone wish to make
it one in their domain.

As far as I know, unlike the listed FQ_CoDel limitations, no
limitations of PIE have been identified. I don't think anyone is
claiming that the performance of FQ_CoDel is awesomely better than
PIE. May be a bit better, may be a bit worse, depending on
circumstances, and depending on which you value most out of low
queuing delay, high utilization, or low loss.

Well, for CoDel and PIE that is certainly true. But FQ-CoDel in many
cases reduces latency under load by an order of magnitude compared to
both of them, while improving throughput.

OK, I have seen such figures, and it makes sense that FQ will give single RTT flows very low latency.

My concern is that of course the IESG will want to sign off an RFC with this cool performance, given they read that the limitations are not important. Whereas I believe the limitations have been downplayed.

So, if the authors want the IETF to recommend a default AQM on the
basis of safety (and I agree safety is the most important factor when
choosing a default), the most likely candidate would be PIE, wouldn't
it? FQ_CoDel has unintended side-effects, which implies it is not a
good candidate for default; it should only be configured deliberately
by those who can live with the side-effects.

I'm not sure it would be possible for the AQM group to agree on a
recommendation for a default. But I suppose it might be a good
bikeshedding exercise. And as noted above, this is not what we intend to
do in this case.

If we don't want the IETF (or the AQM WG) to make this call, we should make it clear that we are not making this call.

My concern is that, years down the line, when the context has been lost, these sentences could be interpreted as making this call. For comparison, consider how we have been trying to understand what RFC2309 (the RED manifesto) intended to say.

3. A Detail

I also have a concern about the way the limitations are written
(typically, each limitation is stated, followed by an arm-waving
qualification attempting to create an impression that there is not
really a limitation). To keep the thread clean, I'll send that in a
follow-up email.

It is certainly not our intention to "create an impression that there is
not really a limitation". Rather, we are trying to suggest ways in which
each limitation can be mitigated by people who are concerned about it,
but still want to realise the benefits of deploying FQ-CoDel. Sure, some
of those proposals are not exactly at the "running code" stage, but
dismissing them as arm-waving is hardly fair.

I'll add, as I noted initially, that many of the limitations we have
noted are of a theoretical nature (in the sense that we are not aware of
any deployments where they have caused issue in practice). This does not
make it any less important to document them, of course, and we have been
grateful for the feedback from the working group that the section grew
out of (you yourself were among the people providing this feedback, I
believe). However, this also means that it is difficult to do more than
point out each issue. We can't quantify them, for instance.

If you have concrete suggestions for language that would make things
clearer, do tell (though I suppose that's exactly what you'll do in your
follow-up mail). :)

See the next email (like I promised before).

Cheers


Bob

[Hohlfeld14] Hohlfeld, O., Pujol, E., Ciucu, F., Feldmann, A. & Barford, P., "A QoE Perspective on Sizing Network Buffers," In: Proc. Internet Measurement Conf (IMC'14) pp.333-346 ACM (November 2014)


-Toke


--
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/

Re: [aqm] Last Call: <draft-ietf-aqm-fq-codel-05.txt> (FlowQueue-Codel) to Experimental RFC