Re: Review of draft-ietf-idr-sla-exchange-10


Hi David,


One response inline ##svshah2


________________________________
From: Black, David <David(_dot_)Black(_at_)dell(_dot_)com>
Sent: Friday, February 24, 2017 8:53 AM
To: Shitanshu Shah; tsv-art(_at_)ietf(_dot_)org
Cc: idr(_at_)ietf(_dot_)org; 
draft-ietf-idr-sla-exchange(_dot_)all(_at_)ietf(_dot_)org; 
ietf(_at_)ietf(_dot_)org; Black, David
Subject: RE: Review of draft-ietf-idr-sla-exchange-10

David> Inline ...

Thanks, --David

From: Shitanshu Shah [mailto:shitanshu_shah(_at_)hotmail(_dot_)com]
Sent: Friday, February 24, 2017 4:09 AM
To: Black, David; tsv-art(_at_)ietf(_dot_)org
Cc: idr(_at_)ietf(_dot_)org; 
draft-ietf-idr-sla-exchange(_dot_)all(_at_)ietf(_dot_)org; 
ietf(_at_)ietf(_dot_)org
Subject: Re: Review of draft-ietf-idr-sla-exchange-10




Hi David,



Thank you for taking time for the review..



Please find our response inline ##svshah and [Med]


Regards,
Shitanshu
________________________________
From: David Black 
<david(_dot_)black(_at_)emc(_dot_)com<mailto:david(_dot_)black(_at_)emc(_dot_)com>>
Sent: Tuesday, February 21, 2017 3:31 PM
To: tsv-art(_at_)ietf(_dot_)org<mailto:tsv-art(_at_)ietf(_dot_)org>
Cc: idr(_at_)ietf(_dot_)org<mailto:idr(_at_)ietf(_dot_)org>; 
draft-ietf-idr-sla-exchange(_dot_)all(_at_)ietf(_dot_)org<mailto:draft-ietf-idr-sla-exchange(_dot_)all(_at_)ietf(_dot_)org>;
 ietf(_at_)ietf(_dot_)org<mailto:ietf(_at_)ietf(_dot_)org>
Subject: Review of draft-ietf-idr-sla-exchange-10

Reviewer: David Black
Review result: Not Ready

I've reviewed this document as part of the transport area
directorate's ongoing effort to review key IETF documents. These
comments were written primarily for the transport area directors, but
are copied to the document's authors for their information and to
allow them to address any issues raised. When done at the time of IETF
Last Call, the authors should consider this review together with any
other last-call comments they receive. Please always CC
tsv-art(_at_)ietf(_dot_)org<mailto:tsv-art(_at_)ietf(_dot_)org> if you reply to 
or forward this review.

Document: draft-ietf-idr-sla-10
Reviewer: David Black
Review Date: February 21, 2017

Review result: Not Ready

This is an early TSV-ART review of a working group draft, requested by
the IDR Working Group.

This draft defines an extension to BGP to allow exchange of traffic
handling parameters (e.g., configured rates, burst sizes, drop
thresholds).  While this is a useful area of technology to standardize
across network operators, this draft has significant problems, and
parts of it could use some serious rethought.

This reviewer has discussed small portions of this draft with some of
the authors in the past, but this is his first comprehensive reading
and review of the draft.

Major Issues:

[1] The draft is misnamed.  This is not an SLA (Service Level
Agreement) draft - it's a TCA (Traffic Conditioning Agreement) draft -
see the definition of TCA in RFC 2475.  This draft should start from
that definition and generalize the applicability of TCA beyond
Diffserv.  Large areas of SLA content are not covered by this draft -
for more details, see the Wikipedia article on SLA:
https://en.wikipedia.org/wiki/Service-level_agreement .

##svshah, sure, we can change the term to TCA (Traffic Conditioning Agreement), 
which fairly represents intention in the draft, if there is no otherwise 
comment from the working group.


David> That would be good, thanks.

[2] Section 3.3.2.1's reuse of the TSpec construct from RFC 2115 to
specify a token bucket is a good idea, but it's not a good idea to
respecify that construct in terms of L2 (link-layer, e.g., Ethernet)
octets, as the RFC 2115 TSpec is specified in terms of IP octets.
This change to specification in terms of L2 octets results in needing
the L2_OVERHEAD TLV to cope with the possible differences in L2 (link)
framing overhead at sender and receiver of this information.  The
TSpec should be respecified at the IP layer, with L2 framing overhead
left to the Producer and Consumer to factor into their calculations
based on each's direct knowledge of L2 functionality and configuration
in the local AS.  That ought to enable elimination of the L2_OVERHEAD
TLV, thereby reducing complexity.
##svshah, the purpose for advertising L2 overhead, from the Producer to the 
Consumer, is to make sure traffic is well conditioned, at the egress of the 
Consumer, to avoid possible indiscriminate drops at the ingress of the 
Producer. In another words, the Producer is telling the Consumer to ignore its 
own link level overhead, instead use Producer's provided link level overhead 
while running frames through QoS functions (this is relevant in use-cases where 
l2 overhead of the egress link on Consumer and of ingress link on the Producer 
are of different size, e.g., in the vpn/tunnel connection between two peer 
nodes which physically may be multiple hops away).
I understand your comments regarding re-using TSpec without any modification. 
Do you agree with the use-case of l2 differences? If yes, any suggestion how to 
accommodate that?

David> Specifying this in terms of IP octets implies that L2 overhead is to be 
ignored on both links.  That’s simpler, and a few sentences of discussion about 
possible L2 framing differences should cover what needs to be done (e.g., if 
traffic rates are configured in terms of L2 octets, then explain how each 
entity converts between IP traffic and L2 traffic - the draft already 
effectively contains that explanation).

##svshah2, just to make sure that some of the real use are not left out..

For the traffic conditioning where sla established is for the ip based rates 
only, it suffices for the Producer to send rates only IP based ignoring any 
specific of l2 overheads.

Though for the cases where physical links are not to be over-run, and/or, sla 
defined are based on l2 rates, the Consumer would need to know l2 overhead from 
the Producer. Otherwise 10 byes of l2 overhead difference on 64 bytes packets 
can be significant compare to on 1400 bytes packets, and thus the Producer does 
not have a way to convert rates to ip based (this can cause functional issue), 
since this has to be per packet consideration.

Regards,
Shitanshu




[3] A token bucket should have two rate-related marking parameters
based on its token fill rate, i.e., min-rate, not the four
rate-related marking parameters in this draft.  The max-rate
parameters in sections 3.3.2.5-6 ought to be specified against a
second token bucket.  In addition, the handling precedence algorithm
in section 3.3.2.7 is an overly complex way to specify the
relationship of two token buckets.  All of this is even more
important, because max-rate, as defined in RFC 2115, is only
applicable to bursting - that max-rate for bursting often turns out to
be an interface line rate, which is not generally useful for the
traffic provisioning purposes of this draft.

##svshah, what you describe below conceptually very much makes sense and that 
is what we attempt to achieve. What is unclear though how to capture that using 
TSpec definition specified in RFC2215. Since that TSpec definition has both 
minimum-rate and maximum-rate, but no in/out profile marking parameters.

Is your proposal below to define two TSpecs using the same definition from 
RFC2215? Wouldn't in that case we have min/max specified twice? min/max in each 
TSpec? and thus confusing to represent one min and one max through two TSpecs?

David> See RFC 2698 for the desired behavior and the parameters needed to 
specify it.  The current idr-sla draft is not able to represent the traffic 
conditioning mechanisms specified in both RFC 2698 and RFC 2697, as each 
mechanism requires two token buckets (e.g., there are two burst sizes, but the 
idr-sla draft TSpec only contains one).  This deficiency needs to be dealt 
with.  One possibility for simplification is to drop the max-rate from the RFC 
2115 TSpec and assume that the max-rate for bursting is the effective line rate.

The following should be done instead:
        - Define TSpecs for two token buckets, a primary/committed token bucket
                and a secondary/peak token bucket that MUST be nested, i.e., 
traffic
                that is in-profile for the secondary/peak token bucket is always
                in-profile for the primary/committed token bucket.  Some of the 
details
                of how to specify this are subtle, see RFC 2698 for a worked 
example.
                Use of a secondary/peak token bucket requires use of the 
primary/
                committed token bucket, but a primary/committed token bucket 
can be used
                without a secondary/peak token bucket.
        - For a single token bucket, define two handling TLVs, Committed (in 
profile) and
                Excess (out of profile).
        - For two token buckets, define three handling TLVs, Committed (in 
profile for both
                token buckets), Peak (out of profile for primary/committed 
token bucket, but in
                profile for secondary/peak token bucket) and Excess (out of 
profile for both
                token buckets).
NB: Could use Green/Yellow/Red terms instead of Committed/Peak/Excess
terms.

[4] The drop threshold TLV in section 3.3.2.8 is not specified
sufficiently to be implemented interoperably.  For example, I don't
understand what an implementation is supposed to do when it receives 3
drop thresholds.
##svshah, It is around the semantics of a single queue with a different 
threshold for a set of code-points, where packet for a specific code-point is 
to be tail dropped if overall queue-depth hits code-point specific threshold at 
the arrival of that packet.
We will add appropriate clarification for this semantics.

David> Will look at revision

[5] The relative priority TLV in section 3.3.2.9 has the same
insufficient specification problem as the drop threshold TLV,
compounded by a functional incompleteness problem - if the recipient
is using a weighted packet transmission scheduler (e.g., WRR),
priorities cannot be used to configure that scheduler.  Hence, some
specification of weights and scheduling algorithms that use weights
needs to be added.
##svshah, Sure. Is following clarification okay? (some of the wordings taken 
from RFC2598)

"
A higher priority class of traffic to be served without pre-empted by lower 
priority class of traffic for more than a packet time at the configured rate.

In the system that implements WRR, the use of relative priority may get 
restricted where a single queue may be used for a higher priority traffic class 
where that queue  is configured for the full share of the output bandwidth.
"

David> Last sentence basically says that WRR weights are out-of-scope.  That 
seems wrong.  Will look at revision.

[6] I have no idea what the sub-traffic classes TLV in section
3.3.2.10 is supposed to do, as that TLV is specified based on "Traffic
Class TLVs" which is an undefined term in this draft (e.g., that term
is not used outside of section 3.3.2.10.
##svshah, They are to facilitate hierarchy. A specific Traffic Class, can 
further be divided in a multiple  subset of Traffic Classes, with their own 
Traffic Class Elements and Traffic Class Services.

Will correct the wording to remove your this specific concern.

David> Will look at revision

[7] This draft's QoS contents need to be functionally aligned with
work-in-progress on YANG QoS models, in order to provide some
assurance that that this draft is implementable for actual network
switch/router data paths.  The current acknowledgement of the
existence of YANG, NETCONF and RESTConf at the end of Section 1 does
not suffice.
##svshah, While eventually "Traffic Conditioning Agreement" should be 
translated to the actual forwarding qos policy on any vendor specific device, 
TCA exchange largely carries concepts/semantics that is either standard based 
or well understood.

While we are not aware of any working group document of QoS Yang Model, we 
think that  concepts of Traffic Conditioning should be easily adaptable to any 
QoS Yang Models since they also have to be defined to support those concepts.


David> Still want to see cross-check with implementations here.


[8] There are significant complexity and correctness problems caused
by the option to not specify the Source AS - e.g., Section 3.2 defines
an SLA ID as an "identifier which is unique in the scope of Source AS"
which is meaningless if there is no Source AS.  It would be simpler to
always specify Source AS, even in the point-to-point case.

##svshah, okay. Ron Bonica also suggested same in his review earlier. We had 
chosen a path of optional Source AS to simplify implementations in cases. Given 
specification of Source AS always is more clearer in multiple feed-back we have 
gotten so far, we can modify the specification to incorporate that over the 
simplicity of implementation for certain cases.

David> OK, thanks.

[9] In section 3.2, the "intended for the peer receiver of the BGP
UPDATE message" text in the specification of bit 0 of the SLA Subtype
flags is unclear.  I suspect that this is intended to differentiate
the two usages described in sections 4.1.1 (Point-to-Point) and 4.1.2
(Multiple Hops), in which case the parenthesized terms (or similar
terminology) should be used with cross-references to those two
sections.
##svshah, sure will make necessary changes, also in the context of earlier 
comment.


[10] There needs to be a coherent discussion in one place about how
SLA advertisement, update and withdrawal work.  A single ADVERTISE
method may suffice on the wire, but the details on how initial
advertisement, subsequent advertisement (update) and withdrawal work
need to be specified in one place.  The third paragraph of Section 4
is a start on this material, but it's too terse;

[Med] That text is terse because we are reusing the BGP machinery for 
advertising/withdrawing; we only augment it with triggers that are linked to 
the SLA information. A new SLA will lead to an update message with ADVERTISE, 
an update of an existing SLA will trigger an update message. We do think this 
information is already present in the document.

David>This is hard to understand it, as there is no statement that the BGP 
machinery is being used, and the material on SLA usage is spread in different 
places.  I suggest starting section 3 with a discussion of 
advertisement/update/withdrawal, and how all three are realized via a single 
ADVERTISE method via BGP UPDATE - this was difficult to figure out in reading 
the current draft.

it should be expanded
into its own subsection, and moved earlier to come before the
ADVERTISE method in Section 3.2.  This text from Section 3.2 should be
moved into that new subsection and likewise expanded:

[Med] Another option is to move it Section 4. From our standpoint, we do think 
it is straightforward to describe the behavior once the attributes format are 
defined.

David> Ok, that’s an editorial choice - I prefer to explain how something works 
before defining its on-the-wire data format.

      If an advertised SLA ID is different from earlier advertised one,
      for the same prefix and from the same Source AS, indicates Source
      AS is advertising new SLA Content to replace the previous one
      advertised with the same SLA ID.

In addition, I wonder whether functionality should be added to allow
withdrawal of an advertisement by specifying its SLA ID, although that
was not part of the original design.


[Med] The reason we didn’t adopted that design is that we don’t want to make 
assumptions on which data the remote peer will be used for enforcing local 
actions and also because of this text:

      The SLA ID applies to aggregate traffic to prefixes for a given

      AFI/SAFI that share the same Source AS and SLA ID.

David> That’s fine, I wanted to ensure that this had been considered.  The 
discussion of BGP mechanism reuse will probably help here.

[11] Notions of context for interpretation of all the IPFIX parameters
in 3.3.1 need to be added, e.g.:
        - The first three parameters (DSCP, MPLS EXP field in top label, 802.1q 
priority) can
                and do vary on a link-by-link or LSP-by-LSP basis along a 
traffic's network path.
        - The IP address parameters are rather likely to be VPN-specific when 
there's more
                than one BGP/MPLS VPN that spans or transits the ASs involved.
        - The transport port parameters need specification of which transport 
header and where it
                is located (e.g., for TCP traffic carried by in VXLAN, is this 
the inner TCP header
                or the outer UDP header in VXLAN).
There are probably simple approaches to specifying context in all
cases, but that context does need to be specified ... in all cases.

[Med] We dind’t include a context field because we thought that the definition 
of the IPFIX attributes is sufficient by itself, but if you do think it is 
helpful to have such information, we can update the table.

David>  Really???  Please reread the first comment above:

        - The first three parameters (DSCP, MPLS EXP field in top label, 802.1q 
priority) can
                and do vary on a link-by-link or LSP-by-LSP basis along a 
traffic's network path.

David>  Taking just DSCP, it can change on a per-link basis, so something has 
to specify which link (between the SLA Producer and SLA Consumer?) is involved. 
 It probably suffices to say that it’s the link that crosses the relevant AS 
boundary, but that does have to be stated, as it’s nowhere to be found in the 
far more general IPFIX registry.

[12] The security considerations (section 10) are severely incomplete
and insufficient:

David> I stand by this statement and recommend a complete rewrite of the 
security considerations.

        - Discussion of possible abuse of this BGP option for denial-of-service 
and theft-of-service,
                needs to be added, including possible countermeasures and 
mitigations.

[Med] This attack vector is not specific to this attribute; this is valid for 
BGP in general.

David> There are additional attacks enabled by this attribute.

        - This sentence at the end of the second paragraph in Section 10 is 
content-free:

                   It is NOT RECOMMENDED to enable this attribute at the
                   scale of the Internet unless if means to prevent leaking 
sensitive
                   information are enforced.

                What exactly is an implementer or admin supposed to do?

[Med] An example of what an admin needs to check if this information can be 
sent to a given peer or not. Some of the content of the attribute contains some 
sensitive information such as contract Id, classes, etc.

How does one
                figure out whether an implementation or deployment is  "at the 
scale
                of the Internet" ??


[Med] The idea here is to avoid leaking such data in to the global routing 
table. Only entitled peers need receive such information with controlled scope.

David> I understand the idea - the problem is that I don’t see specification of 
what has to be implemented in either the text in the draft or the above 
explanations.

        - The next to last paragraph in section 10 is almost content-free, as 
it leaves
                decisions on implementation and deployment of key security 
functionality as
                "an exercise for the reader" - that's not acceptable.



[Med] I guess you are referring to the following text. Validation checks are 
really deployment-specific. The text calls out the issue and recommends an 
action; how to translate this action into detailed actions is realty local to a 
domain

   The attribute may be advertised by a misbehaving node to communicate

   SLA parameters that are not aligned with the SLA agreements.  Though

   the enforcement of SLA parameters is outside the scope of this

   document, it is RECOMMENDED that the SLA Consumer to enforce a set of

   validation checks before translating the SLA parameters conveyed in

   the QoS attributes into provisioning actions.  Such validations MAY

   rely on SLA parameters like the origin AS or SLA ID, like generating

   SLA ID using pseudo-random schemes 
[RFC4086<https://tools.ietf.org/html/rfc4086>].

David> I stand by the “exercise to the reader”  criticism - one way to be 
honest about this would be to change the paragraph to:



        The attribute may be advertised by a misbehaving node to communicate

   SLA parameters that are not aligned with the SLA agreements.  The

   enforcement of SLA parameters is outside the scope of this document.



David> That does not change any normative content of the original paragraph.  
My view is that the “outside the scope of this document” statement is wrong 
because all of these parameters are being defined in this draft.

        - Last, but not least, the final paragraph in Section 10 is a joke that 
will
                be lost on a security directorate reviewer - to understand why, 
see
                Section 2 of RFC 6919, and take note of the publication date of 
RFC 6919.

David> For anyone who didn’t catch the reference, RFC 6919 is an April 1 RFC 
... and the “SHOULD BE considered” language used in the last security 
considerations paragraph of this draft is accurately characterized as vague in 
Section 2 of RFC 6919:
   The phrase "SHOULD CONSIDER" indicates that the authors of the
   specification think that implementations should do something, but
   they're not sure quite what.
------------------------

Having noted a dozen major issues, I'll end the review here for now,
as I believe a serious revision of the draft is called for, which
would be a better starting point to review for minor issues and
editorial items.

--------------------------------------------------------
David L. Black, Distinguished Engineer
Dell EMC, 176 South St., Hopkinton, MA  01748
+1 (508) 293-7953     Cell: +1 (978) 394-7754
David(_dot_)Black(_at_)dell(_dot_)com<mailto:David(_dot_)Black(_at_)dell(_dot_)com>
  <=== NEW ===
--------------------------------------------------------