
RE: Gen-ART review of draft-ietf-dime-overload-reqs-10

2013-08-22 14:51:41
Hi Eric,

This looks good - comments follow ...

a) I assume that overload control development work will derive more specific
security requirements - e.g., as REQ 27 is stated at a rather high level.
The discussion in security considerations section seems reasonable.

We agree with this.  The thinking here was that we didn't want to specify this
in a way that would be specific to a particular type of mechanism.  It might
not hurt to state that assumption, either as a note on Req 27 or in the sec
considerations.

That would be good to add as a note on REQ 27.

The intent was very much as you say, where requirements on individual node
capabilities are hoped to result in better overall system behaviors. There are
also some requirements that are stated more at the system level (e.g. 7 and
17.) Also the text in section 2.2 that discusses Figure 5 talks about how
insufficient server capacity at a cluster of servers behind a Diameter agent
can be treated as if the agent itself was overloaded.

On the other hand, any mechanism we design will have to focus on actions of
individual nodes, so the numbered requirements tend to focus on that. I'm not
sure where to change the balance here--do you have specific suggestions?

I noted this as editorial rather than a minor issue, as I was mostly concerned
that the actual design work will be informed by a sufficient architectural
"clue" that the goal is "better overall system behaviors", which your response
indicates will definitely be the case ;-).

Rather than edit individual requirements, how about adding the following
sentence immediately following the introductory sentence in Section 7?:

        These requirements are stated primarily in terms of individual node
        behavior to inform the design of the improved mechanism;
        that design effort should keep in mind that the overall goal is
        improved overall system behavior across all the nodes involved, 
        not just improved behavior from specific individual nodes.

This inadequacy may, in turn, contribute to broader congestion collapse

"collapse" is not the right word here - I suggest "issues", "impacts",
"effects" or "problems".

We are fine with any of those alternatives.  How about "impacts"?

That's fine.  FWIW, "congestion collapse" has a specific (rather severe)
meaning over in the Transport Area, and that meaning was not intended here.

23.843 is the least stable reference.  I don't have any issue with pointing
that out.  The part of it we are referencing is historical front matter
though.

I'd note the reference as work in progress, and put the statement about stable
front matter (historical is a bad word to use here) in the body of the draft
that cites the reference.
 
I tried the web and downloaded versions of 2.12.17 and was not able to get the
warnings you saw (about the references).  What did it say?

Sorry, I didn't mean to send you on a wild goose chase :-).  The idnits
confusion manifested right at the top of the output, where everyone ignores
it ...

   Attempted to download rfc272 state...
   Failure fetching the file, proceeding without it.

You didn't reference RFC 272, so that output's apparently courtesy of idnits
misinterpreting this reference:

1195       [TS29.272]
1196                  3GPP, "Evolved Packet System (EPS); Mobility Management
1197                  Entity (MME) and Serving GPRS Support Node (SGSN) related
1198                  interfaces based on Diameter protocol", TS 29.272 11.4.0,
1199                  September 2012.

I was amused :-).
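
How idnits actually arrived at "rfc272" is not shown in this thread. As a
purely illustrative sketch, assuming a reference scanner that treats any
bracketed tag ending in digits as an RFC number (both patterns and both
function names below are hypothetical, not taken from idnits), a tag such as
[TS29.272] would yield "272", while a pattern anchored on the "RFC" prefix
would not:

```python
import re

# Hypothetical patterns for illustration only; NOT the real idnits logic.
# Naive: take the trailing run of digits inside any bracketed tag.
NAIVE_TAG = re.compile(r"\[[A-Za-z0-9.]*?(\d+)\]")
# Stricter: only accept tags that literally start with "RFC".
STRICT_TAG = re.compile(r"\[RFC(\d+)\]")


def naive_rfc_numbers(text):
    """Return trailing digit runs from every bracketed tag (over-matches)."""
    return NAIVE_TAG.findall(text)


def strict_rfc_numbers(text):
    """Return numbers only from tags of the form [RFCnnnn]."""
    return STRICT_TAG.findall(text)


if __name__ == "__main__":
    refs = "[RFC2119] [TS29.272] [TR23.843]"
    print(naive_rfc_numbers(refs))   # includes numbers from the 3GPP tags
    print(strict_rfc_numbers(refs))  # only the genuine RFC tag
```

Under that assumption, the 3GPP tags are picked up by the naive pattern and
skipped by the anchored one, which would be consistent with the stray
"rfc272" download attempt quoted above.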

Thanks,
--David

-----Original Message-----
From: Eric McMurry [mailto:emcmurry(_at_)computer(_dot_)org]
Sent: Thursday, August 22, 2013 3:06 PM
To: Black, David
Cc: ben(_at_)nostrum(_dot_)com; General Area Review Team
(gen-art(_at_)ietf(_dot_)org); ietf(_at_)ietf(_dot_)org;
dime(_at_)ietf(_dot_)org; bclaise(_at_)cisco(_dot_)com
Subject: Re: Gen-ART review of draft-ietf-dime-overload-reqs-10

Hi David,

Thank you for the review.  Your time and comments are appreciated!

comments/questions inline.


Eric



On Aug 17, 2013, at 9:18, "Black, David" <david(_dot_)black(_at_)emc(_dot_)com> wrote:


I am the assigned Gen-ART reviewer for this draft. For background on
Gen-ART, please see the FAQ at

<http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.

Please resolve these comments along with any other Last Call comments
you may receive.

Document: draft-ietf-dime-overload-reqs-10
Reviewer: David L. Black
Review Date: August 17, 2013
IETF LC End Date: August 16, 2013
IESG Telechat date: (if known)

Summary:
This draft is basically ready for publication, but has nits that should be
fixed before publication.

This draft describes scenarios in which Diameter overload can occur and
provides requirements for development of new overload control functionality
in Diameter. It is well written, and the inclusion of scenarios in which
overload can occur, both in terms of the relationships among types of
Diameter nodes and actual mobile network experience is very helpful.

I apologize for this review being a day late, as I've been on vacation for
most of this draft's IETF Last Call period.

Major issues: (none)

Minor issues: (none)

Nits/editorial comments:

The following two comments could be minor issues, but I'm going to treat
them as editorial, as I expect that they will be addressed in development of
the actual overload functionality:

a) I assume that overload control development work will derive more specific
security requirements - e.g., as REQ 27 is stated at a rather high level.
The discussion in security considerations section seems reasonable.

We agree with this.  The thinking here was that we didn't want to specify this
in a way that would be specific to a particular type of mechanism.  It might
not hurt to state that assumption, either as a note on Req 27 or in the sec
considerations.


b) The draft, and especially its requirements in Section 7 are strongly
focused on individual Diameter node overload.  That's necessary, but
overload conditions can be broader, affecting an entire service or
application, or multiple instances of either/both, even if not every
individual Diameter node involved is overloaded.  A number of the
requirements, starting with REQ 22 could be generalized to cover broader
overload conditions.

This (b) has implications for other requirements, e.g., REQ 13 should also
be generalized beyond a single node to avoid increased traffic in an overload
situation, even from a node that is not overloaded by itself.  There are
limits on what is reasonable here, as the desired overload functionality is
a TCP/SCTP-like reaction to congestion where individual actions taken by
nodes based on the information they have (which is not the complete state of
the network) result in an overall reduction of load.

The intent was very much as you say, where requirements on individual node
capabilities are hoped to result in better overall system behaviors. There are
also some requirements that are stated more at the system level (e.g. 7 and
17.) Also the text in section 2.2 that discusses Figure 5 talks about how
insufficient server capacity at a cluster of servers behind a Diameter agent
can be treated as if the agent itself was overloaded.

On the other hand, any mechanism we design will have to focus on actions of
individual nodes, so the numbered requirements tend to focus on that. I'm not
sure where to change the balance here--do you have specific suggestions?


Section 1.2, 2nd paragraph:

  as network congestion, network congestion can reduce a Diameter nodes

"nodes" -> "node's"

good catch.


Section 5, 1st paragraph:

This inadequacy may, in turn, contribute to broader congestion collapse

"collapse" is not the right word here - I suggest "issues", "impacts",
"effects" or "problems".

We are fine with any of those alternatives.  How about "impacts"?


Section 7

The long enumerated list of requirements is not an easy read.  It would be
better if these could somehow be grouped by functional category, e.g.,
security, transport interactions, operational/administrative, etc.

agree.  It is actually in sections in the XML (denoted by comments), we just
did not promote those to visible sections in the txt.  I recall there being
some issue with xml2rfc and numbering, but now that the numbers are set, this
would not be hard to do.



idnits 2.12.17 noticed the non-standard RFC 2119 boilerplate - this is fine,
as the boilerplate has been appropriately modified for this draft that
expresses requirements (as opposed to a draft that specifies a protocol).

idnits 2.12.17 got confused by the 3GPP and GSMA Informative References.
I assume that they're all sufficiently stable to be informative references.
However, [TR23.843] is a work in progress, and should be noted as such in
its reference - is this needed for any of the other 3GPP or GSMA references?

23.843 is the least stable reference.  I don't have any issue with pointing
that out.  The part of it we are referencing is historical front matter
though.


I tried the web and downloaded versions of 2.12.17 and was not able to get the
warnings you saw (about the references).  What did it say?



Thanks,
--David
----------------------------------------------------
David L. Black, Distinguished Engineer
EMC Corporation, 176 South St., Hopkinton, MA  01748
+1 (508) 293-7953             FAX: +1 (508) 293-7786
david(_dot_)black(_at_)emc(_dot_)com        Mobile: +1 (978) 394-7754
----------------------------------------------------



