ietf
[Top] [All Lists]

RE: tsv-dir review of draft-ietf-vrrp-unified-spec-02.txt

2008-11-04 13:12:01
Hi Mark, 

I've started several threads to cover the more substantive issues -
these email threads are on VRRP list and not reproduced here. 

Please see inline for the less substantive issues. 

Thank you again for your detailed and thoughtful feedback. 

Regards,
Steve 
 
-----Original Message-----
From: ietf-bounces(_at_)ietf(_dot_)org 
[mailto:ietf-bounces(_at_)ietf(_dot_)org] On 
Behalf Of Mark Handley
Sent: Thursday, October 30, 2008 08:58
To: ietf(_at_)ietf(_dot_)org; TSV Dir; vrrp(_at_)ietf(_dot_)org
Subject: tsv-dir review of draft-ietf-vrrp-unified-spec-02.txt

I've reviewed this document as part of the transport area 
directorate's ongoing effort to review key IETF documents. 
These comments were written primarily for the transport area 
directors, but are copied to the document's authors for their 
information and to allow them to address any issues raised. 
The authors should consider this review together with any 
other last-call comments they receive.
Please always CC tsv-dir(_at_)ietf(_dot_)org if you reply to or forward 
this review.

Generally the draft reads very well, and is clear and comprehensible.
I had no difficulty understanding the draft overall, although 
occasionally terms were used before they were properly 
defined.  This isn't really a big deal, but I've highlighted 
the below.

From a transport area point of view, the main things we're 
looking for are whether the protocol will be well-behaved, 
especially from the point of view of congestion control.  
VRRP does not perform any form of congestion control, but as 
it is purely a link-local protocol, configured by a network 
operator to provide redundancy between two routers on the 
same LAN segment, this is not really an issue.  One presumes 
that a network operator choosing sub-second advertisement 
intervals knows what he or she is doing, and knows 
appropriate rates for their local circumstances.  Routers 
provide many ways to shoot yourself in the foot, and this one 
doesn't seem worth of concern.

However, the draft does have a weakness when it comes to congestion.
It does not provide any guidance to router vendors as to 
whether VRRP packets should receive priority treatment when 
being transmitted.  The potentially problematic situation 
occurs when a router is delivering more packets onto the LAN 
than can be accomodated, and so a queue builds up in the 
router.  Typical default queuing delays tend to be some 
generic wide-area RTT (so that a delay-bandwidth product of 
packets can be queued).  Thus packets being transmitted onto 
the VRRP-protected LAN could see perhaps 100ms or more of 
queuing delay.
If VRRP packets enter such a queue, and the smallest VRRP 
Advertisement_Interval is configured, the 
Master_Down_Interval will be between 30 and 40ms.  Thus 
normal queuing delays might cause a VRRP backup to conclude 
that the master is down, and therefore promote itself to 
master.  Very shortly afterwards, the delayed VRRP packets 
from the master would arrive, causing a switch back to backup status.
However this process can repeat many times per second, 
causing significant disruption to traffic.

My feeling is that, if possible, VRRP packets should be 
priority-forwarded onto the LAN, mitigating this problem.  
However, it's not clear this is always possible.  At the 
least, the draft ought to comment on this possible scenario 
and the risks of very low Advertisement_Interval values in 
the presence of congestion.

It would presumably be possible for a VRRP master to observe 
such a situation is occurring frequently.  Under such 
circumstances, at the least a good implementation should log 
that there is a problem.  It might also be possible to 
specify that the master should automatically backoff its 
Advertisement_Interval value when it observes such thrashing. 
 I'll leave it to the VRRP authors to think over whether that 
might be desirable or might have unintended consequences.  I 
think the draft can be progressed without any such adaptive 
mechanism, but the authors may wish to think about it anyway, 
as it might improve VRRP's robustness.


See separate thread 


Detailed comments on the spec (many of these are nit-picking, 
but a few might be substantive):

Section 1.2, para 4:  "virtual router's IPv4 addresses"   - at this
point in reading the draft, it wasn't clear to me whether 
this was any of the addresses of any of the routers 
comprising the virtual router or whether it was the address 
being virtualized.  This became clear later on in the draft.


Will revisit & try to clarify

Section 1.3, para 4:  "could be speeded up"  - this is 
awkward English; I'd have written "could be sped up" or 
"could be made quicker"

Ok 


Section 2.1, para 1: "Backup of an IPvX address(es)" - awkward English
- better would be "Backup of an IPvX address or addresses"

Ok 


Section 2.1, bullet 4 "Provide for election ... load 
balancing".  It wasn't really clear to me what was meant by this.


See separate thread 

Section 2.4, para 1:  "Sending either IPvX packets on..."  It 
wasn't clear what this meant - do you mean "Sending either 
IPv4 or IPv6 packets"?


Will revisit & consider clarifications

Section 2.5 - is the internet draft referred to now dead?  If 
so, this reference should probably be deleted, and the 
sentence rephrased appropriately.


ok

Section 3,  para 1:  "Each VRRP virtual router has a single 
well-known MAC address allocated to it."  I understand what 
this means, but I wonder if this should be scoped to indicate 
it's true per link?
There's some potential for misunderstanding that the same MAC 
address must be used for the same VRID on other interfaces of 
the same router, which I don't think is intended.


Will revisit & consider clarifications

Section 3, para 2: I found this whole paragraph somewhat confusing.
It's trying to say too many things in too few sentences.


Will revisit & consider clarifications

Section 5.2.9, para 2:  "associated with" is a rather vague term.
Maybe "being backed up by" or something similar would be clearer?


Will revisit & consider clarifications

Section 6.1.  Should the VR MAC address be a parameter?  I 
think it should.

I think the combination of IPv4 or IPv6 and the VRID forces the VR MAC. 


Section 6.1: "Accept_Mode"  - the explanation here could be clearer.
An example would help.

Will revisit & consider clarifications


Section 6.4.1 - a "Startup event" is not defined.  I can 
guess was is meant, but it might be better to define it.


Will revisit & consider clarifications

Section 6.4.2 "Broadcast gratuitous ARP request"  should 
probably say "Broadcast gratuitous ARP request on that interface".


Ok 

Section 6.4.3 .  "Note: IPv6 Neighbor Solicitations and 
Neighbor Advertisements  should not be dropped when 
Accept_Mode is False."
This would probably be better being stated a few lines 
earlier, in the two MUST bullets.  In any event, it seems 
like it's a MUST not a SHOULD.


Will revisit & consider clarifications

Section 6.4.3, top of page 27.  I think the actions are incorrect.
They are currently:

               @ Cancel Adver_Timer

               @ Recompute the Master_Down_Interval

               @ Set Master_Adver_Interval to Adver Interval contained
               in the ADVERTISEMENT

               @ Set Master_Down_Timer to Master_Down_Interval

               @ Transition to the {Backup} state

I think they should be:

               @ Cancel Adver_Timer

               @ Set Master_Adver_Interval to Adver Interval contained
               in the ADVERTISEMENT

               @ Recompute the Skew_Time

               @ Recompute the Master_Down_Interval

               @ Set Master_Down_Timer to Master_Down_Interval

               @ Transition to the {Backup} state

Ie, it should explicitly say recompute Skew_Time, but more 
importantly, the Master_Adver_Internal needs to be set before 
computing Master_Down_Interval, otherwise you use the old 
(obsolete) value.


See separate thread 

Section 7.1:

- MAY verify that "Count IPvX Addrs" and the list of IPvX Address
      matches the IPvX Address(es) configured for the VRID

   If the above check fails, the receiver SHOULD log the event and MAY
   indicate via network management that a misconfiguration 
was detected.
   If the packet was not generated by the address owner (Priority does
   not equal 255 (decimal)), the receiver MUST drop the packet,
   otherwise continue processing.

This combination of MAY, SHOULD and MUST is confusing.  Also 
it's not clear if the second "If" is conditional on the first 
"If".  It seems like you may have a mandatory action that you 
can't do unless you do an optional action.


Will revisit & consider clarifications

Section 7.4, para 3:

   An implementation might choose simplify this for the operator by
   using the VRRP MAC in the formation of these link local addresses.

Should be "might" be a MAY?

Will revisit & consider clarifications


"VRRP MAC" is not a defined term.  I think the right term is 
"Virtual Router MAC Address".

This whole paragraph seemed a little vague.


Will fix term and will revisit & consider clarifications

Section 8.1.2:

   "When a VRRP router restarts or boots, it SHOULD not send any ARP
   messages with its physical MAC address for the IPv4 
address it owns,
   it should only send ARP messages that include Virtual MAC 
addresses."

How do you ssh to the physical router, if you're not sure 
which router you'll actually reach?  Does this require a 
separate IPv4 address?

      "When configuring an interface, VRRP routers should broadcast a
      gratuitous ARP request containing the virtual router MAC address
      for each IPv4 address on that interface."

Surely a VRRP router only does this when becoming the master?
Otherwise backup routers can cause traffic to be blackholed 
when their interface is configured.  Similar text appears in 8.2.2.


See separate thread

Section 9.1.  I don't know much about FDDI.  If you do as 
described, I assume there's some way to avoid the packet 
circulating forever?


See separate thread 


To summarize:  I think the draft has a few issues that need 
to be addressed before publication.  Most of the issues are 
pretty minor; it's up to the IESG as to whether addressing 
them would require a new last call, but my expection is that 
it would not.

That's all.  I hope this is of some help to you.

I'm sure this will improve the draft. 


Cheers,
Mark
_______________________________________________
Ietf mailing list
Ietf(_at_)ietf(_dot_)org
https://www.ietf.org/mailman/listinfo/ietf

_______________________________________________
Ietf mailing list
Ietf(_at_)ietf(_dot_)org
https://www.ietf.org/mailman/listinfo/ietf

<Prev in Thread] Current Thread [Next in Thread>