ietf
[Top] [All Lists]

tsv-dir review of draft-ietf-vrrp-unified-spec-02.txt

2008-10-30 12:51:19
I've reviewed this document as part of the transport area
directorate's ongoing effort to review key IETF documents. These
comments were written primarily for the transport area directors, but
are copied to the document's authors for their information and to
allow them to address any issues raised. The authors should consider
this review together with any other last-call comments they receive.
Please always CC tsv-dir(_at_)ietf(_dot_)org if you reply to or forward this
review.

Generally the draft reads very well, and is clear and comprehensible.
I had no difficulty understanding the draft overall, although
occasionally terms were used before they were properly defined.  This
isn't really a big deal, but I've highlighted the below.

From a transport area point of view, the main things we're looking for
are whether the protocol will be well-behaved, especially from the
point of view of congestion control.  VRRP does not perform any form
of congestion control, but as it is purely a link-local protocol,
configured by a network operator to provide redundancy between two
routers on the same LAN segment, this is not really an issue.  One
presumes that a network operator choosing sub-second advertisement
intervals knows what he or she is doing, and knows appropriate rates
for their local circumstances.  Routers provide many ways to shoot
yourself in the foot, and this one doesn't seem worth of concern.

However, the draft does have a weakness when it comes to congestion.
It does not provide any guidance to router vendors as to whether VRRP
packets should receive priority treatment when being transmitted.  The
potentially problematic situation occurs when a router is delivering
more packets onto the LAN than can be accomodated, and so a queue
builds up in the router.  Typical default queuing delays tend to be
some generic wide-area RTT (so that a delay-bandwidth product of
packets can be queued).  Thus packets being transmitted onto the
VRRP-protected LAN could see perhaps 100ms or more of queuing delay.
If VRRP packets enter such a queue, and the smallest VRRP
Advertisement_Interval is configured, the Master_Down_Interval will be
between 30 and 40ms.  Thus normal queuing delays might cause a VRRP
backup to conclude that the master is down, and therefore promote
itself to master.  Very shortly afterwards, the delayed VRRP packets
from the master would arrive, causing a switch back to backup status.
However this process can repeat many times per second, causing
significant disruption to traffic.

My feeling is that, if possible, VRRP packets should be
priority-forwarded onto the LAN, mitigating this problem.  However,
it's not clear this is always possible.  At the least, the draft ought
to comment on this possible scenario and the risks of very low
Advertisement_Interval values in the presence of congestion.

It would presumably be possible for a VRRP master to observe such a
situation is occurring frequently.  Under such circumstances, at the
least a good implementation should log that there is a problem.  It
might also be possible to specify that the master should automatically
backoff its Advertisement_Interval value when it observes such
thrashing.  I'll leave it to the VRRP authors to think over whether
that might be desirable or might have unintended consequences.  I
think the draft can be progressed without any such adaptive mechanism,
but the authors may wish to think about it anyway, as it might improve
VRRP's robustness.


Detailed comments on the spec (many of these are nit-picking, but a
few might be substantive):

Section 1.2, para 4:  "virtual router's IPv4 addresses"   - at this
point in reading the draft, it wasn't clear to me whether this was any
of the addresses of any of the routers comprising the virtual router
or whether it was the address being virtualized.  This became clear
later on in the draft.

Section 1.3, para 4:  "could be speeded up"  - this is awkward
English; I'd have written "could be sped up" or "could be made
quicker"

Section 2.1, para 1: "Backup of an IPvX address(es)" - awkward English
- better would be "Backup of an IPvX address or addresses"

Section 2.1, bullet 4 "Provide for election ... load balancing".  It
wasn't really clear to me what was meant by this.

Section 2.4, para 1:  "Sending either IPvX packets on..."  It wasn't
clear what this meant - do you mean "Sending either IPv4 or IPv6
packets"?

Section 2.5 - is the internet draft referred to now dead?  If so, this
reference should probably be deleted, and the sentence rephrased
appropriately.

Section 3,  para 1:  "Each VRRP virtual router has a single well-known
MAC address allocated to it."  I understand what this means, but I
wonder if this should be scoped to indicate it's true per link?
There's some potential for misunderstanding that the same MAC address
must be used for the same VRID on other interfaces of the same router,
which I don't think is intended.

Section 3, para 2: I found this whole paragraph somewhat confusing.
It's trying to say too many things in too few sentences.

Section 5.2.9, para 2:  "associated with" is a rather vague term.
Maybe "being backed up by" or something similar would be clearer?

Section 6.1.  Should the VR MAC address be a parameter?  I think it should.

Section 6.1: "Accept_Mode"  - the explanation here could be clearer.
An example would help.

Section 6.4.1 - a "Startup event" is not defined.  I can guess was is
meant, but it might be better to define it.

Section 6.4.2 "Broadcast gratuitous ARP request"  should probably say
"Broadcast gratuitous ARP request on that interface".

Section 6.4.3 .  "Note: IPv6 Neighbor Solicitations and Neighbor
Advertisements  should not be dropped when Accept_Mode is False."
This would probably be better being stated a few lines earlier, in the
two MUST bullets.  In any event, it seems like it's a MUST not a
SHOULD.

Section 6.4.3, top of page 27.  I think the actions are incorrect.
They are currently:

               @ Cancel Adver_Timer

               @ Recompute the Master_Down_Interval

               @ Set Master_Adver_Interval to Adver Interval contained
               in the ADVERTISEMENT

               @ Set Master_Down_Timer to Master_Down_Interval

               @ Transition to the {Backup} state

I think they should be:

               @ Cancel Adver_Timer

               @ Set Master_Adver_Interval to Adver Interval contained
               in the ADVERTISEMENT

               @ Recompute the Skew_Time

               @ Recompute the Master_Down_Interval

               @ Set Master_Down_Timer to Master_Down_Interval

               @ Transition to the {Backup} state

Ie, it should explicitly say recompute Skew_Time, but more
importantly, the Master_Adver_Internal needs to be set before
computing Master_Down_Interval, otherwise you use the old (obsolete)
value.

Section 7.1:

- MAY verify that "Count IPvX Addrs" and the list of IPvX Address
      matches the IPvX Address(es) configured for the VRID

   If the above check fails, the receiver SHOULD log the event and MAY
   indicate via network management that a misconfiguration was detected.
   If the packet was not generated by the address owner (Priority does
   not equal 255 (decimal)), the receiver MUST drop the packet,
   otherwise continue processing.

This combination of MAY, SHOULD and MUST is confusing.  Also it's not
clear if the second "If" is conditional on the first "If".  It seems
like you may have a mandatory action that you can't do unless you do
an optional action.

Section 7.4, para 3:

   An implementation might choose simplify this for the operator by
   using the VRRP MAC in the formation of these link local addresses.

Should be "might" be a MAY?

"VRRP MAC" is not a defined term.  I think the right term is "Virtual
Router MAC Address".

This whole paragraph seemed a little vague.

Section 8.1.2:

   "When a VRRP router restarts or boots, it SHOULD not send any ARP
   messages with its physical MAC address for the IPv4 address it owns,
   it should only send ARP messages that include Virtual MAC addresses."

How do you ssh to the physical router, if you're not sure which router
you'll actually reach?  Does this require a separate IPv4 address?

      "When configuring an interface, VRRP routers should broadcast a
      gratuitous ARP request containing the virtual router MAC address
      for each IPv4 address on that interface."

Surely a VRRP router only does this when becoming the master?
Otherwise backup routers can cause traffic to be blackholed when their
interface is configured.  Similar text appears in 8.2.2.

Section 9.1.  I don't know much about FDDI.  If you do as described, I
assume there's some way to avoid the packet circulating forever?


To summarize:  I think the draft has a few issues that need to be
addressed before publication.  Most of the issues are pretty minor;
it's up to the IESG as to whether addressing them would require a new
last call, but my expection is that it would not.

That's all.  I hope this is of some help to you.

Cheers,
Mark
_______________________________________________
Ietf mailing list
Ietf(_at_)ietf(_dot_)org
https://www.ietf.org/mailman/listinfo/ietf

<Prev in Thread] Current Thread [Next in Thread>
  • tsv-dir review of draft-ietf-vrrp-unified-spec-02.txt, Mark Handley <=