Hi Mark,
I've started several threads to cover the more substantive issues -
these email threads are on VRRP list and not reproduced here.
Please see inline for the less substantive issues.
Thank you again for your detailed and thoughtful feedback.
Regards,
Steve
-----Original Message-----
From: ietf-bounces(_at_)ietf(_dot_)org
[mailto:ietf-bounces(_at_)ietf(_dot_)org] On
Behalf Of Mark Handley
Sent: Thursday, October 30, 2008 08:58
To: ietf(_at_)ietf(_dot_)org; TSV Dir; vrrp(_at_)ietf(_dot_)org
Subject: tsv-dir review of draft-ietf-vrrp-unified-spec-02.txt
I've reviewed this document as part of the transport area
directorate's ongoing effort to review key IETF documents.
These comments were written primarily for the transport area
directors, but are copied to the document's authors for their
information and to allow them to address any issues raised.
The authors should consider this review together with any
other last-call comments they receive.
Please always CC tsv-dir(_at_)ietf(_dot_)org if you reply to or forward
this review.
Generally the draft reads very well, and is clear and comprehensible.
I had no difficulty understanding the draft overall, although
occasionally terms were used before they were properly
defined. This isn't really a big deal, but I've highlighted
the below.
From a transport area point of view, the main things we're
looking for are whether the protocol will be well-behaved,
especially from the point of view of congestion control.
VRRP does not perform any form of congestion control, but as
it is purely a link-local protocol, configured by a network
operator to provide redundancy between two routers on the
same LAN segment, this is not really an issue. One presumes
that a network operator choosing sub-second advertisement
intervals knows what he or she is doing, and knows
appropriate rates for their local circumstances. Routers
provide many ways to shoot yourself in the foot, and this one
doesn't seem worth of concern.
However, the draft does have a weakness when it comes to congestion.
It does not provide any guidance to router vendors as to
whether VRRP packets should receive priority treatment when
being transmitted. The potentially problematic situation
occurs when a router is delivering more packets onto the LAN
than can be accomodated, and so a queue builds up in the
router. Typical default queuing delays tend to be some
generic wide-area RTT (so that a delay-bandwidth product of
packets can be queued). Thus packets being transmitted onto
the VRRP-protected LAN could see perhaps 100ms or more of
queuing delay.
If VRRP packets enter such a queue, and the smallest VRRP
Advertisement_Interval is configured, the
Master_Down_Interval will be between 30 and 40ms. Thus
normal queuing delays might cause a VRRP backup to conclude
that the master is down, and therefore promote itself to
master. Very shortly afterwards, the delayed VRRP packets
from the master would arrive, causing a switch back to backup status.
However this process can repeat many times per second,
causing significant disruption to traffic.
My feeling is that, if possible, VRRP packets should be
priority-forwarded onto the LAN, mitigating this problem.
However, it's not clear this is always possible. At the
least, the draft ought to comment on this possible scenario
and the risks of very low Advertisement_Interval values in
the presence of congestion.
It would presumably be possible for a VRRP master to observe
such a situation is occurring frequently. Under such
circumstances, at the least a good implementation should log
that there is a problem. It might also be possible to
specify that the master should automatically backoff its
Advertisement_Interval value when it observes such thrashing.
I'll leave it to the VRRP authors to think over whether that
might be desirable or might have unintended consequences. I
think the draft can be progressed without any such adaptive
mechanism, but the authors may wish to think about it anyway,
as it might improve VRRP's robustness.
See separate thread
Detailed comments on the spec (many of these are nit-picking,
but a few might be substantive):
Section 1.2, para 4: "virtual router's IPv4 addresses" - at this
point in reading the draft, it wasn't clear to me whether
this was any of the addresses of any of the routers
comprising the virtual router or whether it was the address
being virtualized. This became clear later on in the draft.
Will revisit & try to clarify
Section 1.3, para 4: "could be speeded up" - this is
awkward English; I'd have written "could be sped up" or
"could be made quicker"
Ok
Section 2.1, para 1: "Backup of an IPvX address(es)" - awkward English
- better would be "Backup of an IPvX address or addresses"
Ok
Section 2.1, bullet 4 "Provide for election ... load
balancing". It wasn't really clear to me what was meant by this.
See separate thread
Section 2.4, para 1: "Sending either IPvX packets on..." It
wasn't clear what this meant - do you mean "Sending either
IPv4 or IPv6 packets"?
Will revisit & consider clarifications
Section 2.5 - is the internet draft referred to now dead? If
so, this reference should probably be deleted, and the
sentence rephrased appropriately.
ok
Section 3, para 1: "Each VRRP virtual router has a single
well-known MAC address allocated to it." I understand what
this means, but I wonder if this should be scoped to indicate
it's true per link?
There's some potential for misunderstanding that the same MAC
address must be used for the same VRID on other interfaces of
the same router, which I don't think is intended.
Will revisit & consider clarifications
Section 3, para 2: I found this whole paragraph somewhat confusing.
It's trying to say too many things in too few sentences.
Will revisit & consider clarifications
Section 5.2.9, para 2: "associated with" is a rather vague term.
Maybe "being backed up by" or something similar would be clearer?
Will revisit & consider clarifications
Section 6.1. Should the VR MAC address be a parameter? I
think it should.
I think the combination of IPv4 or IPv6 and the VRID forces the VR MAC.
Section 6.1: "Accept_Mode" - the explanation here could be clearer.
An example would help.
Will revisit & consider clarifications
Section 6.4.1 - a "Startup event" is not defined. I can
guess was is meant, but it might be better to define it.
Will revisit & consider clarifications
Section 6.4.2 "Broadcast gratuitous ARP request" should
probably say "Broadcast gratuitous ARP request on that interface".
Ok
Section 6.4.3 . "Note: IPv6 Neighbor Solicitations and
Neighbor Advertisements should not be dropped when
Accept_Mode is False."
This would probably be better being stated a few lines
earlier, in the two MUST bullets. In any event, it seems
like it's a MUST not a SHOULD.
Will revisit & consider clarifications
Section 6.4.3, top of page 27. I think the actions are incorrect.
They are currently:
@ Cancel Adver_Timer
@ Recompute the Master_Down_Interval
@ Set Master_Adver_Interval to Adver Interval contained
in the ADVERTISEMENT
@ Set Master_Down_Timer to Master_Down_Interval
@ Transition to the {Backup} state
I think they should be:
@ Cancel Adver_Timer
@ Set Master_Adver_Interval to Adver Interval contained
in the ADVERTISEMENT
@ Recompute the Skew_Time
@ Recompute the Master_Down_Interval
@ Set Master_Down_Timer to Master_Down_Interval
@ Transition to the {Backup} state
Ie, it should explicitly say recompute Skew_Time, but more
importantly, the Master_Adver_Internal needs to be set before
computing Master_Down_Interval, otherwise you use the old
(obsolete) value.
See separate thread
Section 7.1:
- MAY verify that "Count IPvX Addrs" and the list of IPvX Address
matches the IPvX Address(es) configured for the VRID
If the above check fails, the receiver SHOULD log the event and MAY
indicate via network management that a misconfiguration
was detected.
If the packet was not generated by the address owner (Priority does
not equal 255 (decimal)), the receiver MUST drop the packet,
otherwise continue processing.
This combination of MAY, SHOULD and MUST is confusing. Also
it's not clear if the second "If" is conditional on the first
"If". It seems like you may have a mandatory action that you
can't do unless you do an optional action.
Will revisit & consider clarifications
Section 7.4, para 3:
An implementation might choose simplify this for the operator by
using the VRRP MAC in the formation of these link local addresses.
Should be "might" be a MAY?
Will revisit & consider clarifications
"VRRP MAC" is not a defined term. I think the right term is
"Virtual Router MAC Address".
This whole paragraph seemed a little vague.
Will fix term and will revisit & consider clarifications
Section 8.1.2:
"When a VRRP router restarts or boots, it SHOULD not send any ARP
messages with its physical MAC address for the IPv4
address it owns,
it should only send ARP messages that include Virtual MAC
addresses."
How do you ssh to the physical router, if you're not sure
which router you'll actually reach? Does this require a
separate IPv4 address?
"When configuring an interface, VRRP routers should broadcast a
gratuitous ARP request containing the virtual router MAC address
for each IPv4 address on that interface."
Surely a VRRP router only does this when becoming the master?
Otherwise backup routers can cause traffic to be blackholed
when their interface is configured. Similar text appears in 8.2.2.
See separate thread
Section 9.1. I don't know much about FDDI. If you do as
described, I assume there's some way to avoid the packet
circulating forever?
See separate thread
To summarize: I think the draft has a few issues that need
to be addressed before publication. Most of the issues are
pretty minor; it's up to the IESG as to whether addressing
them would require a new last call, but my expection is that
it would not.
That's all. I hope this is of some help to you.
I'm sure this will improve the draft.
Cheers,
Mark
_______________________________________________
Ietf mailing list
Ietf(_at_)ietf(_dot_)org
https://www.ietf.org/mailman/listinfo/ietf
_______________________________________________
Ietf mailing list
Ietf(_at_)ietf(_dot_)org
https://www.ietf.org/mailman/listinfo/ietf