
Last Call: <draft-ietf-mboned-auto-multicast-14.txt> (Automatic Multicast Tunneling) to Proposed Standard

2012-12-26 07:04:09
Hi,

Sorry for being late with these IETF last call comments. I will partly blame the 
ADs for requesting this Transport Directorate review a bit late; the other part 
is all mine and the holidays. Anyway, I do hope you will consider these issues 
and comments, as I believe I found some serious ones in addition to a number of 
clarifications that should be made.


Significant Issues:

1. Congestion Control
This is clearly a tunnel establishment protocol for something that is IP 
traffic. Thus normally the responsibility for congestion control is with the 
tunnelled traffic. However, I would like to argue that this does not apply in 
this case due to the nature of the tunnelled traffic, i.e. multicast traffic, 
and secondarily due to limitations in the tunnel protocol.

Let's start with the second part. This protocol claims to support ASM but still 
doesn't provide an upstream delivery mechanism, i.e. an ASM receiver is not 
capable of sending as it should. This prevents the use of several existing 
congestion control mechanisms in protocols supporting multicast. The first is 
using RTCP for congestion control in ASM [RFC3550], the second is TCP-Friendly 
Multicast Congestion Control [RFC4654], which can be used in the RMT suite of 
protocols and which I know has been implemented in some NORM implementations. 
Thus only strictly receiver-based mechanisms, such as Wave and Equation Based 
Rate Control [RFC3738], are available in this context.

Secondly, many multicast usages are in fact deployed without any congestion 
control. This is based on the deploying entity controlling the scope of, and 
authorization for, requesting multicast delivery. However, those restrictions 
do not apply to AMT delivery of multicast. If the gateway can reach the relay 
using unicast, it can be delivered the multicast group from the domain the 
relay is attached to. Thus, this protocol changes the deployment restrictions 
on which much non-congestion-controlled delivery is based. Instead, the 
non-congestion-controlled traffic can now be sent over an IP/UDP tunnel over 
the Internet, where neither relay nor gateway may have any knowledge about the 
path the traffic may take.

Based on this I would like to see two changes to this protocol specification. 
First a section discussing the issue of congestion control. Secondly, I think 
this protocol should have an applicability statement limiting its deployment to 
restricted environments where the relay and gateway deployers can provide 
certain resource provisions between the entities to avoid the multicast traffic 
affecting other traffic sharing the same bottlenecks in ways not allowed by the 
network provider.

2. Security

This protocol is frank about having limited security features and says this 
is similar to the IGMP and MLD protocols being used. However, I think this is a 
failure to properly consider the threat model. If one uses AMT over the general 
Internet, it will run in a network where the one deploying the multicast and the 
relays no longer controls requirements on source address verification or 
possibilities for traffic separation, as they can do within the domains where 
multicast is currently deployed. The security vulnerabilities in IGMP and MLD 
are much more contained and controllable in a LAN environment where one has 
chosen to deploy multicast, compared to a Relay exposing this to the whole 
Internet. Once more I think there are only two choices here.

A) Beef up the security to the general Internet threat model, i.e. at a minimum 
provide a real model for gateway authentication using identities, not only 
return routability based verifications.

B) Limit the applicability of AMT to managed environments and make it clear 
that the relay will need to limit which gateways are allowed to access the 
relay based on addressing.

Based on the first significant issue with congestion control, I expect that 
there is little point in doing A) unless one is also willing to beef up AMT to 
provide congestion control, which I think is not in line with the wishes of 
the protocol designers.

3. Use of Zero Checksum

The AMT specification enables the use of zero UDP checksum with IPv6, i.e.
draft-ietf-6man-udpchecksums-06 <http://datatracker.ietf.org/doc/draft-ietf-6man-udpchecksums/>
and draft-ietf-6man-udpzero-08 <http://datatracker.ietf.org/doc/draft-ietf-6man-udpzero/>.
Nothing against this in principle. However, I have noticed that AMT fails to 
properly address the failure modes of using a zero checksum. AMT is a typical 
example of a protocol that actually needs active verification, for each tunnel, 
that zero checksum functions. This is because AMT is clearly intended to be 
deployed with its Gateway part in end-hosts and residential network devices or 
routers. This means the tunnel will pass through both firewalls and NATs on its 
path between the relay and the gateway. Unless these devices are upgraded to 
support zero checksum in UDP for IPv6, the traffic may actually become black 
holed. The most likely case is a simple firewall with a rule for IPv6/UDP that 
doesn't allow a zero checksum, as it is against RFC 2460. Thus all the multicast 
data packets will disappear en route in the tunnel. There is no mechanism in 
the AMT protocol to detect this and negotiate with the relay so that it will 
not use the zero checksum for this tunnel.

This must be addressed as I see it. If not, AMT will be so brittle that it 
can't be used in a large number of its intended deployments.

4. MTU issues

This document totally fails to discuss the issue of MTU blackholing. As the IP 
multicast datagrams, as well as the encapsulated IGMP/MLD messages, can with 
the added tunnel overhead result in a sent packet that exceeds the MTU of the 
path, these packets could be black holed. This can potentially result in very 
intermittent transport behavior for the tunnel. Thus, some discussion of how to 
handle the MTU issues in this context should be introduced.

I am willing to discuss methods here, but I guess several alternatives exist, 
and which is most appropriate and the level of AMT support for them varies. 
Thus I would like the protocol designers to take a first stab at resolving this.
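
To make the overhead concrete, here is a minimal sketch of the arithmetic. The 
header sizes for IPv4 without options (20 bytes), IPv6 (40 bytes) and UDP 
(8 bytes) are standard; the 2-byte AMT data header is an assumption made for 
this example, and the function names are hypothetical.

```python
# Outer header sizes in bytes (IPv4 without options, fixed IPv6 header).
OUTER_IP_HEADER = {4: 20, 6: 40}
UDP_HEADER = 8
AMT_DATA_HEADER = 2  # assumption for illustration, not taken from the draft


def tunnel_overhead(outer_ip_version, amt_header=AMT_DATA_HEADER):
    """Bytes added in front of the tunnelled IP multicast datagram."""
    return OUTER_IP_HEADER[outer_ip_version] + UDP_HEADER + amt_header


def fits_path_mtu(inner_len, path_mtu, outer_ip_version):
    """True if the encapsulated packet fits the path MTU without fragmentation."""
    return inner_len + tunnel_overhead(outer_ip_version) <= path_mtu


if __name__ == "__main__":
    # A full-size 1500-byte multicast datagram over a 1500-byte path.
    print(fits_path_mtu(1500, 1500, 4))
    print(1500 - tunnel_overhead(6))
```

Under these assumptions, a multicast datagram sized for a 1500-byte native 
path no longer fits once it is tunnelled over a path with the same MTU, which 
is exactly the case that needs discussion.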

Other Issues
======================================================================

A) The table of contents on page 2 should include more levels of headings. Most 
likely down to level 4 is needed to make the TOC usable for finding content in 
the document.

B) The claimed ASM support
I would like to better understand how one can claim to support ASM when one has 
no upstream path to inject the ASM group participants' traffic. When one uses 
ASM, one normally does so for a reason and needs the possibility to inject 
packets into the group. These limitations need to be clarified.

C) Section 4:

This section indicates in its figures, the ones in Section 4.1.1 and Section 
4.1.3.1, that the Router Mode IGMP/MLD functions are outside of AMT, which based 
on the requirements in Section 5.3.3.4 is not accurate. That implementation 
must be AMT specific to maintain the AMT tunnel to group membership handling.

D) Not all figures have handles that can be used to reference.

E) Section 4.1.5.1:


   Similarly, the selection of a unicast Relay address may be source-
   dependent, as a relay contacted by a gateway to supply multicast
   traffic must have native multicast connectivity to the traffic source


I find this statement confusing. There is no support in the protocol for 
including the multicast group(s) that the gateway would like to receive in the 
discovery phase of the protocol.

F) AMT Gateway in home router.
One deployment scenario is that the AMT gateway is deployed in the home network 
router to provide access to multicast groups provided by the ISP. However, the 
startup procedures in this deployment are unclear. The text appears to indicate 
that one can have a gateway implementation that, as soon as the router boots, 
starts doing discovery and requests in order to have Queries to send to its 
internal local network. Other suggestions appear to be to wait until some host 
actually requests to join a group. The protocol specification appears to do its 
best to leave a great deal of flexibility and thus produce huge variance in the 
market.

G) Figure 3:
I find no discussion of Membership Updates to rejoin the groups after the 
tunnel has changed its source address as seen by the relay. I think this should 
have some discussion. Yes, it is reasonably clear that you will get traffic just 
by sending new membership updates over the new tunnel. However, the timing 
between teardown and this membership update should be considered. Figure 3 
implies it should be sent after the teardown, which I think is correct due to 
the traffic volumes to the NAT most likely causing the path change.

H) Figure in Section 4.2.2.2
Propose that the external side of the NAT should be marked as the one having 
the "e" addresses.

I) Seeing the figure in Section 4.2.2.3, I definitely wanted to comment on the 
address collision issue. It is made somewhat clearer later on. But maybe a 
dedicated Section 4 sub-section should discuss this general issue: multiple 
hosts on the left side can have the same address as hosts behind other tunnel 
endpoints, and thus the Relay needs to hide this from upstream, accept it, and 
use the tunnel context to track the different hosts.

J) Section 4.2.2.3:


   To avoid placing an undue burden on the relay platform, the protocol
   specifically allows zero-valued UDP checksums on the multicast data
   messages.  This is not an issue in UDP over IPv4 as the UDP checksum
   field may be set to zero.  However, this is a problem for UDP over
   IPv6 as that protocol requires a valid, non-zero checksum in UDP
   datagrams [RFC2460].  Messages sent over IPv6 with a UDP checksum of
   zero may fail to reach the gateway.  This is a well known issue for
   UDP-based tunneling protocols that is described
   [I-D.ietf-6man-udpzero].  A recommended solution is described in
   [I-D.ietf-6man-udpchecksums].

I think this needs reformulating and I don't understand what is intended
with the last sentence.

K) Section 5.1.1.
"Destination UDP Port -  The IANA-assigned AMT port number."

I find it strange that the protocol mandates that all traffic is sent to the
IANA-assigned port. Why can't the protocol allow more flexible handling of the
destination port? I find one single thing in the protocol that prevents usage
of another relay listener port: the Relay Advertisement would need a port
field in addition to the address.

L) Section 5.1.1.4
A 32-bit random value generated by the gateway and echoed by the
   relay in a Relay Advertisement message.

Should the above text make it clear that the value preferably should be a
cryptographically random value as defined in RFC 4086?

M) There is a lack of specification in Section 5.1.1 of what one does if the
version is different from 0. This is mentioned in Section 5.3.3.1, but not for
gateways and not for all message types.

N) Section 5.1.4.8:

   The Querier's Query Interval Code (QQIC) field in the general query
   is used by a relay to specify the time offset a gateway should use to
   schedule a new three-way handshake to refresh the group membership
   state within the relay (current time + Query Interval).

In several places it is not made clear that the QQIC and QRV are defined in
the external references for MLD and IGMP.

O) Section 5:
When specifying the bit-fields, please indicate the length of each field in the
text. This is an accessibility question. If you have impaired vision,
interpreting the field lengths in the figures correctly can be difficult.

P) Section 5.2.2.4:

This section defines which retransmission parameters one can potentially
configure. However, the section fails to define the acceptable maximum and
minimum values. Wrongly configured retransmission parameters can have a
significant negative impact on the network by causing bursts or unnecessary
traffic.

Q) Section 5.2.3.3:
The gateway may continue to receive Multicast Data messages long
   after the gateway sends a Membership Update message that deletes
   existing group subscriptions.

What is "long" in the above sentence? Are we talking about some known number of
seconds, or a TCP MSL, i.e. 2 min?

R) Section 5.2.3.4.3
   A gateway MAY retransmit a Relay Discovery message if it does not
   receive a matching Relay Advertisement message within some timeout
   period.  If the gateway retransmits the message multiple times, the
   timeout period SHOULD be adjusted to provide an random exponential
   back-off.  The RECOMMENDED timeout is a random value in the range
   [initial_timeout, MIN(initial_timeout * 2^retry_count,
   maximum_timeout)], with a RECOMMENDED initial_timeout of 1 second and
   a RECOMMENDED maximum_timeout of 120 seconds (which is the
   recommended minimum NAT mapping timeout described in [RFC4787]).

I wonder if the above exponential backoff is really what is desired. As it
randomly picks the timeout between the initial timeout value and
2^retry_count*initial_timeout, it will be both biased low and capable of
producing timing intervals that don't grow. If one desires a random timeout to
avoid some clock synchronization effects, I think a better algorithm is
Td = MIN(initial_timeout * 2^retry_count, maximum_timeout), where the actual
timeout is random*Td and random is a value drawn from the uniform distribution
over the interval [0.5,1.5]. This ensures both that the timeout between two
retransmissions is never less than 0.5*Td, a bound that grows with each retry
until the cap is reached, and that the timeout on average grows by a factor of
two.
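
As a rough illustration of the difference, here is a minimal Python sketch of
both timeout rules. The function names and the choice to start retry_count at 0
are assumptions made for the example; the defaults of 1 second and 120 seconds
are the RECOMMENDED values from the quoted text.

```python
import random


def draft_timeout(retry_count, initial_timeout=1.0, maximum_timeout=120.0):
    """Rule as worded in the quoted draft text: uniform in
    [initial_timeout, MIN(initial_timeout * 2^retry_count, maximum_timeout)]."""
    upper = min(initial_timeout * 2 ** retry_count, maximum_timeout)
    return random.uniform(initial_timeout, upper)


def proposed_timeout(retry_count, initial_timeout=1.0, maximum_timeout=120.0):
    """Rule proposed in this review: random * Td, random uniform in [0.5, 1.5].
    Note that the result can exceed maximum_timeout by up to 50%."""
    td = min(initial_timeout * 2 ** retry_count, maximum_timeout)
    return random.uniform(0.5, 1.5) * td


if __name__ == "__main__":
    # Print one sample of each rule per retry to compare how they grow.
    for retry in range(8):
        print(retry, round(draft_timeout(retry), 2),
              round(proposed_timeout(retry), 2))
```

With the draft rule the lower bound stays at initial_timeout for every retry,
whereas with the proposed rule the lower bound is 0.5*Td and follows Td upward.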

This section also does not define a minimum initial timeout value, or any
method for safely determining a more performant value from a safe initial
value. Doing an RTT measurement using the AMT control messages would require
some extensions, but could be a good way of determining a better initial value
than 0.5 seconds, which would be my recommendation for a default value.

S) Section 5.2.3.4.4

   If a gateway executes the relay discovery procedure at the start of
   each membership update cycle and the relay address returned in the
   latest Relay Advertisement message differs from the address returned
   in a previous Relay Advertisement message, then the gateway SHOULD
   send a Teardown message (if supported) to the old relay address,
   using information from the last Membership Query message received
   from that relay, as described in Section 5.2.3.7.  This behavior is
   illustrated in the following diagram.

This text and the figure after it do not appear to be consistent.
The figure implies a timer that isn't present in the above. The textual
description appears sensitive to flapping anycast routing. I think
the figure's indication of some higher timeout before redoing relay discovery
is much more robust.

T) Section 5.2.3.5.3
See R)
Also, this text appears redundant with previous text. Maybe generalize this
into its own section, used for all messages needing retransmission.

U) Section 5.2.3.5.4
   Querier's Query Interval Code carried by the general-query.  A
   gateway MAY use a smaller timer duration if required to refresh a NAT
   mapping that would otherwise timeout.

Maybe the protocol would rather need a NAT keep-alive message to be sent
from the gateway to the relay. But maybe the Request, Query cycle is
lightweight enough that this works fine.

V) Section 5.2.3.6.1

o  Insert IGMP or MLD datagrams into a queue for transmission after
      it receives a Membership Query message.

What assumptions about queue depth exist in the above? Clearly the messages in
this queue should expire if they become too old.
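
To show the kind of behaviour I have in mind, here is a minimal sketch of a
bounded queue with age-based expiry. The class name, the depth and age limits,
and the choice to drop the oldest entry on overflow are all assumptions made
for this example; none of them come from the draft.

```python
import time
from collections import deque


class ExpiringQueue:
    """Hypothetical holding queue for IGMP/MLD datagrams awaiting a Query."""

    def __init__(self, max_depth=16, max_age=10.0, clock=time.monotonic):
        self._items = deque()
        self._max_depth = max_depth  # assumed bound on queue depth
        self._max_age = max_age      # assumed maximum age in seconds
        self._clock = clock

    def put(self, datagram):
        # On overflow, drop the oldest entry to keep the depth bounded.
        if len(self._items) >= self._max_depth:
            self._items.popleft()
        self._items.append((self._clock(), datagram))

    def drain(self):
        """Return datagrams that are still fresh and empty the queue."""
        now = self._clock()
        fresh = [d for (t, d) in self._items if now - t <= self._max_age]
        self._items.clear()
        return fresh
```

A gateway would call drain() when the Membership Query arrives and send only
what it returns; anything older than max_age is silently discarded.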

X) Section 5.2.3.7
Gateway support for the Teardown message is OPTIONAL but RECOMMENDED.

The above is a very strange usage of RFC 2119 keywords. If you use the synonyms,
then maybe the error of writing it this way becomes clear:
Gateway MAY support the Teardown message but SHOULD.

Y) The usage of retransmission versus repetition is not always clear.
Some of the messages appear to simply need to be repeated QRV number of times
with some interval. Others should really be matched with an answer and, if one
is not received within the timeout, retransmitted. Can these two cases be made
more clear?

Z) Section 5.3.5
   The hash function RECOMMENDED for use in computing the Response MAC
   is the MD5 hash digest [RFC1321], though hash functions or keyed-hash
   functions of greater cryptographic strength may be used.

I think this points to a security vulnerability. I think it needs to be made
clear that the MAC MUST be keyed. If it is just a digest, then an attacker can
calculate the MAC and perform an off-path attack.

This should be made clear also in Section 6.1 to be a requirement.
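
To illustrate the difference between a plain digest and a keyed MAC, here is a
minimal sketch. The input fields (a nonce and a source address/port string),
their encoding, the use of SHA-256 and HMAC-SHA256, and the 6-byte truncation
are all assumptions made for the example, not what the draft specifies.

```python
import hashlib
import hmac


def unkeyed_digest(nonce: bytes, source: bytes) -> bytes:
    # Plain digest: anyone who knows or guesses the inputs can recompute it,
    # so an off-path attacker can forge the value.
    return hashlib.sha256(nonce + source).digest()[:6]


def keyed_mac(key: bytes, nonce: bytes, source: bytes) -> bytes:
    # Keyed MAC: recomputing it requires the relay's private key.
    return hmac.new(key, nonce + source, hashlib.sha256).digest()[:6]


def verify(key: bytes, nonce: bytes, source: bytes, received: bytes) -> bool:
    # Constant-time comparison of the expected and received MAC.
    return hmac.compare_digest(keyed_mac(key, nonce, source), received)


if __name__ == "__main__":
    relay_key = b"k" * 32  # hypothetical relay secret
    mac = keyed_mac(relay_key, b"\x01\x02\x03\x04", b"192.0.2.1:5000")
    print(verify(relay_key, b"\x01\x02\x03\x04", b"192.0.2.1:5000", mac))
```

The unkeyed function takes no secret at all, which is the whole problem: the
relay has nothing the attacker lacks.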

AA) Section A.1:

Although this proposal has its advantages, I think it might also illustrate a
shortcoming. First of all, 48 bits is quite short for a MAC. I would prefer a
variable length field.

Secondly, doesn't this actually create more material for an attacker to
determine the key used by the relay?

That was all I have found.

Cheers

Magnus

  • Last Call: <draft-ietf-mboned-auto-multicast-14.txt> (Automatic Multicast Tunneling) to Proposed Standard, Magnus Westerlund <=