Re: Last Call: draft-ietf-l3vpn-2547bis-mcast (Multicast in MPLS/BGP IP VPNs) to Proposed Standard

On Tue, 25 Aug 2009, The IESG wrote:

The IESG has received a request from the Layer 3 Virtual Private Networks
WG (l3vpn) to consider the following document:

- 'Multicast in MPLS/BGP IP VPNs '
  <draft-ietf-l3vpn-2547bis-mcast-08.txt> as a Proposed Standard


This is an assigned ops-dir review.  I'm familiar with multicast, PIM and
routing, but not intimate with L3VPNs, so bear with me.  I only did a
superficial review, not looking very deep into the details of the spec.

As a high-level comment I'd say that this is more of a framework
document than an actual specification.  The reason is that for almost
every feature, the spec has 2-4 different mechanisms for achieving it.
Most of these are usually non-interoperable.  This is less useful than
a homogenous spec, but I believe the horse has already left the barn
and now it's up to the vendors to implement everything and the
operators to decide what they need to use in order to get the results
needed.

From the operational perspective, there are a lot of options, policy
and configuration.  This can be good and flexible, but it has the
drawback that configuration is complex, and the service will often be
misconfigured, with few if any means to detect misconfigurations.
Many mechanisms also require the service provider to know the traffic
patterns of their customers (i.e., which groups should use which
traffic pattern/multicast routing optimizations).  This is a challenge
and a lot of work.  In some places of the spec it is also not clear
whether it's the operator or implementor which must do X (e.g., ensure
that all the required BGP attributes are included in signalling in
various scenarios and cases).  Some of the configurables are listed
below as an example:

 - various signalling mechanisms (BGP, PIM, RSVP-TE, mLDP, ...)
 - various PMSI interfaces used and provided (which groups, MVPNs use which
   technologies), e.g. the policy scenarios described in S 7.
 - correct configuration of various BGP attributes
 - configuration of aggregation tree (sharing P-multicast trees across
   MVPNs) [e.g. S 6.3.2: "This will allow a SP to
   deploy aggregation depending on the MVPN membership and traffic
   profiles in its network."]
 - RP address of each C-multicast tree, unless automatically learned with
   BSR or auto-rp (note that BCP in this field is to use manual config)
 - whether explicit tracking is enabled (on per-flow basis)
 - PHP configuration for PMSI LSPs must be disabled (S 12.1.3)

Some of the bigger issues noted are below:

1) IPv6 support.  The spec apparently aims to support both IPv4 and IPv6
   because it refers to both in a couple of places.  Yet, there is at
   least one explicit place in the spec (S 7.4.2.2) that's not compatible.
   I suspect many of the BGP attributes used, possibly also the MCAST-VPN
   BGP SAFI and others are not IPv6 compatible.  At the minimum, the status
   (intent) of the spec should be clarified. Even better would be to
   improve and include the support here.

2) RP configuration in SP network.  It's not clear if the SP network needs to
   know how customer sites have configured their RPs (when the customer
   provides the RP).  At least traditional PIM signalling would require SP
   to know this.  But if auto-rp or BSR is not used by the customer, how is
   this information learned and maintained?  Would it require manual
   configuration?

3) S 3.4.1.2 and 3.4.1.3 describe "Lightweight PIM Peering Across a MI-PMSI"
   and "Unicasting of PIM C-Join/Prune Messages".  These are inadequately
   specified and in conflict with PIM-SM specification.  Given that these
   are already practically out of scope of the specification, these sections
   and text that relates to this should be removed.

4) Explicit tracking in S 5.4.2.  Philosophical. Using BGP such that
   upon the receipt of type-specific new information X it is required to
   perform some other, timing-sensitive action Y seems wrong to me.
   Has this application of BGP been adequately reviewed in IDR WG?

5) Active source BGP messages.  This is a duplication of a similar mechanism
   in MSDP (RFC3618) which has caused much grief in the Internet.  Does this
   mean that when a host does 'nmap -sU 224.0.0.0/4' at a VPN site, this
   will result in about 268 million BGP active source updates being sent
   (2^28) in the SP backbone?  This problem is not described in security
   considerations.

6) PIM-BIDIR usage.  May the SP use PIM-BIDIR internally even if the customer
   interface would use PIM-SM?  The assumption about when BIDIR is applicable
   is not clear.  Given that using BIDIR-PIM is an all-or-nothing solution, the
   only operationally feasible model would appear to be that the
   {SM,BIDIR} operational modes used by the customer and SP must be
   independent from each other.  Is this framework compatible with that?

7) Type 0 Route Distinguisher.  The spec mandates using type 0 RD which
   embeds 16-bit AS-number.  Another type exists for 32-bit ASN, but it is
   not clear if interoperable service could be achieved in practise
   with a simple modification.
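
For scale, the address count behind the nmap question in item 5 can be
checked with a short Python sketch.  It only counts the group addresses in
224.0.0.0/4; whether each probed group would actually produce its own
active source update depends on the procedures in the draft, which is the
open question.

```python
import ipaddress

# The IPv4 multicast range scanned in the nmap example from item 5.
multicast_range = ipaddress.ip_network("224.0.0.0/4")

# A /4 prefix leaves 32 - 4 = 28 free bits, so the range holds 2**28 groups.
group_count = multicast_range.num_addresses

print(group_count)  # 268435456
```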

Some of these and a couple of additional issues below:

Clarifying questions/substantial
--------------------------------

3.2. P-Multicast Service Interfaces (PMSIs)

   Multicast data packets received by a PE over a PE-CE interface must
   be forwarded to one or more of the other PEs in the same MVPN for
   delivery to one or more other CEs.

.. is this strictly accurate?  doesn't this depend on where the RP is
configured to be?  This seems to assume that the RP configuration is always
provided by the customer, never by SP?  Because if RP is provided by the
service provider, then the same packets could be forwarded back to the CE,
without being forwarded at all to other PEs.

In 3.2,

   - PIM

       A PMSI can be instantiated as (a set of) Multicast Distribution
       Trees created by the PIM P-instance ("P-trees").

       The multicast distribution trees that instantiate I-PMSIs may be
       either shared trees or source-specific trees.

... I-PMSI with PIM is not really described; this part only describes
S-PMSI's.  Is this intentional? (FWIW, S5.2 only discusses MI-PMSI)

3.4.1.2. Lightweight PIM Peering Across a MI-PMSI

... This section describes variations from PIM specifications, and I'd
suggest that it be removed because it's incomplete and it's not clear if these
mechanisms are yet supported by PIM. (Note that periodic Hellos also serve
the function of removing, not just discovering, dead adjacencies.)

3.4.1.3. Unicasting of PIM C-Join/Prune Messages

   PIM does not require that the C-Join/Prune messages that a PE
   receives from a CE to be multicast to all the other PEs; it allows
   them to be unicast to a single PE, the one that is upstream on the
   path to the root of the multicast tree mentioned in the Join/Prune
   message.

... this appears to be incorrect or to depend on deprecated or frowned-upon
features of PIM.  The PIM-SM spec says that Joins should only be accepted
from neighbors from which you've seen a Hello message.  And hello messages
are multicast.  Also see the table on S 3.9 of RFC4601.

Consequently these two are not the main mechanisms used anyway, and Section
5 only discusses the first and the fourth one.

5.3.2. Explicit Tracking
...
   Whenever a PE sends an A-D route with a PMSI Tunnel attribute, it can
   set a bit in the PMSI Tunnel attribute indicating "Leaf Information
   Required".  A PE that installs such an A-D route MUST respond by
   generating a a Leaf A-D route, indicating that it needs to join (or
   be joined to) the specified PMSI tunnel. Details can be found in
   [MVPN-BGP].

... using BGP as a signalling mechanism that requires this kind of trigger
(receive type 1 information, must act in type-specific manner X and doing so
is timing-sensitive), seems like a significant variance from how BGP has been
used in the past.  As a philosophic point, I'd like to avoid this kind of
behaviour.

6.4.2 PIM Trees

The RP or RPA
   corresponding to the P-group address is not specified.  It must of
   course be known to all the PEs.  It is presupposed that the PEs use
   one of the methods for automatically learning the RP-to-group
   correspondences (e.g., Bootstrap Router Protocol [BSR]), or else that
   the correspondence are configured.

.. a clarification?  Do P or PE routers need to know the RP mappings of
C-trees, i.e., when the customer has configured one of its own routers as
the RP, does SP need to know the IP address?  Having to know this would be a
major configuration hassle.  The recommendation has been to use anycast-rp
and manual config but this doesn't scale if the SP routers would need to
learn this as well..

6.4.5 Ingress replication

   This draft defines a number of optional procedures which require that
   a tunnel egress, upon receiving a tunnel packet, be able to identify
   the tunnel ingress.  Unicast P-tunnel mechanisms which do not provide
   this property (e.g., multi-hop LSPs for which penultimate-hop-popping
   is done) should therefore be used with caution.  In particular, such
   mechanisms MUST NOT be used unless either (a) PIM over MI-PMSI is
   being used for distributing PE-PE C-multicast routing information, or
   (b) the procedures of section 9 are being followed.

.. who is the target audience of this 'caution' and 'MUST NOT'?  An
implementer, operator or both?  If it's the implementer, it is not clear
that this is actionable; if it's the operator, it is not clear that the
text is explicit enough.

7.4.2.2. Packet Formats and Constants
...
   C-Source (32 bits): the IPv4 address of the traffic source in the
   VPN.

   C-Group (32 bits): the IPv4 address of the multicast traffic
   destination address in the VPN.

   P-Group (32 bits): the IPv4 group address that the PE router is going
   to use to encapsulate the flow (C-Source, C-Group).

.. this seems to be the only place in the spec that is explicitly non-IPv6
compatible.  However, some of the other properties used are implicitly not
IPv6 compatible (VPN-MCAST address family? various BGP attributes, ...)
However, e.g. S 12.1.1 would make it seem that at least parts of the spec
have tried to be IPv6-ready.
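
To make the size mismatch concrete, here is a minimal Python sketch.  It
packs only the three 32-bit fields quoted above (not the draft's complete
message format), and the helper name and example addresses are hypothetical.

```python
import ipaddress

FIELD_BYTES = 4  # each quoted field (C-Source, C-Group, P-Group) is 32 bits


def pack_32bit_address_fields(c_source, c_group, p_group):
    """Concatenate the three address fields; reject anything wider than 32 bits."""
    packed = b""
    for text in (c_source, c_group, p_group):
        addr = ipaddress.ip_address(text)
        if len(addr.packed) != FIELD_BYTES:
            raise ValueError(
                f"{text} needs {len(addr.packed) * 8} bits, field has 32"
            )
        packed += addr.packed
    return packed


# IPv4 addresses fit: three fields of 4 bytes each.
ipv4_fields = pack_32bit_address_fields("192.0.2.1", "232.1.1.1", "239.1.1.1")
print(len(ipv4_fields))  # 12

# An IPv6 C-Source is 128 bits and cannot be carried in a 32-bit field.
ipv6_error = None
try:
    pack_32bit_address_fields("2001:db8::1", "ff3e::1", "239.1.1.1")
except ValueError as err:
    ipv6_error = str(err)
print(ipv6_error)
```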

S 8.2.1.1

      a. A Route Distinguisher for the MVPN. For a given MVPN each ASBR
         in the AS must use the same RD when advertising this
         information to other ASBRs. To accomplish this all the ASBRs
         within that AS, that are configured to support the MVPN, MUST
         be configured with the same RD for that MVPN. This RD MUST be
         of Type 0, MUST embed the autonomous system number of the AS.

... type 0 -- is this compatible with 4B AS-numbers?  Even if you allowed
either Type 0 or Type 2, would this result in an interoperable service if
some routers would only support 0?
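
The field widths behind this question can be sketched in Python, following
the Route Distinguisher layouts in RFC 4364 (Type 0: 2-byte AS number plus
4-byte assigned number; Type 2: 4-byte AS number plus 2-byte assigned
number).  The function names and the AS numbers used are illustrative
assumptions only.

```python
import struct


def encode_rd_type0(asn, assigned_number):
    """Type 0 RD: 2-byte type, 2-byte AS number, 4-byte assigned number."""
    if not 0 <= asn <= 0xFFFF:
        raise ValueError(f"AS {asn} does not fit the 2-byte Type 0 field")
    return struct.pack("!HHI", 0, asn, assigned_number)


def encode_rd_type2(asn, assigned_number):
    """Type 2 RD: 2-byte type, 4-byte AS number, 2-byte assigned number."""
    return struct.pack("!HIH", 2, asn, assigned_number)


rd_2byte_as = encode_rd_type0(64512, 100)        # 2-byte AS fits Type 0
rd_4byte_as = encode_rd_type2(4200000000, 100)   # 4-byte AS needs Type 2

# A 4-byte AS number cannot be embedded in a Type 0 RD at all.
try:
    encode_rd_type0(4200000000, 100)
    type0_accepts_4byte_as = True
except ValueError:
    type0_accepts_4byte_as = False

print(rd_2byte_as.hex(), rd_4byte_as.hex(), type0_accepts_4byte_as)
```

Both encodings are 8 bytes long, so the difference is only in how the 6
value bytes are split, which is why a router that supports only Type 0 has
no way to represent the wider AS number.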

8.2.1.2.1. Inter-AS A-D Route received via EBGP

      b) Re-advertises the received inter-AS A-D route to its EBGP
         peers, other than the EBGP neighbor from which the best inter-
         AS A-D route was received.

... subject to BGP policy configuration, right?  (e.g. you don't advertise
to EBGP neighbors you don't provide transit for)

12.1.2. Encapsulation in IP

   IP-in-IP [RFC1853] is also a viable option.

.. 1853 is informational.  RFC2003 is the standards track version of IP-in-IP
spec.  Should that be used here, or is there a reason for referring to an
older version?

12.3.2. For Support of PIM-BIDIR C-Groups


   As will be discussed in section 11, when a packet belongs to a PIM-
   BIDIR multicast group, the set of PEs of that packet's VPN can be
   partitioned into a number of subsets, where exactly one PE in each
   partition is the upstream PE for that partition.  When such packets
   are transmitted on a PMSI, then unless the procedures of section
   12.2.3 are being used, ...

.. there is no section 12.2.3, what are you referring to?

12.4.1. MTU (Maximum Transmission Unit)


   It is the responsibility of the originator of a C-packet to ensure
   that the packet is small enough to reach all of its destinations,
   even when it is encapsulated within IP or GRE.

.. I think you're referring to the host at a customer site, right?
Assigning "responsibility" this way is understandable, but it is in conflict
with the IP service model, where the originator has no such responsibility.
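
As a rough illustration of what the originator would have to account for,
the sketch below computes the largest C-packet that still fits after
encapsulation.  It assumes an IPv4 outer header without options, a GRE
header without optional fields, and a hypothetical P-tunnel path MTU of
1500 bytes; none of these values come from the draft.

```python
IPV4_HEADER_BYTES = 20      # outer IPv4 header, no options
GRE_BASE_HEADER_BYTES = 4   # GRE header without key, sequence or checksum
ASSUMED_PATH_MTU = 1500     # hypothetical path MTU inside the SP network


def max_c_packet_size(path_mtu, use_gre):
    """Largest C-packet that fits in path_mtu after IP or GRE encapsulation."""
    overhead = IPV4_HEADER_BYTES
    if use_gre:
        overhead += GRE_BASE_HEADER_BYTES
    return path_mtu - overhead


max_with_gre = max_c_packet_size(ASSUMED_PATH_MTU, use_gre=True)
max_with_ip_in_ip = max_c_packet_size(ASSUMED_PATH_MTU, use_gre=False)

print(max_with_gre)       # 1476
print(max_with_ip_in_ip)  # 1480
```

A host at a customer site has no visibility into which encapsulation, if
any, the SP applies, which is why assigning it this responsibility is
awkward.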

S 13:

   An ASBR may receive, from one SP's domain, an mLDP, PIM, or RSVP-TE
   control message that attempts to extend a multicast distribution tree
   from one SP's domain into another SP's domain.  The ASBR should not
   allow this unless explicitly configured to do so.

... what about BGP signalling mechanisms?  Which attributes should/could be
filtered and how?

   In particular, an implementation SHOULD provide mechanisms that allow
   a SP to place limitations on the following:

... this list should include the source active A-D routes and similar
signalling which is triggered by a host sending into a random group.


editorial
---------

In S 2.2.1, the term "C-tree" is introduced without a prior explanation. (An
explanation of a kind appears to be in S 3.1)

In S 2.2.1,
   Shared C-trees may be unidirectional or
   bidirectional; in the latter case the multicast routing protocol is
   presumed to be the BIDIR-PIM [BIDIR-PIM] "variant" of PIM-SM.

.. what if this assumption is not valid?  Maybe this presumption should be
removed if it isn't relevant in any case?

   In order to support the "Carrier's Carrier" model of [RFC4364], mLDP
   (Label Distribution Protocol Extensions for Multipoint Label Switched
   Paths) [MLDP] may also be supported on the PE-CE interface.  The use
   of mLDP on the PE-CE interface is described in [MVPN-BGP].  The use
   of BGP on the PE-CE interface is not within the scope of this
   document.

... the context is unclear why BGP on PE-CE is even mentioned here, given
that the first sentences only mention mLDP.  Maybe this ambiguity is
stemming from the use of "may also be supported" in the third line.  If CoC
is used, are there alternatives to mLDP?  (The same text can be found in
S 3.1 and the same comment applies)

S 2.2.2:

An extreme case is when the Sender Sites
   set is the same as the Receiver Sites set, in which case all sites
   could originate and receive multicast traffic from each other.

... actually the latter does not follow.  A sender set could be a subset of
sites.  Two sides of an 'extreme' case would be that only one site is both
the sender and receiver (I wonder how this would show up for the MVPN SP?),
or that all sites are senders and receivers.

S 2.2.6

   Autonomous System (AS).  These are know as Inter-AS VPNs. In an

s/know/known/

S 6.3.2
   the maximum amount of unwanted MVPNs hat a particular PE may receive
   traffic for.

s/hat/that/

S 9.1.1

   If an I-PMSI is used for carrying the packets, the I-PMSI spans
   multiple ASes, and the I-PMSI is realized via segmented inter-AS
   P-tunnels, if C-S or C-RP is multi-homed to different PEs, as long as
   each such PE is in a different AS, the egress PE can detect duplicate
   traffic as such duplicate traffic will arrive on a different (inter-
   AS) P-tunnel.

.. I had difficulties following this pretty long sentence, break it up?

S 11.2.1

   (As always we use the "C-" prefix to indicate that we are referring
   to an address in the VPN's address space rather than in the
   provider's address space.)

.. this comment appears to be redundant, as it was already noted earlier in
the document, and these acronyms have been used hundreds of times by now..

12.2.2. Encapsulation in IP


   Rather than the IP-in-IP encapsulation discussed in section 12.1.2,
   we use the MPLS-in-IP encapsulation.  This is specified in [MPLS-IP].
   The IP protocol number MUST be set to the value identifying the
   payload as an MPLS unicast packet. (There is no "MPLS multicast
   packet" protocol number.)

.. the IANA registry only seems to have "MPLS-in-IP" (value 137).  Is this
the one?  Given that 'unicast' is not mentioned in the registry, should you
just say "MPLS-in-IP" and/or the decimal value here?

12.3.2. For Support of PIM-BIDIR C-Groups

   As will be discussed in section 11, when a packet belongs to a PIM-

... s/will be/was/ ?

S 13

   A PE MUST NOT accept BGP routes of the MCAST_VPN address family from
   a CE.

.. s/_/-/

Rosen & Raggarwa                                               [Page 92]

.. s/Raggarwa/Aggarwal/