    17 January 10

    Inter-AS mVPNs: MDT SAFI, BGP Connector & RPF Proxy Vector

    Posted by Petr Lapukhov

    Abstract:

    The Inter-AS Multicast VPN solution introduces some challenges when the peering autonomous systems implement a BGP-free core. This post illustrates a known solution to this problem, implemented in Cisco IOS software. The solution involves the use of special MP-BGP and PIM extensions. The reader is assumed to have an understanding of Cisco's basic mVPN implementation, the PIM protocol and the Multi-Protocol BGP extensions.

    Abbreviations used

    mVPN – Multicast VPN
    MSDP – Multicast Source Discovery Protocol
    PE – Provider Edge
    CE – Customer Edge
    RPF – Reverse Path Forwarding
    MP-BGP – Multi-Protocol BGP
    PIM – Protocol Independent Multicast
    PIM SM – PIM Sparse Mode
    PIM SSM – PIM Source Specific Multicast
    LDP – Label Distribution Protocol
    MDT – Multicast Distribution Tree
    P-PIM – Provider Facing PIM Instance
    C-PIM – Customer Facing PIM Instance
    NLRI – Network Layer Reachability Information

    Inter-AS mVPN Overview

    A typical "classic" Inter-AS mVPN solution leverages the following key components:

    • PIM-SM maintaining separate RP for every AS
    • MSDP used to exchange information on active multicast sources
    • (Optionally) MP-BGP multicast extension to propagate information about multicast prefixes for RPF validations

    With this solution, different PEs participating in the same MDT discover each other by joining the shared tree towards the local RP and listening to the multicast packets sent by other PEs. Those PEs may belong to the same or to different Autonomous Systems. In the latter case, the sources are discovered by virtue of MSDP peering. This scenario assumes that every router in the local AS has complete routing information about the multicast sources (the PEs' loopback addresses) residing in the other system. Such information is necessary for the purpose of the RPF check. In turn, this leads to the requirement of running BGP on all P routers OR redistributing the MP-BGP (BGP) multicast prefix information into the IGP. The redistribution approach clearly has limited scalability, while the other method requires enabling BGP on the P routers, which defeats the idea of a BGP-free core.

    An alternative to running PIM in Sparse Mode is PIM SSM, which relies on out-of-band information for multicast source discovery. For this case, Cisco released a draft proposal defining a new MP-BGP MDT SAFI that is used to propagate MDT information and the associated PE addresses. Let's make a short detour into MP-BGP to get a better understanding of SAFIs.

    MP-BGP Overview

    Recall the classic BGP UPDATE message format. It consists of the following sections: [Withdrawn Prefixes (Optional)] + [Path Attributes] + [NLRIs]. The Withdrawn Prefixes and NLRIs are IPv4 prefixes, and their structure does not support any other network protocols. The Path Attributes (e.g. AS_PATH, ORIGIN, LOCAL_PREF, NEXT_HOP) are associated with all NLRIs in the message; prefixes with a different set of path attributes must be carried in a separate UPDATE message. Also notice that NEXT_HOP is an IPv4 address as well.

    In order to introduce support for non-IPv4 network protocols into BGP, two new optional non-transitive Path Attributes have been added. The first attribute, known as MP_REACH_NLRI, has the following structure: [AFI/SAFI] + [NEXT_HOP] + [NLRI]. Both NEXT_HOP and NLRI are formatted according to the protocol encoded via the AFI/SAFI, which stand for Address Family Identifier and Subsequent Address Family Identifier respectively. For example, this could be an IPv6 or CLNS prefix. Thus, all information about non-IPv4 prefixes is encoded in a new BGP Path Attribute. A typical BGP UPDATE message that carries MP_REACH_NLRI attributes has no "classic" NEXT_HOP attribute and none of the "Withdrawn Prefixes" or "NLRIs" found in normal UPDATE messages. For the next-hop calculation, a receiving BGP speaker should use the information found in the MP_REACH_NLRI attribute. The multi-protocol UPDATE message may still contain other BGP path attributes such as AS_PATH, ORIGIN, MED, LOCAL_PREF and so on; this time, however, those attributes are associated with the non-IPv4 prefixes found in the attached MP_REACH_NLRI attributes.
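    To make the attribute layout concrete, here is a minimal, purely illustrative Python sketch of how an MP_REACH_NLRI attribute could be modeled. The field names follow RFC 4760, but the class and its example values are ours and do not correspond to any real BGP implementation:

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class MPReachNLRI:
        """Schematic model of the MP_REACH_NLRI path attribute (RFC 4760)."""
        afi: int            # Address Family Identifier, e.g. 1 = IPv4, 2 = IPv6
        safi: int           # Subsequent AFI, e.g. 1 = unicast, 2 = multicast, 128 = labeled VPN
        next_hop: bytes     # next hop encoded per AFI/SAFI (replaces the classic NEXT_HOP)
        nlri: List[bytes]   # prefixes encoded per AFI/SAFI

    # Any other path attributes (AS_PATH, LOCAL_PREF, ...) carried in the same
    # UPDATE message apply to every prefix inside the attached MP_REACH_NLRI.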

    The second attribute, MP_UNREACH_NLRI, has a format similar to MP_REACH_NLRI but lists the "multi-protocol" addresses to be withdrawn. No other path attributes need to accompany it; an UPDATE message may simply contain a list of MP_UNREACH_NLRI attributes.

    The list of supported AFIs may be found in RFC1700 (though obsolete now, it is still very informative) – for example, AFI 1 stands for IPv4 and AFI 2 for IPv6. The Subsequent AFI clarifies the purpose of the information found in MP_REACH_NLRI. For example, a SAFI value of 1 means the prefixes should be used for unicast forwarding, SAFI 2 means the prefixes are to be used for multicast RPF checks, and SAFI 3 means the prefixes could be used for both purposes. Last but not least, a SAFI of 128 identifies MPLS-labeled VPN addresses.

    Just as a reminder, the BGP process performs the best-path selection separately for the "classic" IPv4 prefixes and for the prefixes of every AFI/SAFI pair, based on the path attributes. This allows independent route propagation for the addresses found in different address families. Since a given BGP speaker may not support particular network protocols, the list of supported AFI/SAFI pairs is advertised using the BGP capabilities feature (another BGP extension), and the information for a particular network protocol is only propagated if both speakers support it.
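    Since best-path selection runs independently per address family, the same destination may legitimately resolve through different exit points in different families – which is exactly what the case study later relies on (unicast via one ASBR, multicast via another). The toy Python sketch below, with a deliberately simplified tie-break and invented values, only illustrates this idea of independent per-(AFI, SAFI) tables:

    from collections import defaultdict

    # One table per (AFI, SAFI) pair; best-path selection runs independently in each.
    bgp_tables = defaultdict(dict)   # (afi, safi) -> {prefix: best path}

    def install(afi, safi, prefix, path):
        table = bgp_tables[(afi, safi)]
        best = table.get(prefix)
        # Toy tie-break: higher LOCAL_PREF wins (the real decision process has many more steps).
        if best is None or path["local_pref"] > best["local_pref"]:
            table[prefix] = path

    # The same PE loopback may resolve differently for unicast forwarding
    # (AFI 1 / SAFI 1) and for multicast RPF checks (AFI 1 / SAFI 2).
    install(1, 1, "20.0.5.5/32", {"next_hop": "10.0.3.3", "local_pref": 100})
    install(1, 2, "20.0.5.5/32", {"next_hop": "10.0.6.6", "local_pref": 100})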

    MDT SAFI Overview

    Cisco drafted a new SAFI to be used with the regular AFIs such as IPv4 or IPv6. This SAFI is needed to propagate the MDT group address and the associated PE's Loopback address. The format for AFI 1 (IPv4) is as follows:

    MP_NLRI = [RD:PE’s IPv4 Address]:[MDT Group Address],
    MP_NEXT_HOP = [BGP Peer IPv4 Address].

    Here the RD is the one corresponding to the VRF that has the MDT configured, and the "IPv4 Address" is the respective PE router's Loopback address. Normally, per Cisco's implementation, this is the same Loopback interface used for VPNv4 peering, but this could be changed using the VRF-level command bgp next-hop.
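    As an illustration only, the MDT SAFI entry originated by R5 in the case study further below could be modeled along the following lines. The SAFI code point of 66 is the value assigned to the MDT SAFI; the Python class itself is just a sketch, not a real BGP data structure:

    from dataclasses import dataclass
    from ipaddress import IPv4Address

    MDT_SAFI = 66  # SAFI code point assigned to the MDT SAFI

    @dataclass
    class MdtNlri:
        """Illustrative model of one IPv4 MDT SAFI NLRI entry."""
        rd: str                  # RD of the mVPN VRF, e.g. "200:1"
        pe_address: IPv4Address  # originating PE's Loopback (MDT tunnel source)
        mdt_group: IPv4Address   # Default MDT group configured under the VRF

    # Roughly what R5 originates in the case study below:
    r5_entry = MdtNlri(rd="200:1",
                       pe_address=IPv4Address("20.0.5.5"),
                       mdt_group=IPv4Address("232.1.1.1"))
    # The MP_NEXT_HOP travels separately and is rewritten by each ASBR (Option B);
    # the receiving routers later use it as the RPF proxy vector.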

    If all PEs in the local AS exchange this information and pass it down to PIM SSM, the P-PIM (Provider PIM, facing the SP core) process will be able to build (S,G) trees for the MDT group address towards the other PEs' IPv4 addresses. This works because the PEs' IPv4 addresses are known via the IGP, as all PEs are in the same AS; there is no problem using a BGP-free core for intra-AS mVPN with PIM SSM. It is also worth mentioning that a precursor to the MDT SAFI was a special extended community used along with the VPNv4 address family. MP-BGP would use an RD with a type value of 2 (not applicable to any unicast VRF) to transport the associated PE's IPv4 address along with an extended community containing the MDT group address. This allowed the "bootstrap" information to propagate inside a single AS only, since the extended community was non-transitive. This temporary solution was replaced by the MDT SAFI draft.

    Next, consider the case of Inter-AS VPN where at least one AS uses a BGP-free core. When two peering Autonomous Systems activate the IPv4 MDT SAFI, the ASBRs advertise all the information learned from their PEs to each other. The information then propagates down to each AS's PEs. Next, the P-PIM processes will attempt to build (S,G) trees towards the PE IP addresses in the neighboring systems. Even though the PEs may know the other PEs' addresses (e.g. if Inter-AS VPN Option C is being used), the P routers don't have this information. If Inter-AS VPN Option B is in use, even the PE routers lack the information needed to build the (S,G) trees.

    RPF Proxy Vector

    The solution to this problem uses a modification to the PIM protocol and the RPF check functionality. Known as the RPF Proxy Vector, it defines a new PIM TLV that contains the IPv4 address of a "proxy" router used for RPF checks and as an intermediate destination for PIM Joins. Let's see how it works in a particular scenario.

    On the diagram below you can see AS 100 and AS 200 using Inter-AS VPN Option B to exchange VPNv4 routes. The PEs and ASBRs peer via BGP and exchange VPNv4 and IPv4 MDT SAFI prefixes. For every external prefix relayed to its own PEs, the ASBR changes the next-hop found in MP_REACH_NLRI to its local Loopback address. For VPNv4 prefixes, this achieves the goal of terminating the LSP on the ASBR. For MDT SAFI prefixes, this procedure sets the IPv4 address to be used as the "proxy" in PIM Joins.

    [Figure: mvpn-inter-as-basic]

    Let's say that the MDT group used by R1 and R5 is 232.1.1.1. When R1 receives the MDT SAFI update with the MDT value of 200:1:232.1.1.1, the PE IPv4 address 20.0.5.5 and the next-hop value of 10.0.3.3 (R3's Loopback0 interface), it passes this information down to the PIM process. The PIM process constructs a PIM Join for group 232.1.1.1 towards the IP address 20.0.5.5 (not known in AS 100) and inserts a proxy vector value of 10.0.3.3. The PIM process then uses the route to 10.0.3.3 to find the next upstream PIM peer to send the Join message to. Every P router processes the PIM Join message with the proxy vector and uses the proxy IPv4 address to relay the message upstream. As soon as the message reaches the proxy router (in our case R3), the proxy vector is removed and the PIM Join propagates further using the regular procedure, as the domain behind the proxy is supposed to have visibility of the actual Join target.

    In addition to using the proxy vector to relay the PIM Join upstream, every router creates a special mroute state for the (S,G) pair where S is the PE IPv4 address and G is the MDT group. This mroute state has the proxy IPv4 address associated with it. When a matching multicast packet going from the external PE towards the MDT group address hits the router, the RPF check is performed based on the upstream interface associated with the proxy IPv4 address, not the actual source IPv4 address found in the packet. For example, in our scenario, R2 would have an mroute state for (20.0.5.5, 232.1.1.1) with the proxy IPv4 address of 10.0.3.3. All packets coming from R5 to 232.1.1.1 will be RPF checked based on the upstream interface towards 10.0.3.3.
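    The modified RPF check can be sketched in a few lines of Python, using the values from the paragraph above. The data structures and the interface name are invented purely for illustration and do not reflect the actual IOS implementation:

    from ipaddress import IPv4Address

    def rpf_check(packet_src, arriving_if, mroute_proxy, unicast_rib):
        """RPF check with a proxy vector: when the (S,G) state carries a proxy
        address, the lookup is done on the proxy, not on the (unknown) source S."""
        target = mroute_proxy if mroute_proxy is not None else packet_src
        return unicast_rib.get(target) == arriving_if

    # R2 in the scenario above: the source 20.0.5.5 is absent from the RIB,
    # but the proxy 10.0.3.3 (R3) is known via the IGP.
    rib = {IPv4Address("10.0.3.3"): "toward_R3"}   # invented interface name
    print(rpf_check(IPv4Address("20.0.5.5"), "toward_R3",
                    IPv4Address("10.0.3.3"), rib))   # True: the packet is accepted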

    Using the above-described "proxy-based" procedure, the P routers can successfully perform RPF checks for packets whose source IPv4 addresses are not found in the local RIB. The tradeoff is the amount of multicast state information that has to be stored in the P routers' memory – it is proportional to the number of PEs multiplied by the number of mVPNs in the worst-case scenario, where every PE participates in every mVPN. There could be even more multicast route states in situations where Data MDTs are used in addition to the Default MDT.

    BGP Connector Attribute

    Another piece of information is needed for "intra-VPN" operations: joining a PIM tree towards a particular IP address inside a VPN and performing an RPF check inside the VPN. Consider the use of Inter-AS VPN Option B, where VPNv4 prefixes have their MP_REACH_NLRI next-hop changed to the local ASBR's IPv4 address. When a local PE receives a multicast packet on the MDT tunnel interface, it decapsulates the packet and performs a source IPv4 address lookup inside the VRF's table. Based on the MP-BGP-learned routes, the next-hop would point towards the ASBR (Option B), while the packets might be arriving across a different inter-AS link running a multicast MP-BGP peering. Thus, relying solely on the unicast next-hop may not be sufficient for Inter-AS RPF checks.

    For example, look at the figure below, where R3 and R4 run MP-BGP for VPNv4 while R4 and R6 run the multicast MP-BGP extension. R1 peers with both ASBRs and learns VPNv4 prefixes from R3, while it learns the MDT SAFI information and the PEs' IPv4 addresses from R6. PIM is enabled only on the link connecting R6 and R4.

    [Figure: mvpn-inter-as-diverse-paths]

    In this situation, the RPF lookup would fail, as the MDT SAFI information is exchanged across the link running multicast BGP, while the VPNv4 prefixes' next-hop points to R3. Thus, a method is required to preserve the information needed for the RPF lookup.

    Cisco suggested the use of a new optional transitive attribute named BGP Connector, to be exported along with the VPNv4 prefixes out of the PE hosting an mVPN. This attribute contains two components, [AFI/SAFI] + [Connector Information], and in general carries the information needed by the network protocol identified by the AFI/SAFI pair to connect to the routing information found in MP_REACH_NLRI. If AFI=IPv4 and SAFI=MDT, the connector attribute contains the IPv4 address of the router originating the prefixes associated with the VRF that has the MDT configured.

    The customer-facing PIM process (C-PIM) in the PE routers uses the information found in the BGP Connector attribute to perform the intra-VPN RPF check as well as to find the next-hop to send PIM Joins to. Notice that the C-PIM Joins do not need to have the RPF proxy vector piggy-backed in the PIM messages, as they are transported inside the MDT tunnel towards the remote PEs.
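    The resulting RPF logic on the receiving PE can be sketched as follows; the class and function names are ours and are not IOS internals. In effect, the connector, when present, takes precedence over the rewritten unicast next-hop when resolving the source of a customer packet:

    from dataclasses import dataclass
    from ipaddress import IPv4Address
    from typing import Optional

    @dataclass
    class VpnV4Route:
        prefix: str
        bgp_next_hop: IPv4Address            # rewritten at each ASBR with Option B
        connector_pe: Optional[IPv4Address]  # BGP Connector: originating PE, preserved end-to-end

    def c_pim_rpf_target(route):
        """C-PIM uses the connector (when present) as the RPF/Join target;
        otherwise it falls back to the unicast BGP next hop."""
        return route.connector_pe or route.bgp_next_hop

    # From the case study below: 192.168.5.0/24 arrives at R1 with next hop
    # 10.0.3.3 (R3, the ASBR) but connector 20.0.5.5 (R5, the originating PE),
    # so the RPF interface resolves to the MDT tunnel towards R5.
    r = VpnV4Route("192.168.5.0/24", IPv4Address("10.0.3.3"), IPv4Address("20.0.5.5"))
    print(c_pim_rpf_target(r))   # 20.0.5.5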

    You may notice that the use of the BGP Connector attribute eliminates the need for a special "VPNv4 multicast" address family that could otherwise be used to transport RPF check information for VPNv4 prefixes. The VPNv4 multicast address family is not really needed, as the multicast packets are tunneled through the SP cores using the MDT tunnel, and the BGP Connector is sufficient for RPF checks at the provider edge. However, multicast BGP is still needed in situations where diverse unicast and multicast transport paths are used between the Autonomous Systems.

    Case Study

    Let's put all the concepts to the test in a sample scenario. Here, two autonomous systems, AS 100 and AS 200, peer using Inter-AS VPN Option B to exchange VPNv4 prefixes across the R3-R4 link. At the same time, multicast prefixes are exchanged across the R6-R4 peering link along with the MDT SAFI information. AS 100 implements a BGP-free core, so R2 does not peer via BGP with any other router and only runs OSPF for IGP prefix exchange. R1 peers via MP-BGP with both ASBRs, R3 and R6, for the purpose of VPNv4, multicast prefix and MDT SAFI exchange.

    [Figure: mvpn-case-study]

    On the diagram, the links highlighted in orange are enabled for PIM SM and may carry the multicast traffic. Notice that MPLS traffic and multicast traffic take different paths between the systems. The main highlights of R1's configuration are:

    • VRF RED is configured with an MDT of 232.1.1.1. PIM SSM is configured for the P-PIM instance using the default group range of 232/8.
    • The command ip multicast vrf RED rpf proxy rd vector ensures the use of the RPF proxy vector for the MDT that P-PIM builds for VRF RED. Notice that the command ip multicast rpf proxy vector applies to the Joins received in the global routing table and is typically seen on the P routers.
    • R1 exchanges VPNv4 prefixes with R3 and multicast prefixes with R6 via BGP. At the same time, R1 exchanges IPv4 MDT SAFI information with R6 to learn the MDT information from AS 200.
    • Connected routes from VRF RED are redistributed into MP-BGP.
    R1:
    hostname R1
    !
    interface Serial 2/0
    encapsulation frame-relay
    no shutdown
    !
    ip multicast-routing
    ip pim ssm default
    !
    interface Serial 2/0.12 point-to-point
    ip address 10.0.12.1 255.255.255.0
    frame-relay interface-dlci 102
    mpls ip
    ip pim sparse-mode
    !
    interface Loopback0
    ip pim sparse-mode
    ip address 10.0.1.1 255.255.255.255
    !
    router ospf 1
    network 10.0.12.1 0.0.0.0 area 0
    network 10.0.1.1 0.0.0.0 area 0
    !
    router bgp 100
    neighbor 10.0.3.3 remote-as 100
    neighbor 10.0.3.3 update-source Loopback 0
    neighbor 10.0.6.6 remote-as 100
    neighbor 10.0.6.6 update-source Loopback 0
    address-family ipv4 unicast
    no neighbor 10.0.3.3 activate
    no neighbor 10.0.6.6 activate
    address-family vpnv4 unicast
    neighbor 10.0.3.3 activate
    neighbor 10.0.3.3 send-community both
    address-family ipv4 mdt
    neighbor 10.0.6.6 activate
    address-family ipv4 multicast
    neighbor 10.0.6.6 activate
    network 10.0.1.1 mask 255.255.255.255
    address-family ipv4 vrf RED
    redistribute connected
    !
    no ip domain-lookup
    !
    ip multicast vrf RED rpf proxy rd vector
    !
    ip vrf RED
    rd 100:1
    route-target both 200:1
    route-target both 100:1
    mdt default 232.1.1.1
    !
    ip multicast-routing vrf RED
    !
    interface FastEthernet 0/0
    ip vrf forwarding RED
    ip address 192.168.1.1 255.255.255.0
    ip pim dense-mode
    no shutdown

    Notice that in the configuration above, the Loopback0 interface is used as the source for the MDT tunnel and therefore has to have PIM (multicast routing) enabled on it. Next in turn, R2's configuration is straightforward – OSPF is used as the IGP, with adjacencies to R1, R3 and R6, and labels are exchanged via LDP. Notice that R2 does NOT run PIM on the uplink to R3, and does NOT run LDP with R6. Effectively, the path via R6 is used only for multicast traffic, while the path across R3 is used only for MPLS LSPs. R2 is configured for PIM SSM and RPF proxy vector support in the global routing table.

    R2:
    hostname R2
    !
    no ip domain-lookup
    !
    interface Serial 2/0
    encapsulation frame-relay
    no shut
    !
    ip multicast-routing
    !
    interface Serial 2/0.12 point-to-point
    ip address 10.0.12.2 255.255.255.0
    frame-relay interface-dlci 201
    mpls ip
    ip pim sparse-mode
    !
    ip pim ssm default
    !
    interface Serial 2/0.23 point-to-point
    ip address 10.0.23.2 255.255.255.0
    frame-relay interface-dlci 203
    mpls ip
    !
    interface Serial 2/0.26 point-to-point
    ip address 10.0.26.2 255.255.255.0
    frame-relay interface-dlci 206
    ip pim sparse-mode
    !
    interface Loopback0
    ip address 10.0.2.2 255.255.255.255
    !
    ip multicast rpf proxy vector
    !
    router ospf 1
    network 10.0.12.2 0.0.0.0 area 0
    network 10.0.2.2 0.0.0.0 area 0
    network 10.0.23.2 0.0.0.0 area 0
    network 10.0.26.2 0.0.0.0 area 0

    R3 is the ASBR used for implementing Inter-AS VPN Option B. It peers via BGP with R1 (the PE) and R4 (the other ASBR). Only the VPNv4 address family is enabled with the BGP peers. Notice that the next-hop for VPNv4 prefixes is changed to self in order to terminate the transport LSP from the PE on R3. No multicast or MDT SAFI information is exchanged across R3.

    R3:
    hostname R3
    !
    no ip domain-lookup
    !
    interface Serial 2/0
    encapsulation frame-relay
    no shut
    !
    interface Serial 2/0.23 point-to-point
    ip address 10.0.23.3 255.255.255.0
    frame-relay interface-dlci 302
    mpls ip
    !
    interface Serial 2/0.34 point-to-point
    ip address 172.16.34.3 255.255.255.0
    frame-relay interface-dlci 304
    mpls ip
    !
    interface Loopback0
    ip address 10.0.3.3 255.255.255.255
    !
    router ospf 1
    network 10.0.23.3 0.0.0.0 area 0
    network 10.0.3.3 0.0.0.0 area 0
    !
    router bgp 100
    no bgp default route-target filter
    neighbor 10.0.1.1 remote-as 100
    neighbor 10.0.1.1 update-source Loopback 0
    neighbor 172.16.34.4 remote-as 200
    address-family ipv4 unicast
    no neighbor 10.0.1.1 activate
    no neighbor 172.16.34.4 activate
    address-family vpnv4 unicast
    neighbor 10.0.1.1 activate
    neighbor 10.0.1.1 next-hop-self
    neighbor 10.0.1.1 send-community both
    neighbor 172.16.34.4 activate
    neighbor 172.16.34.4 send-community both

    The second ASBR in AS 100, R6, could be characterized as a multicast-only ASBR. In fact, this ASBR is only used to exchange prefixes in the multicast and MDT SAFI address families with R1 and R4. MPLS is not enabled on this router, and its sole purpose is multicast forwarding between AS 100 and AS 200. There is no need to run MSDP, as PIM SSM is used for multicast tree construction.

    R6:
    hostname R6
    !
    no ip domain-lookup
    !
    interface Serial 2/0
    encapsulation frame-relay
    no shut
    !
    ip multicast-routing
    ip pim ssm default
    ip multicast rpf proxy vector
    !
    interface Serial 2/0.26 point-to-point
    ip address 10.0.26.6 255.255.255.0
    frame-relay interface-dlci 602
    ip pim sparse-mode
    !
    interface Serial 2/0.46 point-to-point
    ip address 172.16.46.6 255.255.255.0
    frame-relay interface-dlci 604
    ip pim sparse-mode
    !
    interface Loopback0
    ip pim sparse-mode
    ip address 10.0.6.6 255.255.255.255
    !
    router ospf 1
    network 10.0.6.6 0.0.0.0 area 0
    network 10.0.26.6 0.0.0.0 area 0
    !
    router bgp 100
    neighbor 10.0.1.1 remote-as 100
    neighbor 10.0.1.1 update-source Loopback 0
    neighbor 172.16.46.4 remote-as 200
    address-family ipv4 unicast
    no neighbor 10.0.1.1 activate
    no neighbor 172.16.46.4 activate
    address-family ipv4 mdt
    neighbor 172.16.46.4 activate
    neighbor 10.0.1.1 activate
    neighbor 10.0.1.1 next-hop-self
    address-family ipv4 multicast
    neighbor 172.16.46.4 activate
    neighbor 10.0.1.1 activate
    neighbor 10.0.1.1 next-hop-self

    Pay attention to the following. Firstly, R6 is set up for PIM SSM and RPF proxy vector support. Secondly, R6 sets itself as the BGP next hop in the updates sent under the multicast and MDT SAFI address families. This is needed for proper MDT tree construction and correct RPF vector insertion. The next router, R4, is the combined VPNv4 and multicast ASBR for AS 200. It performs the same functions that R3 and R6 perform separately in AS 100. The VPNv4, MDT SAFI, and multicast address families are enabled under the BGP process on this router. At the same time, the router supports the RPF Proxy Vector and PIM SSM for proper multicast forwarding. This router is the most configuration-intensive of all routers in both Autonomous Systems, as it also has to support MPLS label propagation via BGP and LDP. Of course, as a classic Option B ASBR, R4 has to change the BGP next-hop to itself for all address-family updates sent to R5 – the PE in AS 200.

    R4:
    hostname R4
    !
    no ip domain-lookup
    !
    interface Serial 2/0
    encapsulation frame-relay
    no shut
    !
    ip pim ssm default
    ip multicast rpf proxy vector
    ip multicast-routing
    !
    interface Serial 2/0.34 point-to-point
    ip address 172.16.34.4 255.255.255.0
    frame-relay interface-dlci 403
    mpls ip
    !
    interface Serial 2/0.45 point-to-point
    ip address 20.0.45.4 255.255.255.0
    frame-relay interface-dlci 405
    mpls ip
    ip pim sparse-mode
    !
    interface Serial 2/0.46 point-to-point
    ip address 172.16.46.4 255.255.255.0
    frame-relay interface-dlci 406
    ip pim sparse-mode
    !
    interface Loopback0
    ip address 20.0.4.4 255.255.255.255
    !
    router ospf 1
    network 20.0.4.4 0.0.0.0 area 0
    network 20.0.45.4 0.0.0.0 area 0
    !
    router bgp 200
    no bgp default route-target filter
    neighbor 172.16.34.3 remote-as 100
    neighbor 172.16.46.6 remote-as 100
    neighbor 20.0.5.5 remote-as 200
    neighbor 20.0.5.5 update-source Loopback0
    address-family ipv4 unicast
    no neighbor 172.16.34.3 activate
    no neighbor 20.0.5.5 activate
    no neighbor 172.16.46.6 activate
    address-family vpnv4 unicast
    neighbor 172.16.34.3 activate
    neighbor 172.16.34.3 send-community both
    neighbor 20.0.5.5 activate
    neighbor 20.0.5.5 send-community both
    neighbor 20.0.5.5 next-hop-self
    address-family ipv4 mdt
    neighbor 172.16.46.6 activate
    neighbor 20.0.5.5 activate
    neighbor 20.0.5.5 next-hop-self
    address-family ipv4 multicast
    neighbor 20.0.5.5 activate
    neighbor 20.0.5.5 next-hop-self
    neighbor 172.16.46.6 activate

    The last router on the diagram is R5. It is a PE in AS 200 configured symmetrically to R1. It has to support the VPNv4, MDT SAFI and multicast address families to learn all the necessary information from the ASBR. Of course, the PIM RPF proxy vector is enabled for VRF RED's MDT, and PIM SSM is configured for the default group range in the global routing table. As you may have noticed, there is no router in AS 200 that emulates a BGP-free core.

    R5:
    hostname R5
    !
    no ip domain-lookup
    !
    interface Serial 2/0
    encapsulation frame-relay
    no shut
    !
    ip multicast-routing
    !
    interface Serial 2/0.45 point-to-point
    ip address 20.0.45.5 255.255.255.0
    frame-relay interface-dlci 504
    mpls ip
    ip pim sparse-mode
    !
    interface Loopback0
    ip pim sparse-mode
    ip address 20.0.5.5 255.255.255.255
    !
    router ospf 1
    network 20.0.5.5 0.0.0.0 area 0
    network 20.0.45.5 0.0.0.0 area 0
    !
    ip vrf RED
    rd 200:1
    route-target both 200:1
    route-target both 100:1
    mdt default 232.1.1.1
    !
    router bgp 200
    neighbor 20.0.4.4 remote-as 200
    neighbor 20.0.4.4 update-source Loopback0
    address-family ipv4 unicast
    no neighbor 20.0.4.4 activate
    address-family vpnv4 unicast
    neighbor 20.0.4.4 activate
    neighbor 20.0.4.4 send-community both
    address-family ipv4 mdt
    neighbor 20.0.4.4 activate
    address-family ipv4 multicast
    neighbor 20.0.4.4 activate
    network 20.0.5.5 mask 255.255.255.255
    address-family ipv4 vrf RED
    redistribute connected
    !
    ip multicast vrf RED rpf proxy rd vector
    ip pim ssm default
    !
    ip multicast-routing vrf RED
    !
    interface FastEthernet 0/0
    ip vrf forwarding RED
    ip address 192.168.5.1 255.255.255.0
    ip pim dense-mode
    no shutdown

    Validating Unicast Paths

    This is the simplest part. Use the show commands to see if the VPNv4 prefixes have propagated between the PEs and test end-to-end connectivity:

    R1#sh ip route vrf RED
    

    Routing Table: RED
    Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
    D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
    N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
    E1 - OSPF external type 1, E2 - OSPF external type 2
    i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
    ia - IS-IS inter area, * - candidate default, U - per-user static route
    o - ODR, P - periodic downloaded static route

    Gateway of last resort is not set

    B 192.168.5.0/24 [200/0] via 10.0.3.3, 00:50:35
    C 192.168.1.0/24 is directly connected, FastEthernet0/0

    R1#show bgp vpnv4 unicast vrf RED
    BGP table version is 5, local router ID is 10.0.1.1
    Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
    r RIB-failure, S Stale
    Origin codes: i - IGP, e - EGP, ? - incomplete

    Network Next Hop Metric LocPrf Weight Path
    Route Distinguisher: 100:1 (default for vrf RED)
    *> 192.168.1.0 0.0.0.0 0 32768 ?
    *>i192.168.5.0 10.0.3.3 0 100 0 200 ?

    R1#show bgp vpnv4 unicast vrf RED 192.168.5.0
    BGP routing table entry for 100:1:192.168.5.0/24, version 5
    Paths: (1 available, best #1, table RED)
    Not advertised to any peer
    200, imported path from 200:1:192.168.5.0/24
    10.0.3.3 (metric 129) from 10.0.3.3 (10.0.3.3)
    Origin incomplete, metric 0, localpref 100, valid, internal, best
    Extended Community: RT:100:1 RT:200:1
    Connector Attribute: count=1
    type 1 len 12 value 200:1:20.0.5.5
    mpls labels in/out nolabel/23

    R1#traceroute vrf RED 192.168.5.1

    Type escape sequence to abort.
    Tracing the route to 192.168.5.1

    1 10.0.12.2 [MPLS: Labels 17/23 Exp 0] 432 msec 36 msec 60 msec
    2 10.0.23.3 [MPLS: Label 23 Exp 0] 68 msec 8 msec 36 msec
    3 172.16.34.4 [MPLS: Label 19 Exp 0] 64 msec 16 msec 48 msec
    4 192.168.5.1 12 msec * 8 msec

    Notice that in the output above, the prefix 192.168.5.0/24 has a next-hop value of 10.0.3.3 and a BGP Connector attribute value of 200:1:20.0.5.5. This information will be used for RPF checks later, when we start sending multicast traffic.

    Validating Multicast Paths

    Multicast forwarding is a bit more complicated. The first thing we should do is make sure the MDTs have been built from R1 towards R5 and from R5 towards R1. Check the PIM MDT groups on every PE:

    R1#show ip pim mdt 
    MDT Group Interface Source VRF
    * 232.1.1.1 Tunnel0 Loopback0 RED
    R1#show ip pim mdt bgp
    MDT (Route Distinguisher + IPv4) Router ID Next Hop
    MDT group 232.1.1.1
    200:1:20.0.5.5 10.0.6.6 10.0.6.6

    R5#show ip pim mdt
    MDT Group Interface Source VRF
    * 232.1.1.1 Tunnel0 Loopback0 RED

    R5#show ip pim mdt bgp
    MDT (Route Distinguisher + IPv4) Router ID Next Hop
    MDT group 232.1.1.1
    100:1:10.0.1.1 20.0.4.4 20.0.4.4

    In the output above, pay attention to the next-hop values found in the MDT BGP information. In AS 100 it points toward R6, while in AS 200 it points to R4. These next-hops are to be used as the proxy vectors for PIM Join messages. Check the mroutes for the tree (20.0.5.5, 232.1.1.1) starting from R1 and climbing up across R2, R6 and R4 to R5:

    R1#show ip mroute 232.1.1.1
    IP Multicast Routing Table
    Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
    L - Local, P - Pruned, R - RP-bit set, F - Register flag,
    T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
    X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
    U - URD, I - Received Source Specific Host Report,
    Z - Multicast Tunnel, z - MDT-data group sender,
    Y - Joined MDT-data group, y - Sending to MDT-data group,
    V - RD & Vector, v - Vector
    Outgoing interface flags: H - Hardware switched, A - Assert winner
    Timers: Uptime/Expires
    Interface state: Interface, Next-Hop or VCD, State/Mode

    (20.0.5.5, 232.1.1.1), 00:58:49/00:02:59, flags: sTIZV
    Incoming interface: Serial2/0.12, RPF nbr 10.0.12.2, vector 10.0.6.6
    Outgoing interface list:
    MVRF RED, Forward/Sparse, 00:58:49/00:01:17

    (10.0.1.1, 232.1.1.1), 00:58:49/00:03:19, flags: sT
    Incoming interface: Loopback0, RPF nbr 0.0.0.0
    Outgoing interface list:
    Serial2/0.12, Forward/Sparse, 00:58:47/00:02:55

    R1#show ip mroute 232.1.1.1 proxy
    (20.0.5.5, 232.1.1.1)
    Proxy Assigner Origin Uptime/Expire
    200:1/10.0.6.6 0.0.0.0 BGP MDT 00:58:51/stopped

    R1 shows the RPF proxy value of 10.0.6.6 for the source 20.0.5.5. Notice that there is also a tree rooted at 10.0.1.1, built as a result of R5's Join. This tree has no proxy, as it is inside its native AS. Next in turn, check R2 and R6 to find the same information (remember that the actual proxy removes the vector when it sees itself in the PIM Join message's proxy field):

    R2#show ip mroute 232.1.1.1
    IP Multicast Routing Table
    Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
    L - Local, P - Pruned, R - RP-bit set, F - Register flag,
    T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
    X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
    U - URD, I - Received Source Specific Host Report,
    Z - Multicast Tunnel, z - MDT-data group sender,
    Y - Joined MDT-data group, y - Sending to MDT-data group,
    V - RD & Vector, v - Vector
    Outgoing interface flags: H - Hardware switched, A - Assert winner
    Timers: Uptime/Expires
    Interface state: Interface, Next-Hop or VCD, State/Mode

    (10.0.1.1, 232.1.1.1), 01:01:41/00:03:25, flags: sT
    Incoming interface: Serial2/0.12, RPF nbr 10.0.12.1
    Outgoing interface list:
    Serial2/0.26, Forward/Sparse, 01:01:41/00:02:50

    (20.0.5.5, 232.1.1.1), 01:01:43/00:03:25, flags: sTV
    Incoming interface: Serial2/0.26, RPF nbr 10.0.26.6, vector 10.0.6.6
    Outgoing interface list:
    Serial2/0.12, Forward/Sparse, 01:01:43/00:02:56

    R2#show ip mroute 232.1.1.1 proxy
    (20.0.5.5, 232.1.1.1)
    Proxy Assigner Origin Uptime/Expire
    200:1/10.0.6.6 10.0.12.1 PIM 01:01:46/00:02:23

    Notice the same proxy vector for (20.0.5.5, 232.1.1.1) set by R1. As expected, the tree built in the opposite direction, toward R1 as a result of R5's Join, has no RPF proxy vector. Proceed to the outputs from R6, and notice that R6 knows of two proxy vectors, one of which is R6 itself.

    R6#show ip mroute 232.1.1.1
    IP Multicast Routing Table
    Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
    L - Local, P - Pruned, R - RP-bit set, F - Register flag,
    T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
    X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
    U - URD, I - Received Source Specific Host Report,
    Z - Multicast Tunnel, z - MDT-data group sender,
    Y - Joined MDT-data group, y - Sending to MDT-data group,
    V - RD & Vector, v - Vector
    Outgoing interface flags: H - Hardware switched, A - Assert winner
    Timers: Uptime/Expires
    Interface state: Interface, Next-Hop or VCD, State/Mode

    (10.0.1.1, 232.1.1.1), 01:05:40/00:03:21, flags: sT
    Incoming interface: Serial2/0.26, RPF nbr 10.0.26.2
    Outgoing interface list:
    Serial2/0.46, Forward/Sparse, 01:05:40/00:02:56

    (20.0.5.5, 232.1.1.1), 01:05:42/00:03:21, flags: sTV
    Incoming interface: Serial2/0.46, RPF nbr 172.16.46.4, vector 172.16.46.4
    Outgoing interface list:
    Serial2/0.26, Forward/Sparse, 01:05:42/00:02:51

    R6#show ip mroute proxy
    (10.0.1.1, 232.1.1.1)
    Proxy Assigner Origin Uptime/Expire
    100:1/local 172.16.46.4 PIM 01:05:44/00:02:21

    (20.0.5.5, 232.1.1.1)
    Proxy Assigner Origin Uptime/Expire
    200:1/local 10.0.26.2 PIM 01:05:47/00:02:17

    The show command outputs from R4 are similar to R6's – R4 is the proxy for both multicast trees:

    R4#show ip mroute 232.1.1.1
    IP Multicast Routing Table
    Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
    L - Local, P - Pruned, R - RP-bit set, F - Register flag,
    T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
    X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
    U - URD, I - Received Source Specific Host Report,
    Z - Multicast Tunnel, z - MDT-data group sender,
    Y - Joined MDT-data group, y - Sending to MDT-data group,
    V - RD & Vector, v - Vector
    Outgoing interface flags: H - Hardware switched, A - Assert winner
    Timers: Uptime/Expires
    Interface state: Interface, Next-Hop or VCD, State/Mode

    (10.0.1.1, 232.1.1.1), 01:08:42/00:03:16, flags: sTV
    Incoming interface: Serial2/0.46, RPF nbr 172.16.46.6, vector 172.16.46.6
    Outgoing interface list:
    Serial2/0.45, Forward/Sparse, 01:08:42/00:02:51

    (20.0.5.5, 232.1.1.1), 01:08:44/00:03:16, flags: sT
    Incoming interface: Serial2/0.45, RPF nbr 20.0.45.5
    Outgoing interface list:
    Serial2/0.46, Forward/Sparse, 01:08:44/00:02:46

    R4#show ip mroute proxy
    (10.0.1.1, 232.1.1.1)
    Proxy Assigner Origin Uptime/Expire
    100:1/local 20.0.45.5 PIM 01:08:46/00:02:17

    (20.0.5.5, 232.1.1.1)
    Proxy Assigner Origin Uptime/Expire
    200:1/local 172.16.46.6 PIM 01:08:48/00:02:12

    Finally, the outputs on R5 mirror the ones we saw on R1. However, this time the multicast trees have swapped roles: the one toward R1 has the proxy vector set:

    R5#show ip mroute 232.1.1.1
    IP Multicast Routing Table
    Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
    L - Local, P - Pruned, R - RP-bit set, F - Register flag,
    T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
    X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
    U - URD, I - Received Source Specific Host Report,
    Z - Multicast Tunnel, z - MDT-data group sender,
    Y - Joined MDT-data group, y - Sending to MDT-data group,
    V - RD & Vector, v - Vector
    Outgoing interface flags: H - Hardware switched, A - Assert winner
    Timers: Uptime/Expires
    Interface state: Interface, Next-Hop or VCD, State/Mode

    (10.0.1.1, 232.1.1.1), 01:12:07/00:02:57, flags: sTIZV
    Incoming interface: Serial2/0.45, RPF nbr 20.0.45.4, vector 20.0.4.4
    Outgoing interface list:
    MVRF RED, Forward/Sparse, 01:12:07/00:00:02

    (20.0.5.5, 232.1.1.1), 01:13:40/00:03:27, flags: sT
    Incoming interface: Loopback0, RPF nbr 0.0.0.0
    Outgoing interface list:
    Serial2/0.45, Forward/Sparse, 01:12:09/00:03:12

    R5#show ip mroute proxy
    (10.0.1.1, 232.1.1.1)
    Proxy Assigner Origin Uptime/Expire
    100:1/20.0.4.4 0.0.0.0 BGP MDT 01:12:10/stopped

    R5’s BGP table also has the connector attribute information for R1’s Ethernet interface:

    R5#show bgp vpnv4 unicast vrf RED 192.168.1.0
    BGP routing table entry for 200:1:192.168.1.0/24, version 6
    Paths: (1 available, best #1, table RED)
    Not advertised to any peer
    100, imported path from 100:1:192.168.1.0/24
    20.0.4.4 (metric 65) from 20.0.4.4 (20.0.4.4)
    Origin incomplete, metric 0, localpref 100, valid, internal, best
    Extended Community: RT:100:1 RT:200:1
    Connector Attribute: count=1
    type 1 len 12 value 100:1:10.0.1.1
    mpls labels in/out nolabel/18

    In addition to the connector attributes, the last piece of information needed is the multicast source information propagated via the IPv4 multicast address family. Both R1 and R5 should have this information in their BGP tables:

    R5#show bgp ipv4 multicast
    BGP table version is 3, local router ID is 20.0.5.5
    Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
    r RIB-failure, S Stale
    Origin codes: i - IGP, e - EGP, ? - incomplete

    Network Next Hop Metric LocPrf Weight Path
    *>i10.0.1.1/32 20.0.4.4 0 100 0 100 i
    *> 20.0.5.5/32 0.0.0.0 0 32768 i

    R1#show bgp ipv4 multicast
    BGP table version is 3, local router ID is 10.0.1.1
    Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
    r RIB-failure, S Stale
    Origin codes: i - IGP, e - EGP, ? - incomplete

    Network Next Hop Metric LocPrf Weight Path
    *> 10.0.1.1/32 0.0.0.0 0 32768 i
    *>i20.0.5.5/32 10.0.6.6 0 100 0 200 i

    Now it's time to verify the multicast connectivity. Make sure R1 and R5 see each other as PIM neighbors over the MDT, and then do a multicast ping toward the group joined by both R1 and R5:

    R1#show ip pim vrf RED neighbor 
    PIM Neighbor Table
    Mode: B - Bidir Capable, DR - Designated Router, N - Default DR Priority,
    S - State Refresh Capable
    Neighbor Interface Uptime/Expires Ver DR
    Address Prio/Mode
    20.0.5.5 Tunnel0 01:31:38/00:01:15 v2 1 / DR S P

    R5#show ip pim vrf RED neighbor
    PIM Neighbor Table
    Mode: B - Bidir Capable, DR - Designated Router, N - Default DR Priority,
    S - State Refresh Capable
    Neighbor Interface Uptime/Expires Ver DR
    Address Prio/Mode
    10.0.1.1 Tunnel0 01:31:17/00:01:26 v2 1 / S P

    R1#ping vrf RED 239.1.1.1 repeat 100

    Type escape sequence to abort.
    Sending 100, 100-byte ICMP Echos to 239.1.1.1, timeout is 2 seconds:

    Reply to request 0 from 192.168.1.1, 12 ms
    Reply to request 0 from 20.0.5.5, 64 ms
    Reply to request 1 from 192.168.1.1, 8 ms
    Reply to request 1 from 20.0.5.5, 56 ms
    Reply to request 2 from 192.168.1.1, 8 ms
    Reply to request 2 from 20.0.5.5, 100 ms
    Reply to request 3 from 192.168.1.1, 16 ms
    Reply to request 3 from 20.0.5.5, 56 ms

    This concludes our testbed verification.

    Summary

    In this blog post we demonstrated how MP-BGP and PIM extensions can be used to effectively implement Inter-AS multicast VPN between autonomous systems with BGP-free cores. PIM SSM is used to build the inter-AS trees, and the MDT SAFI is used to discover the MDT group addresses along with the PEs associated with them. The PIM RPF proxy vector allows for successful RPF checks in a multicast-route-free core by virtue of the proxy IPv4 address. Finally, the BGP Connector attribute allows for successful RPF checks inside a particular VRF.

    Further Reading

    Multicast VPN (draft-rosen-vpn-mcast)
    Multicast Tunnel Discovery (draft-wijnands-mt-discovery)
    PIM RPF Vector (draft-ietf-pim-rpf-vector-08)
    MDT SAFI (draft-nalawade-idr-mdt-safi)

    MDT SAFI Configuration
    PIM RPF Proxy Vector Configuration
