Mar 08

Introduction

This blog post provides an example M-VPN configuration using out-of-band mapping of C-multicast groups to core M-LSP tunnels signaled via M-LDP. The reader is assumed to have solid knowledge of M-VPNs and an understanding of M-LDP. For people unfamiliar with M-LDP, the following blog post provides some introductory reading: The long road to M-LSPs. The terminology used in this article follows the common convention of using the "P-" prefix for provider-specific objects (e.g. multicast routes or tunnels) and the "C-" prefix for customer objects (e.g. private IP addresses). Before we begin, I would like to thank our reader Hans Verkerk, who pointed out to me that Cisco IOS does support M-LDP in the latest 12.2SR images.

Interworking M-LDP with PIM: In-Band and Out-of-Band

To recap, M-LDP allows for the construction of P2MP (point-to-multipoint) and MP2MP (multipoint-to-multipoint) LSPs. Every LSP is identified by a tuple made of the root node IP address X and an opaque value Y, shared by all leaves. An MP2MP LSP is identified by a shared root IP address and an opaque value and consists of downstream and upstream sections. The downstream part is a P2MP LSP rooted at the shared node, and the upstream part is an MP2P (multipoint-to-point) LSP built from the root node toward the leaves to allow the latter to send traffic upstream to the root. It is important to note that the components of the MP2P LSP are classic unicast LSPs connecting each and every leaf to the root of the MP2MP LSP. Look at the diagram below for an illustration of MP2MP LSP signaling.

mld-signaled-mvpns-mp2mp-lsp

One of the most prominent applications of M-LSPs is effective multicast traffic encapsulation in MPLS networks. Since M-LSPs are signaled using M-LDP while multicast trees are built using PIM, the problem is mapping multicast trees to M-LSPs. It is intuitively obvious that shortest-path trees map to P2MP LSPs and shared trees correspond to MP2MP LSPs. There are two approaches to implement such mapping: the first uses in-band signaling, where PIM messages are directly translated into M-LDP FEC bindings, and the second uses out-of-band mapping, e.g. based on manual configuration.

PIM/M-LDP in-band signaling could be useful to transit multicast traffic across MPLS-enabled cores in an optimal manner. There is no need to enable PIM and multicast routing in the MPLS core – the IP2MPLS edge devices translate PIM Join messages into M-LDP FEC bindings and encode the group/source information in opaque values, letting the other edge device convert the M-LDP FEC bindings back into PIM Joins. Work is currently in progress to standardize in-band signaling in the following IETF drafts: “draft-wijnands-mpls-mldp-in-band-signaling” and “draft-rekhter-pim-sm-over-mldp”. In-band signaling could also be used for M-VPN implementation by directly translating all customer PIM Joins into core P2MP LSPs. This approach is also known as “Direct MDT” and its benefit is that no customer PIM adjacencies need to run across the SP tunnel interface. However, the massive drawback is uncontrolled forwarding state growth in the SP core, which seriously limits the scalability of this approach. The diagram below illustrates the Direct MDT model signaling flow: customer PIM messages are translated into core M-LDP bindings.

mld-signaled-mvpns-inband-signaling

The out-of-band approach is well known from Rosen’s M-VPN implementation, where M-VPN endpoints are discovered by means of MP-BGP extensions and MDTs are mapped to SP core multicast groups (P-tunnels) by means of manual configuration. With some modifications, this approach could be adapted to use M-LSPs as transport (Provider or P-tunnels) instead of mGRE. The default MDT could be instantiated as an MP2MP LSP connecting all PEs participating in a particular M-VPN. The data MDTs could be dynamically created on demand by mapping selected multicast flows to more optimal P-tunnels instantiated as P2MP LSPs. You may find the detailed description of the updated Rosen’s M-VPNs in “draft-ietf-l3vpn-2547bis-mcast”. This document outlines the use of different P-tunnel transport and signaling techniques, including native multicast mGRE, RSVP-TE signaled M-LSPs, M-LDP signaled M-LSPs and, finally, a full mesh of unicast tunnels connecting the PE routers. Notice that the newer draft refers to the “default MDT” as the “Inclusive tunnel” and the “data MDT” as the “Selective tunnel” – these are more generic terms than “MDTs”.

Even though, with respect to M-VPNs, the out-of-band approach is much more scalable than in-band, it still has a limitation rooted in the fact that PIM is required to signal customer multicast groups. This results in a full mesh of overlay C-PIM adjacencies between the PE routers, one mesh for every M-VPN implemented. In addition to this, PIM, being a soft-state protocol, periodically re-signals all trees, which puts an additional burden on the PE routers. Work is in progress to reduce the amount of PIM refresh messages, known as “refresh reduction”. Meanwhile, M-LDP could be used in a Carrier-Supporting-Carrier (CsC) manner, where customer trees are instantiated as nested MPLS LSPs, which allows the SP to push the C-PIM signaling off the PE routers and down to the customer equipment. A special extension to M-LDP is being standardized for this purpose in the draft “draft-wijnands-mpls-mldp-csc”.

RSVP-TE and M-LDP Signaled M-LSPs

Traffic engineering is the biggest advantage of using MPLS LSPs in place of any other transport. The static, source-signaled RSVP-TE based P2MP LSPs discussed in the previous blog post have the ability to use the Traffic-Engineering database for optimal resource utilization. M-LDP based M-LSPs, on the other hand, are dynamically signaled in an ordered downstream manner, thus disallowing resource pre-calculation and reservation. However, the use of M-LSPs instead of mGRE still allows for effective traffic protection. The approach to this is simple recursive routing: if a given upstream node detects that the best path to a downstream node goes across an MPLS tunnel, the corresponding M-LSP branch is re-routed across the tunnel using an additional tunnel label as needed. Combined with one-hop primary/backup tunnels, this allows for effective multicast traffic protection against link and node failures.

Using M-LSPs as transport for Rosen’s MVPNs

It does not look like Cisco currently implements the “in-band” signaling mechanism in IOS 12.2SR. However, the out-of-band mapping could be used to replace mGRE with M-LDP signaled M-LSPs. The configuration is straightforward and requires the following steps:

  • Disabling core SP multicast if you don’t need it for other applications. M-LDP will be used for M-LSP construction. Of course, you may keep native multicast services if you still need them.
  • Configuring all VRFs with a VPN Identifier. The VPN ID is standardized in RFC 2685 and serves the purpose of identifying the same VPN in multiple PEs, possibly spanning multiple ASs. Unlike RDs and RTs, this value is the same among all sites and could be used by protocols such as RADIUS or DHCP to allocate the proper configuration information. M-LDP uses the VPN ID for the construction of the opaque value shared by all PEs for the given M-VPN. The VPN ID replaces the shared SP core multicast group that was needed for mGRE-based P-tunnels. The format for the VPN ID is “OUI:Unique-ID” where both values are in hex. You may simply choose a value unique to every VPN.
  • Specifying the Default MDT root IP address. This is necessary in order to create the shared MP2MP LSP known to all PEs for the particular M-VPN. This LSP is used to establish C-PIM adjacencies and forward all multicast traffic by default. Notice that a single root could be used by multiple M-VPNs as long as they use different VPN IDs. In real-world deployments, the root node placement is important, as by default there is only one P-tunnel used for all M-VPN traffic. This results in additional load on the MP2MP root, as all M-VPN traffic is supposed to transit this router. You may compare the MP2MP root with the PIM RP router, with the exception that you may manually select it for every M-VPN.
  • Configuring the number of data MDTs available for a given VRF. This value is specific per VRF, per router. Just like with classic (multicast mGRE) data MDTs, you configure the traffic threshold (by default there is no switchover) that needs to be crossed for the PE to signal separate P2MP LSPs for the exceeding multicast flows. You don’t have to specify an opaque value similar to the multicast-group range needed for mGRE M-VPNs. The same opaque value will be reused; however, the data MDT LSPs will be rooted at the BGP next-hop IP address corresponding to the multicast C-source IP address. Notice that multiple data MDTs built by different PEs toward the same root will be effectively merged in the core due to the fact that they share the same VPN ID. This significantly improves resource utilization. The VRF commands implementing these steps are collected in the sketch after this list.
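
Putting these guidelines together, the VRF-level part of the configuration boils down to just a few commands. The sketch below simply collects the commands used in the practical example later in this post (the root address 10.1.3.3 and VPN ID 100:1 are the values chosen for that example):

ip vrf TEST
rd 100:1
vpn id 100:1
mdt preference mldp
mdt default mpls mldp 10.1.3.3
mdt data mpls mldp 10
mdt data threshold 1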

The above configuration guidelines imply that there is currently no support for Inter-AS M-LDP based M-VPNs, as M-LDP uses the BGP next-hop IP address to construct P2MP trees and does not support any “proxy” elements. The updated draft “draft-ietf-l3vpn-2547bis-mcast” additionally specifies the use of segmented P-tunnels as an alternative to the “spanning” inter-AS P-tunnels used by mGRE-based M-VPNs. However, this feature does not seem to be implemented in IOS yet, nor are the additional MP-BGP extensions required for the new, transport-independent M-VPNs.

Practical Example

The figure below displays the topology used by the sample scenario. The three routers R1, R2 and R3 form a multicast-free SP core that uses M-LDP along with one-hop automatic backup tunnels for traffic protection. The CE1 and CE2 routers in the customer domain use PIM SSM for multicast signaling, and CE2 joins group 232.1.1.1 sourced at CE1 using IGMPv3. We are going to review and verify this scenario:

mld-signaled-mvpns-practical-example

The following are the configurations for R1, R2 and R3 – the PE and P routers.

R1:
hostname R1
!
interface Loopback 0
ip address 10.1.1.1 255.255.255.255
!
interface Serial 2/0
no shutdown
encapsulation frame-relay
no frame-relay inverse
!
interface Serial 2/0.12 point
frame-relay interface-dlci 102
ip address 10.1.12.1 255.255.255.0
no ip pim sparse-mode
bandwidth 10000
mpls traffic-eng tunnels
ip rsvp bandwidth 10000
!
interface Serial 2/0.13 point
frame-relay interface-dlci 103
ip address 10.1.13.1 255.255.255.0
no ip pim sparse-mode
bandwidth 10000
mpls traffic-eng tunnels
ip rsvp bandwidth 10000
!
mpls traffic-eng tunnels
mpls traffic-eng auto-tunnel backup nhop-only
mpls traffic-eng auto-tunnel primary onehop
!
ip routing protocol purge interface
!
router ospf 1
network 0.0.0.0 0.0.0.0 area 0
mpls ldp autoconfig area 0
mpls traffic-eng area 0
mpls traffic-eng router-id Loopback0
mpls traffic-eng multicast-intact
!
mpls mldp
mpls mldp path traffic-eng
!
ip vrf TEST
rd 100:1
vpn id 100:1
route-target export 100:1
route-target import 100:1
mdt preference mldp
mdt default mpls mldp 10.1.3.3
mdt data mpls mldp 10
mdt data threshold 1
!
ip multicast-routing vrf TEST
ip pim vrf TEST ssm default

!
interface FastEthernet 0/0
ip vrf forwarding TEST
ip address 172.16.1.1 255.255.255.0
ip pim sparse-mode
no shutdown
!
router bgp 100
bgp router-id 10.1.1.1
neighbor 10.1.2.2 remote-as 100
neighbor 10.1.2.2 update-source Loopback0
address-family ipv4 unicast
no neighbor 10.1.2.2 activate
address-family vpnv4 unicast
neighbor 10.1.2.2 activate
address-family ipv4 vrf TEST
redistribute connected

The above configuration enables LDP and RSVP-TE in the core, automatically configuring one-hop primary and backup tunnels. Notice that the OSPF process is set up for LDP auto-configuration. A VRF is created on the router, and R1 is set to peer with R2 using the VPNv4 address family. There is no multicast routing enabled on the P-interfaces; only the VRF is set up for multicast routing using PIM-SSM. Pay close attention to the VRF configuration:

mpls mldp
mpls mldp path traffic-eng
!
ip vrf TEST
rd 100:1
vpn id 100:1
route-target export 100:1
route-target import 100:1
mdt preference mldp
mdt default mpls mldp 10.1.3.3
mdt data mpls mldp 10
mdt data threshold 1

The M-LDP configuration creates a single default MPLS-based MDT using R3's Loopback0 as the root node. The opaque value for this MP2MP LSP is constructed based on the VPN ID value of 100:1. Additionally, data MDTs are triggered when a traffic flow exceeds 1 Kbps, and the limit is 10 data MDTs per VRF. Pay attention to the fact that M-LDP is configured to use the traffic-engineering tunnels for M-LSP forwarding via the command mpls mldp path traffic-eng, which allows the upstream nodes to use traffic-engineering tunnels for one-hop M-LSP segments.

The configuration for R2 is almost identical to R1 and uses the same VRF TEST definition:

R2:
hostname R2
!
interface Loopback 0
ip address 10.1.2.2 255.255.255.255
!
interface Serial 2/0
no shutdown
encapsulation frame-relay
no frame-relay inverse
!
interface Serial 2/0.12 point
frame-relay interface-dlci 201
ip address 10.1.12.2 255.255.255.0
no ip pim sparse-mode
bandwidth 10000
mpls traffic-eng tunnels
ip rsvp bandwidth 10000
!
interface Serial 2/0.23 point
frame-relay interface-dlci 203
ip address 10.1.23.1 255.255.255.0
no ip pim sparse-mode
bandwidth 10000
mpls traffic-eng tunnels
ip rsvp bandwidth 10000
!
mpls traffic-eng tunnels
mpls traffic-eng auto-tunnel backup nhop-only
mpls traffic-eng auto-tunnel primary onehop
!
ip routing protocol purge interface
!
router ospf 1
network 0.0.0.0 0.0.0.0 area 0
mpls ldp autoconfig area 0
mpls traffic-eng area 0
mpls traffic-eng router-id Loopback0
mpls traffic-eng multicast-intact
!
mpls mldp
mpls mldp path traffic-eng
!
ip vrf TEST
rd 100:1
vpn id 100:1
route-target export 100:1
route-target import 100:1
mdt preference mldp
mdt default mpls mldp 10.1.3.3
mdt data mpls mldp 10
mdt data threshold 1
!
ip multicast-routing vrf TEST
ip pim vrf TEST ssm default
!
interface FastEthernet 0/0
ip vrf forwarding TEST
ip address 172.16.2.1 255.255.255.0
ip pim sparse-mode
no shutdown
!
router bgp 100
bgp router-id 10.1.2.2
neighbor 10.1.1.1 remote-as 100
neighbor 10.1.1.1 update-source Loopback0
address-family ipv4 unicast
no neighbor 10.1.1.1 activate
address-family vpnv4 unicast
neighbor 10.1.1.1 activate
address-family ipv4 vrf TEST
redistribute connected

And lastly, R3 is configured as a pure P-router. All three routers are configured for one-hop primary and backup tunnels, which allows for complete link protection – every single link failure in this topology is protected by Fast-Reroute. MLDP is configured to use the TE tunnels and thus all multicast traffic is protected as well.

R3:
hostname R3
!
interface Loopback 0
ip address 10.1.3.3 255.255.255.255
!
mpls traffic-eng tunnels
mpls traffic-eng auto-tunnel backup nhop-only
mpls traffic-eng auto-tunnel primary onehop
!
ip routing protocol purge interface
!
router ospf 1
network 0.0.0.0 0.0.0.0 area 0
mpls ldp autoconfig area 0
mpls traffic-eng area 0
mpls traffic-eng router-id Loopback0
mpls traffic-eng multicast-intact
!
mpls mldp
mpls mldp path traffic-eng
!
interface Serial 2/0
no shutdown
encapsulation frame-relay
no frame-relay inverse
!
interface Serial 2/0.13 point
frame-relay interface-dlci 301
ip address 10.1.13.3 255.255.255.0
no ip pim sparse-mode
bandwidth 10000
mpls traffic-eng tunnels
ip rsvp bandwidth 10000
!
interface Serial 2/0.23 point
frame-relay interface-dlci 302
ip address 10.1.23.3 255.255.255.0
no ip pim sparse-mode
bandwidth 10000
mpls traffic-eng tunnels
ip rsvp bandwidth 10000

Practical Scenario Verification

Start by checking M-LDP peering and ensuring there is no multicast running in the network core:

R3#show ip pim neighbor
PIM Neighbor Table
Mode: B - Bidir Capable, DR - Designated Router, N - Default DR Priority,
P - Proxy Capable, S - State Refresh Capable, G - GenID Capable
Neighbor Interface Uptime/Expires Ver DR
Address Prio/Mode

R1#show mpls ldp neighbor
Peer LDP Ident: 10.1.3.3:0; Local LDP Ident 10.1.1.1:0
TCP connection: 10.1.3.3.56698 - 10.1.1.1.646
State: Oper; Msgs sent/rcvd: 272/270; Downstream
Up time: 03:49:31
LDP discovery sources:
Serial2/0.13, Src IP addr: 10.1.13.3
Targeted Hello 10.1.1.1 -> 10.1.3.3, active, passive
Addresses bound to peer LDP Ident:
10.1.3.3 10.1.13.3 10.1.23.3
Peer LDP Ident: 10.1.2.2:0; Local LDP Ident 10.1.1.1:0
TCP connection: 10.1.2.2.31782 - 10.1.1.1.646
State: Oper; Msgs sent/rcvd: 270/276; Downstream
Up time: 03:49:26
LDP discovery sources:
Serial2/0.12, Src IP addr: 10.1.12.2
Targeted Hello 10.1.1.1 -> 10.1.2.2, active, passive
Addresses bound to peer LDP Ident:
10.1.2.2 10.1.12.2 10.1.23.1

R1#show mpls mldp neighbor

MLDP peer ID : 10.1.3.3:0, uptime 03:49:36 Up,
Target Adj : Yes
Session hndl : 3
Upstream count : 1
Branch count : 0
Path count : 2
Path(s) : 10.1.3.3 No LDP Tunnel65337
: 10.1.13.3 LDP Serial2/0.13
Nhop count : 2
Nhop list : 10.1.3.3 10.1.13.3

MLDP peer ID : 10.1.2.2:0, uptime 03:49:30 Up,
Target Adj : Yes
Session hndl : 4
Upstream count : 0
Branch count : 0
Path count : 2
Path(s) : 10.1.2.2 No LDP Tunnel65336
: 10.1.12.2 LDP Serial2/0.12
Nhop count : 0

Notice the MLDP output, which shows additional paths to every peer via the MPLS TE tunnels; these paths have no LDP enabled on them. This is due to the fact that we enabled the command mpls mldp path traffic-eng. Check the local traffic-engineering tunnels on R1 to see that the tunnels listed above are the primary one-hop tunnels to R2 and R3, respectively. Pay attention to the fact that these tunnels are locally protected from link failures.

R1#show mpls traffic-eng tunnels summary
Signalling Summary:
LSP Tunnels Process: running
Passive LSP Listener: running
RSVP Process: running
Forwarding: enabled
auto-tunnel:
backup Enabled (2 ), id-range:65436-65535
onehop Enabled (2 ), id-range:65336-65435

mesh Disabled (0 ), id-range:64336-65335

Periodic reoptimization: every 3600 seconds, next in 44 seconds
Periodic FRR Promotion: Not Running
Periodic auto-tunnel:
primary establish scan: every 10 seconds, next in 1 seconds
primary rm active scan: disabled
backup notinuse scan: every 3600 seconds, next in 151 seconds
Periodic auto-bw collection: every 300 seconds, next in 44 seconds
P2P:
Head: 4 interfaces, 4 active signalling attempts, 4 established
8 activations, 4 deactivations
58 failed activations
0 SSO recovery attempts, 0 SSO recovered
Midpoints: 2, Tails: 4

P2MP:
Head: 0 interfaces, 0 active signalling attempts, 0 established
0 sub-LSP activations, 0 sub-LSP deactivations
0 LSP successful activations, 0 LSP deactivations
SSO: Unsupported
Midpoints: 0, Tails: 0

R1#show mpls traffic-eng tunnels tunnel 65336

Name: R1_t65336 (Tunnel65336) Destination: 10.1.2.2
Status:
Admin: up Oper: up Path: valid Signalling: connected
path option 1, type explicit __dynamic_tunnel65336 (Basis for Setup, path weight 10)

Config Parameters:
Bandwidth: 0 kbps (Global) Priority: 7 7 Affinity: 0x0/0xFFFF
Metric Type: TE (default)
AutoRoute announce: enabled LockDown: disabled Loadshare: 0 bw-based
auto-bw: disabled
Active Path Option Parameters:
State: explicit path option 1 is active
BandwidthOverride: disabled LockDown: disabled Verbatim: disabled

InLabel : -
OutLabel : Serial2/0.12, implicit-null
Next Hop : 10.1.12.2
FRR OutLabel : Tunnel65436, explicit-null
RSVP Signalling Info:
Src 10.1.1.1, Dst 10.1.2.2, Tun_Id 65336, Tun_Instance 6558
RSVP Path Info:
My Address: 10.1.12.1
Explicit Route: 10.1.12.2 10.1.2.2
Record Route: NONE
Tspec: ave rate=0 kbits, burst=1000 bytes, peak rate=0 kbits
RSVP Resv Info:
Record Route: 10.1.2.2(0)
Fspec: ave rate=0 kbits, burst=1000 bytes, peak rate=0 kbits
Shortest Unconstrained Path Info:
Path Weight: 10 (TE)
Explicit Route: 10.1.12.2 10.1.2.2
History:
Tunnel:
Time since created: 3 hours, 57 minutes
Time since path change: 3 hours, 57 minutes
Number of LSP IDs (Tun_Instances) used: 1
Current LSP: [ID: 6558]
Uptime: 3 hours, 57 minutes

R1#show mpls traffic-eng tunnels tunnel 65337

Name: R1_t65337 (Tunnel65337) Destination: 10.1.3.3
Status:
Admin: up Oper: up Path: valid Signalling: connected
path option 1, type explicit __dynamic_tunnel65337 (Basis for Setup, path weight 10)

Config Parameters:
Bandwidth: 0 kbps (Global) Priority: 7 7 Affinity: 0x0/0xFFFF
Metric Type: TE (default)
AutoRoute announce: enabled LockDown: disabled Loadshare: 0 bw-based
auto-bw: disabled
Active Path Option Parameters:
State: explicit path option 1 is active
BandwidthOverride: disabled LockDown: disabled Verbatim: disabled

InLabel : -
OutLabel : Serial2/0.13, implicit-null
Next Hop : 10.1.13.3
FRR OutLabel : Tunnel65437, explicit-null
RSVP Signalling Info:
Src 10.1.1.1, Dst 10.1.3.3, Tun_Id 65337, Tun_Instance 2427
RSVP Path Info:
My Address: 10.1.13.1
Explicit Route: 10.1.13.3 10.1.3.3
Record Route: NONE
Tspec: ave rate=0 kbits, burst=1000 bytes, peak rate=0 kbits
RSVP Resv Info:
Record Route: 10.1.3.3(0)
Fspec: ave rate=0 kbits, burst=1000 bytes, peak rate=0 kbits
Shortest Unconstrained Path Info:
Path Weight: 10 (TE)
Explicit Route: 10.1.13.3 10.1.3.3
History:
Tunnel:
Time since created: 3 hours, 57 minutes
Time since path change: 3 hours, 57 minutes
Number of LSP IDs (Tun_Instances) used: 1
Current LSP: [ID: 2427]
Uptime: 3 hours, 57 minutes

Next, validate the existence of the MP2MP LSP for the VPN configured on R1 and R2. This LSP should be rooted at R3, and both R1 and R2 should have upstream and downstream components of this LSP. The output below shows the following interesting information:

  • The upstream component (U) of the MP2MP LSP uses label 17 and Tunnel 65337 to reach the next upstream hop toward the root node. This is how R1 sends information to all other MP2MP participants.
  • The downstream component (D) shows the P2MP label 21 and the next-hop of 10.1.3.3, meaning R3’s Loopback0 is the root of the respective P2MP tree.
  • Replication clients are the nodes downstream of the P2MP LSP that are to receive the multipoint-replicated traffic. In our case, the replication client is the multicast-enabled VRF (MVRF), which uses the virtual Lspvif0 interface as the “ingress” multicast interface. This interface is the abstraction used by Cisco IOS to represent the terminating P2MP LSP. All multicast traffic appears to the local multicast routing subsystem as sourced from this interface.

R1#show mpls mldp database
* Indicates MLDP recursive forwarding is enabled

LSM ID : 1 (RNR LSM ID: F1000002) Type: MP2MP Uptime : 04:04:20
FEC Root : 10.1.3.3
Opaque decoded : [mdt 100:1 0]
Opaque length : 11 bytes
Opaque value : 07 000B 0001000000000100000000
RNR active LSP : (this entry)
Upstream client(s) :
10.1.3.3:0 [Active]
Expires : Never Path Set ID : B9000001
Out Label (U) : 17 Interface : Tunnel65337
Local Label (D): 21 Next Hop : 10.1.3.3
Replication client(s):
MDT (VRF TEST)
Uptime : 04:04:20 Path Set ID : DD000002
Interface : Lspvif0

You may get output similar to the one above using the same show command on R2.

R2#show mpls mldp database
* Indicates MLDP recursive forwarding is enabled

LSM ID : 43000001 (RNR LSM ID: 21000002) Type: MP2MP Uptime : 04:29:08
FEC Root : 10.1.3.3
Opaque decoded : [mdt 100:1 0]
Opaque length : 11 bytes
Opaque value : 07 000B 0001000000000100000000
RNR active LSP : (this entry)
Upstream client(s) :
10.1.3.3:0 [Active]
Expires : Never Path Set ID : 18000001
Out Label (U) : 16 Interface : Tunnel65337*
Local Label (D): 21 Next Hop : 10.1.3.3
Replication client(s):
MDT (VRF TEST)
Uptime : 04:29:08 Path Set ID : 2
Interface : Lspvif0

If you check the MPLS forwarding table on R1, you will notice that the P2MP downstream component label is mapped to the “MDT 100:1” using an aggregate lookup operation. This means that the decapsulated packets are routed using the respective VRF’s MFIB database.

R1#show mpls forwarding-table 
Local Outgoing Prefix Bytes Label Outgoing Next Hop
Label Label or Tunnel Id Switched interface
...
21 [T] No Label [mdt 100:1 0][V] 53166 aggregate/TEST
22 Pop Label 10.1.3.3 65436 [1] \
0 Se2/0.12 point2point

[T] Forwarding through a LSP tunnel.
View additional labelling info with the 'detail' option

Look at the root of the MP2MP LSP for the information on the upstream and downstream labels. The output below illustrates that R3 sees two downstream P2MP LSP clients for the opaque value “mdt 100:1” and uses label 21 for both of them. Both downstream clients are reachable via the TE tunnels. Next, there are two “upstream MP2P” labels that are advertised by R3 to R1 and R2, respectively, to be used for encapsulation of upstream traffic. R1 and R2 will use those labels to reach the root of the MP2MP LSP via the upstream tunnels.

R3#show mpls mldp database 
* Indicates MLDP recursive forwarding is enabled

LSM ID : 10000002 Type: MP2MP Uptime : 05:49:04
FEC Root : 10.1.3.3 (we are the root)
Opaque decoded : [mdt 100:1 0]
Opaque length : 11 bytes
Opaque value : 07 000B 0001000000000100000000
Upstream client(s) :
None
Expires : N/A Path Set ID : 88000004
Replication client(s):
10.1.1.1:0
Uptime : 05:49:04 Path Set ID : 5
Out label (D) : 21 Interface : Tunnel65336*
Local label (U): 17 Next Hop : 10.1.1.1
10.1.2.2:0
Uptime : 00:50:43 Path Set ID : D8000007
Out label (D) : 21 Interface : Tunnel65337*
Local label (U): 16 Next Hop : 10.1.2.2

Now that we have confirmed the existence of the MP2MP LSP, check the C-PIM adjacencies that should be established over it. As you can see, adjacencies are established over the Lspvif0 interface, which appears as the actual multicast packet source to every C-PIM instance. There is no mGRE MDT tunnel anymore; it has been replaced by the Lspvif0 interface.

R1#show ip pim vrf TEST neighbor 
PIM Neighbor Table
Mode: B - Bidir Capable, DR - Designated Router, N - Default DR Priority,
P - Proxy Capable, S - State Refresh Capable, G - GenID Capable
Neighbor Interface Uptime/Expires Ver DR
Address Prio/Mode
172.16.1.7 FastEthernet0/0 03:53:39/00:01:34 v2 1 / DR S G
10.1.2.2 Lspvif0 04:30:56/00:01:24 v2 1 / DR S P G

R2#show ip pim vrf TEST neighbor
PIM Neighbor Table
Mode: B - Bidir Capable, DR - Designated Router, N - Default DR Priority,
P - Proxy Capable, S - State Refresh Capable, G - GenID Capable
Neighbor Interface Uptime/Expires Ver DR
Address Prio/Mode
10.1.1.1 Lspvif0 04:31:18/00:01:30 v2 1 / S P G

R1#show ip pim mdt
* implies mdt is the default MDT
MDT Group/Num Interface Source VRF
* 0 Lspvif0 Loopback0 TEST

R2#show ip pim mdt
* implies mdt is the default MDT
MDT Group/Num Interface Source VRF
* 0 Lspvif0 Loopback0 TEST

The final part of the test is configuring one of the CE routers (CE2) to join a specific multicast source.
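
The CE configurations are not listed in this post. As a hedged sketch, the receiver side (CE2, seen as 172.16.2.7 in the outputs below) might look like the following, assuming its link toward R2 is FastEthernet0/0; the group, source and IGMP version are the ones defined for this scenario:

! Hypothetical CE2 configuration (interface name and exact addressing assumed)
ip multicast-routing
ip pim ssm default
!
interface FastEthernet 0/0
ip address 172.16.2.7 255.255.255.0
ip pim sparse-mode
ip igmp version 3
ip igmp join-group 232.1.1.1 source 172.16.1.1

On the PE (R2), the resulting group membership shows up as follows: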

R2#show ip igmp vrf TEST groups 
IGMP Connected Group Membership
Group Address Interface Uptime Expires Last Reporter Group Accounted
232.1.1.1 FastEthernet0/0 03:37:04 stopped 172.16.2.7
224.0.1.40 Lspvif0 04:18:01 00:02:24 10.1.2.2

Notice how the C-multicast routing tables look on the PE routers:

R1#show ip mroute vrf TEST
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
V - RD & Vector, v - Vector
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(172.16.1.7, 232.1.1.1), 01:06:34/00:02:39, flags: sT
Incoming interface: FastEthernet0/0, RPF nbr 172.16.1.7
Outgoing interface list:
Lspvif0, Forward/Sparse, 03:38:23/00:02:39

(*, 224.0.1.40), 04:37:07/00:02:08, RP 0.0.0.0, flags: DPL
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list: Null

R2#show ip mroute vrf TEST
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
V - RD & Vector, v - Vector
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(172.16.1.7, 232.1.1.1), 03:37:48/00:02:18, flags: sTI
Incoming interface: Lspvif0, RPF nbr 10.1.1.1
Outgoing interface list:
FastEthernet0/0, Forward/Sparse-Dense, 03:37:48/00:02:18

(*, 224.0.1.40), 04:36:29/00:02:44, RP 0.0.0.0, flags: DCL
Incoming interface: Null, RPF nbr 0.0.0.0
Outgoing interface list:
Lspvif0, Forward/Sparse, 04:18:45/00:02:44

The control plane part looks correct, and now we may test the data-plane connectivity by sourcing ping packets to 232.1.1.1 from CE1. Before we do this, enable packet debugging in R2.

R2#debug mpls packet 
Packet debugging is on

R2#debug ip mfib vrf TEST fs
MFIB IPv4 fs debugging enabled for vrf TEST
MPLS turbo: Se2/0.23: rx: Len 108 Stack {21 0 253} - ipv4 data
MPLS les: Se2/0.23: rx: Len 108 Stack {21 0 253} - ipv4 data
MFIBv4(0x1): Receive (172.16.1.7,232.1.1.1) from Lspvif0 (FS): hlen 5 prot 1 len 100 ttl 252 frag 0x0

The above output signifies that the ICMP packets received arrive in MPLS encapsulation using the downstream label of 21. It is interesting to look at the debugging output on R3, which is the MP2MP root:

R3#
MPLS turbo: Se2/0.13: rx: Len 108 Stack {17 0 254} - ipv4 data
MPLS les: Se2/0.13: rx: Len 108 Stack {17 0 254} - ipv4 data

MPLS les: Tu65337: tx: Len 108 Stack {21 0 253} - ipv4 data
MPLS les: Se2/0.23: tx: Len 108 Stack {21 0 253} - ipv4 data

You can clearly see the MPLS packet received on the upstream LSP with the label of 17 and transmitted downstream on the tunnel interface connecting R3 to R2 using the downstream label of 21. Notice that the outgoing downstream label is not stacked with the tunnel transport label, as the transport label for Tunnel65337 is implicit null:

R3#show mpls traffic-eng tunnels tunnel 65337

Name: R3_t65337 (Tunnel65337) Destination: 10.1.2.2
Status:
Admin: up Oper: up Path: valid Signalling: connected
path option 1, type explicit __dynamic_tunnel65337 (Basis for Setup, path weight 10)

Config Parameters:
Bandwidth: 0 kbps (Global) Priority: 7 7 Affinity: 0x0/0xFFFF
Metric Type: TE (default)
AutoRoute announce: enabled LockDown: disabled Loadshare: 0 bw-based
auto-bw: disabled
Active Path Option Parameters:
State: explicit path option 1 is active
BandwidthOverride: disabled LockDown: disabled Verbatim: disabled

InLabel : -
OutLabel : Serial2/0.23, implicit-null
Next Hop : 10.1.23.1

This completes the test of basic connectivity and the MP2MP LSP. Next in line is testing advanced features, such as traffic protection and data MDT switchover.

Advanced Features: FRR and MDT Switchover

We will test FRR as it applies to the MP2MP LSP. To accomplish this, disable the link connecting R3 to R2 by removing the DLCI from R3:

R3:
interface Serial 2/0.23
no frame-relay interface-dlci 302

Now observe the status of the MPLS TE tunnels. Notice that the one-hop tunnel connecting R3 to R2 is now re-routed over the backup tunnel going across R1.

R3#show mpls mldp database 
* Indicates MLDP recursive forwarding is enabled

LSM ID : 10000002 Type: MP2MP Uptime : 04:53:19
FEC Root : 10.1.3.3 (we are the root)
Opaque decoded : [mdt 100:1 0]
Opaque length : 11 bytes
Opaque value : 07 000B 0001000000000100000000
Upstream client(s) :
None
Expires : N/A Path Set ID : 88000004
Replication client(s):
10.1.1.1:0
Uptime : 04:53:19 Path Set ID : 5
Out label (D) : 21 Interface : Tunnel65336*
Local label (U): 17 Next Hop : 10.1.1.1
10.1.2.2:0
Uptime : 04:53:11 Path Set ID : C7000006
Out label (D) : 21 Interface : Tunnel65337*
Local label (U): 16 Next Hop : 10.1.2.2

R3#show mpls traffic-eng tunnels Tunnel 65337

Name: R3_t65337 (Tunnel65337) Destination: 10.1.2.2
Status:
Admin: up Oper: up Path: valid Signalling: connected
path option 1, type explicit __dynamic_tunnel65337 (Basis for Setup, path weight 10)
Change in required resources detected: reroute pending
Currently Signalled Parameters:
Bandwidth: 0 kbps (Global) Priority: 7 7 Affinity: 0x0/0xFFFF
Metric Type: TE (default)

Config Parameters:
Bandwidth: 0 kbps (Global) Priority: 7 7 Affinity: 0x0/0xFFFF
Metric Type: TE (default)
AutoRoute announce: enabled LockDown: disabled Loadshare: 0 bw-based
auto-bw: disabled
Active Path Option Parameters:
State: explicit path option 1 is active
BandwidthOverride: disabled LockDown: disabled Verbatim: disabled

InLabel : -
OutLabel : Serial2/0.23, implicit-null
Next Hop : 10.1.23.1
FRR OutLabel : Tunnel65436, explicit-null (in use)
RSVP Signalling Info:
Src 10.1.3.3, Dst 10.1.2.2, Tun_Id 65337, Tun_Instance 6593
RSVP Path Info:
My Address: 10.1.23.3
Explicit Route: 10.1.2.2 10.1.2.2
Record Route: NONE
Tspec: ave rate=0 kbits, burst=1000 bytes, peak rate=0 kbits
RSVP Resv Info:
Record Route: 10.1.2.2(0)
Fspec: ave rate=0 kbits, burst=1000 bytes, peak rate=0 kbits
Shortest Unconstrained Path Info:
Path Weight: 20 (TE)
Explicit Route: 10.1.13.1 10.1.12.2 10.1.2.2
History:
Tunnel:
Time since created: 1 hours, 27 minutes
Time since path change: 1 hours, 27 minutes
Number of LSP IDs (Tun_Instances) used: 4
Current LSP: [ID: 6593]
Uptime: 1 hours, 27 minutes

R3#show mpls traffic-eng tunnels Tunnel 65436

Name: R3_t65436 (Tunnel65436) Destination: 10.1.2.2
Status:
Admin: up Oper: up Path: valid Signalling: connected
path option 1, type explicit __dynamic_tunnel65436 (Basis for Setup, path weight 20)

Config Parameters:
Bandwidth: 0 kbps (Global) Priority: 7 7 Affinity: 0x0/0xFFFF
Metric Type: TE (default)
AutoRoute announce: disabled LockDown: disabled Loadshare: 0 bw-based
auto-bw: disabled
Active Path Option Parameters:
State: explicit path option 1 is active
BandwidthOverride: disabled LockDown: disabled Verbatim: disabled

InLabel : -
OutLabel : Serial2/0.13, 22
Next Hop : 10.1.13.1
RSVP Signalling Info:
Src 10.1.3.3, Dst 10.1.2.2, Tun_Id 65436, Tun_Instance 1
RSVP Path Info:
My Address: 10.1.13.3
Explicit Route: 10.1.13.1 10.1.12.2 10.1.2.2
Record Route: NONE
Tspec: ave rate=0 kbits, burst=1000 bytes, peak rate=0 kbits
RSVP Resv Info:
Record Route: NONE
Fspec: ave rate=0 kbits, burst=1000 bytes, peak rate=0 kbits
Shortest Unconstrained Path Info:
Path Weight: 20 (TE)
Explicit Route: 10.1.13.1 10.1.12.2 10.1.2.2
History:
Tunnel:
Time since created: 1 hours, 27 minutes
Time since path change: 1 hours, 27 minutes
Number of LSP IDs (Tun_Instances) used: 1
Current LSP: [ID: 1]
Uptime: 1 hours, 27 minutes

Return things back to normal on R3 (for simplicity) and attempt a flood ping off CE1 to the group 232.1.1.1 (use a timeout value of zero).
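
Restoring the link is simply a matter of putting the DLCI back on R3, i.e. reversing the change made above:

R3:
interface Serial 2/0.23
frame-relay interface-dlci 302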

R7#ping 232.1.1.1 repeat 10000 timeout 0

Type escape sequence to abort.
Sending 10000, 100-byte ICMP Echos to 232.1.1.1, timeout is 0 seconds:
......................................................................
......................................................................

Both R1 and R2 will now display an additional P2MP LSP, signaled by R2 toward R1. This M-LSP is rooted at R1’s Loopback0, and both R1 and R2 are shown as leaf nodes for it (even though R1 itself does not need the traffic). R1 forwards packets downstream along the P2MP LSP, using the TE tunnel between R1 and R2 for recursive forwarding. This ensures traffic protection for the multicast data MDT flows.

R1#show mpls mldp database 
* Indicates MLDP recursive forwarding is enabled

LSM ID : 8D000006 Type: P2MP Uptime : 00:00:31
FEC Root : 10.1.1.1 (we are the root)
Opaque decoded : [mdt 100:1 1]
Opaque length : 11 bytes
Opaque value : 07 000B 0001000000000100000001
Upstream client(s) :
None
Expires : N/A Path Set ID : C5000007
Replication client(s):
MDT (VRF TEST)
Uptime : 00:00:31 Path Set ID : None
Interface : Lspvif0
10.1.2.2:0
Uptime : 00:00:31 Path Set ID : None
Out label (D) : 23 Interface : Tunnel65336
Local label (U): None Next Hop : 10.1.2.2
...
R2#show mpls mldp database
* Indicates MLDP recursive forwarding is enabled

LSM ID : 81000006 Type: P2MP Uptime : 00:00:24
FEC Root : 10.1.1.1
Opaque decoded : [mdt 100:1 1]
Opaque length : 11 bytes
Opaque value : 07 000B 0001000000000100000001
Upstream client(s) :
10.1.1.1:0 [Active]
Expires : Never Path Set ID : 75000006
Out Label (U) : None Interface : Tunnel65336*
Local Label (D): 23 Next Hop : 10.1.1.1
Replication client(s):
MDT (VRF TEST)
Uptime : 00:00:24 Path Set ID : None
Interface : Lspvif0

Conclusions

M-LDP has reached the initial deployment phase and could be used for intra-AS M-VPN implementations. The two main advantages of using M-LDP over mGRE transport are the reduction of PIM signaling overhead and the option to protect multicast traffic using MPLS TE FRR. However, unlike source-signaled RSVP-TE based P2MP LSPs, M-LDP signaled LSPs do not support traffic engineering, only recursive routing over TE tunnels. The IOS version used for testing supports M-VPNs based on M-LDP signaled P-tunnels (out-of-band), but does not seem to support in-band signaling of M-LDP LSPs. However, some MLDP commands give an idea that MLDP already supports the opaque types necessary to implement in-band signaling. We may expect to see more extensions added to M-LDP in the near future, including a CsC option to build P2MP LSPs using proxy root identifiers and Inter-AS extensions to M-LDP based P-tunnels (segmented tunnels).

Further Reading

Many of the M-LDP features are currently being developed and experimented with. The commands used in this blog post are not yet publicly documented or officially announced, but they are available in the IOS 12.2SR images for the 7200, which can be emulated in Dynamips. The list of recommended reading consists mostly of recent IETF drafts:

MPLS LDP Extensions for P2MP LSPs
MPLS/BGP Multicast VPNs
In-Band Signaling with MLDP
MLDP Extensions for CsC Signaling
PIM SM over MLDP
MPLS Upstream Label Assignment and Context-Specific Label Spaces
Blog post: The long Road to M-LSPs

Feb 08

Overview

The purpose of this blog post is to give you a brief overview of M-LSPs and the motivation behind this technology, and to demonstrate some practical examples. We start with Multicast VPNs and then give an overview of the M-LSP implementations based on M-LDP and RSVP-TE extensions. The practical example is based on the recently added RSVP-TE signaling for establishing P2MP LSPs. All demonstrations could be replicated using the widely available Dynamips hardware emulator. The reader is assumed to have a solid understanding of MPLS technologies, including LDP, RSVP-TE, MPLS/BGP VPNs and Multicast VPNs.

Multicast Domains Model

For years, the most popular solution for transporting multicast VPN traffic was using native IP multicast in the SP core coupled with dynamic multipoint IP tunneling to isolate customer traffic. Every VPN would be allocated a separate MDT (Multicast Distribution Tree), or domain, mapped to a core multicast group. Regular IP multicast routing is then used for forwarding, using PIM signaling in the core. Notice that the core network has to be multicast-enabled in the Multicast Domains model. This model is quite mature by today and allows for effective and stable deployments even in Inter-AS scenarios. However, for a long time the question with M-VPNs was: could we replace IP tunneling with MPLS label encapsulation? The main motivation for moving to MPLS label switching would be leveraging MPLS traffic-engineering and protection features.

Tag Switched Multicast


In the early days of MPLS deployment (mid-to-late 90s) you may have seen Cisco IOS supporting tag switching for multicast traffic. Indeed, you may still find some references to these models in older IETF drafts:

draft-rfced-info-rekhter-00.txt
draft-farinacci-multicast-tag-part-00

There were no ideas about MPLS TE and FRR back then, just some considerations about using tag encapsulation for fast-switching the multicast packets. Older IOS versions had support for multicast tag switching, which was rather quickly removed due to compatibility issues (PIM extensions were used for label bindings) and the experimental status of the project. Plus, as time passed it became obvious that a more generic concept was required as opposed to pure tag-switched multicast.

Multipoint LSP Overview

Instead of trying to create label-switched multicast, the community came up with the idea of Point-to-Multipoint (P2MP) LSPs. Regular, point-to-point (P2P) LSPs are mapped to a single FEC element, which represents a single “destination” (e.g. egress router, IP prefix, virtual circuit). On the contrary, a P2MP LSP has multiple destinations but a single root – essentially, a P2MP LSP represents a tree, with a known root and some opaque “selector” value shared by all destinations. This “selector” could be a multicast group address if you map multicast groups to LSPs, but it is quite generic and is not limited to just multicast groups. P2MP LSPs could be thought of as a sort of shortest-path tree built toward the root and using label switching for traffic forwarding. The use of P2MP LSPs could be much wider than simple multicast forwarding, as those two concepts are now completely decoupled. One good example of using P2MP LSPs could be a VPLS deployment, where the full mesh of P2P pseudowires is replaced with P2MP LSPs rooted at every PE and targeted at all other PEs.

In addition to P2MP, the concept of a multipoint-to-multipoint (MP2MP) LSP was introduced. An MP2MP LSP has a single "shared" root just like a P2MP LSP, but traffic can flow both upstream to the root and downstream from the root. Every leaf node may send traffic to every other leaf node by using the upstream LSP toward the root and the downstream P2MP LSP to the other leaves. This is in contrast to P2MP LSPs, where only the root sends traffic downstream. MP2MP LSPs could be thought of as the equivalents of the bidirectional trees found in PIM. Collectively, P2MP and MP2MP LSPs are named multipoint LSPs.

To complete this overview, we list several specific terms used with M-LSPs to describe the tree components:

  • Headend (root) router – the router that originates P2MP LSP
  • Tailend (leaf) router – the router that terminates the P2MP LSP
  • Midpoint router – the router that switches the P2MP LSP but does not terminate it
  • Bud router – the router that is a leaf and a midpoint at the same time

M-LDP Solution

The Multipoint Extension to LDP, defined in draft-ietf-mpls-ldp-p2mp-04, adds the notion of P2MP and MP2MP FEC elements to LDP (defined in RFC 3036), along with the respective capability advertisements, label mapping and signaling procedures. The new FEC elements carry the LSP root, which is an IPv4 address, and a special "opaque" value (the "selector" mentioned above), which groups together the leaf nodes sharing the same opaque value. The opaque value is "transparent" for the intermediate nodes, but has meaning for the root and possibly the leaves. Every LDP node advertises its local incoming label binding to the upstream LDP node on the shortest path to the root IPv4 address found in the FEC. The upstream node receiving the label bindings creates its own local label(s) and outgoing interfaces. This label allocation process may result in the need for packet replication if there are multiple outgoing branches. It is important to notice that an LDP node will merge the label bindings for the same opaque value if it finds downstream nodes sharing the same upstream node. This allows for effective building of P2MP LSPs and label conservation.

p2mp-lsp-primer-mldp-signaling

Pay attention to the fact that leaf nodes must NOT allocate an implicit-null label for the P2MP FEC element, i.e. they must disable PHP behavior. This is important to allow the leaf node to learn the context of an incoming packet and properly recognize it as coming from a P2MP LSP. The leaf node may then perform an RPF check or properly select the outgoing processing for the packet based on the opaque information associated with the context. This means that the LDP implementation must support context-specific label spaces, as defined in the draft draft-ietf-mpls-upstream-label-02. The need for context-specific processing arises from the fact that the M-LSP type is generic and therefore the application is not directly known from the LSP itself. Notice that RFC 3036 had previously defined two "contexts" known as label spaces – the "interface" and "platform" label spaces.

As we mentioned, MP2MP LSPs are also defined within the scope of the M-LDP draft. Signaling for MP2MP LSPs is a bit more complicated, as it involves building a downstream P2MP LSP for the shared root and upstream MP2P LSPs from every leaf node toward the same root. You may find the detailed procedure description in the draft document mentioned above.

Just like the multicast multipath procedure, M-LDP supports equal-cost load splitting in situations where the root node is reachable via multiple upstream paths. In this case, the upstream LDP node is selected using a hash function, resulting in load splitting similar to multicast multipath. You may want to read the following document to get a better idea of multicast multipath load splitting: Multicast Load Splitting

M-LDP Features

It is important to notice that LDP is the only protocol required to build the M-LSPs (along with the respective IGP, of course). Even if you deploy multicast applications on top of M-LSPs, you don’t need to run PIM in the network core, like you needed with the original tag-switched multicast or when using the Multicast Domains model. M-LDP allows for dynamic signaling of M-LSPs, irrespective of their nature. The biggest benefit is the reduction of specific signaling protocols in the network core. However, the main problem is that traffic engineering is not native to M-LDP (CR-LDP is R.I.P.). That said, it seems that Cisco now supports MPLS TE for M-LDP signaled LSPs, at least judging from the newly introduced command set. Also, IP Fast Reroute is supported for LDP signaled LSPs, but this feature is not as mature and widely deployed as the RSVP-TE implementations. If you are interested in IP FRR procedures, you may consult the documents found under the respective IETF working group. As for implementations, later versions of Cisco IOS XR do support IP FRR.
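
For reference, the command set hinted at here appears to be the same one used in the M-LDP based M-VPN example earlier in this document, where MLDP is enabled globally and allowed to resolve its paths over TE tunnels:

mpls mldp
mpls mldp path traffic-eng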

M-LDP Applications

First and foremost, M-LDP could be used for multicast VPNs, replacing multipoint GRE encapsulation and eliminating the need for multicast routing and PIM in the SP network core. Taking the classic MDT domains model, the default MDT could be replaced with a single MP2MP LSP connecting all participating PEs, and data MDTs could be mapped to P2MP LSPs signaled toward the participating PEs. The opaque value used could be the group IP address coupled with the respective RD, allowing for effective M-LSP separation in the SP core.
There is another, more direct approach to multicast VPNs with M-LDP. Instead of using the MDT trees, a customer tree could be signaled explicitly across the SP core. When a CE node sends a PIM Join toward an (S,G) where S is a VPN address, the PE router will instruct M-LDP to construct the M-LSP toward the root that is the VPNv4 next hop for S, using the opaque value of (S,G,RD), which is passed across the core. The PE node that roots the M-LSP interprets the opaque value and generates a PIM Join toward the actual source S.

p2mp-lsp-primer-direct-mdt-model

This model is known as Direct MDT and is more straightforward than the Domains model with respect to the signaling used. However, the drawback is excessive generation of MPLS forwarding state in the network core. A solution to overcome this limitation would be to use hierarchical LSPs, which is another extension to LDP that is being developed. We may expect to see Inter-AS M-LSP extensions in the near future too, modeled after the BGP MDT SAFI (e.g. opaque information propagation) and RPF Proxy Vector features (for finding the correct next hop toward an M-LSP root in another AS). If you want to find out more about those technologies, you may read the following document: Inter-AS mVPNs: MDT SAFI, BGP Connector and RPF Proxy Vector.

As mentioned above, multicasting is not the only application that could benefit from M-LSPs. Any application that requires traffic flooding or multipoint connectivity is a good candidate for using M-LSPs. VPLS is the next obvious candidate, and using M-LSPs there may result in optimal broadcast and multicast forwarding in VPLS deployments. You may see the VPLS multicast draft for more details: draft-ietf-l2vpn-vpls-mcast-05

Summary

M-LDP is not yet completely standardized, even though the work has been in progress for many years. Experimental Cisco IOS trains support this feature, but those are not yet available to the general public. The main feature of M-LDP signaled M-LSPs is that they don’t require any other protocol aside from LDP, are dynamically initiated by the leaf nodes and are very generic with respect to the applications that could use them.

RSVP-TE Point-to-Multipoint LSPs

Along with the still-work-in-progress M-LDP draft, there is another standard technique for signaling P2MP LSPs using the RSVP protocol. This technique has long been employed in Juniper’s SP multicast solution and thus is quite mature compared to M-LDP. Recently, RSVP-signaled P2MP LSPs have been officially added to many Cisco IOS trains, including the IOS version for the 7200 platform. The RSVP extensions are officially standardized in RFC 4875.

Signaling Procedures

This standard employs the concept of P2MP LSPs statically signaled at the head-end router using a fixed list of destinations (leaves). Every destination defines a so-called “sub-LSP” construct of the P2MP LSP. The sub-LSP is an LSP originating at the root and terminating at a particular leaf. RSVP signaling for P2MP sub-LSPs is based on sending RSVP PATH messages to all destinations (sub-LSPs) independently, though multiple sub-LSP descriptions could be carried in a single PATH message. Every leaf node then sends RSVP RESV messages back to the root node of the P2MP LSP, just like in regular RSVP-TE. However, the nodes where multiple RSVP RESV messages “merge” should allocate only a single label for those “merging” sub-LSPs and advertise it to the upstream node using either multiple or a single RESV message. Effectively, this “branches” the sub-LSPs out of the “trunk”, even though signaling for the sub-LSPs is performed separately. The whole procedure is somewhat similar to the situation where a source uses RSVP to reserve bandwidth for a multicast traffic flow, with flow state merging at the branching point. This is no surprise, as the P2MP LSP idea was modeled after multicast distribution shortest-path trees.

p2mp-lsp-primer-rsvp-te-signaling

The only “edge” (that is, PE-CE) multicast tree construction protocol supported with RSVP-TE M-LSPs is PIM SSM, as static M-LSPs do not support dynamic source discovery by means of an RP. Plus, as we mentioned above, every leaf node allocates a non-null label for the incoming sub-LSP tunnel. This allows the leaf router to associate the incoming packets with a particular M-LSP and properly perform the RPF check using the tunnel source address. Cisco IOS creates a special virtual interface known as LSPVIF and associates the multicast RPF route toward the M-LSP tunnel source with this interface. Every new M-LSP will create another LSPVIF interface and another static mroute. In addition to this, you need to manually configure static multicast routes for the multicast sources that will be using this tunnel and associate them with the tunnel’s source address. An example will be shown later.

Notice that RSVP-signaled M-LSPs are currently supported only within a single IGP area and there is no support for Inter-AS M-LSPs. However, we may expect those extensions in the future, as they exist for P2P LSPs.

Benefits, Limitations and Applications

The huge benefit of RSVP-signaled P2MP LSPs is that every sub-LSP could be constructed using cSPF, honoring bandwidth, affinity and other constraints. You may also manually specify explicit paths in situations where you use an offline traffic-engineering strategy. The actual merging of the sub-LSPs allows for accurate bandwidth accounting and avoids duplicate reservations for the same flow. The second biggest benefit is that, since every sub-LSP is signaled explicitly via RSVP, the whole P2MP LSP construct could be natively protected using RSVP FRR mechanics per RFC 4090. Finally, just like with M-LDP, RSVP-signaled M-LSPs do not require running any other signaling protocol such as PIM in the SP network core.

In general, RSVP-signaled P2MP LSPs are an excellent solution for multicast IP-TV deployments, as they allow for accurate bandwidth reservation and admission control along with FRR protection. Plus, this solution has been around for quite a while, meaning it has some maturity compared to the still-developing M-LDP. However, the main drawback of the RSVP-based solution is the static nature of RSVP-TE signaled P2MP LSPs, which limits their use for Multicast VPN deployments. You may combine P2MP LSPs with the domain-based Multicast VPN solution, but this limits your M-VPNs to a single Default MDT tunneled inside the P2MP LSP.

In addition to the IP-TV application, RSVP-TE M-LSPs form a solution to improve multicast handling in VPLS. Effectively, since VPLS by its nature does not involve dynamic setup of pseudowires, the static M-LSP tunnels could be used for optimal traffic forwarding across the SP core. There is an IETF draft outlining this solution: draft-raggarwa-l2vpn-vpls-mcast-01.

M-LSP RSVP-TE Lab Scenario

For a long time, Cisco held off on releasing support for RSVP-TE signaled M-LSPs to the wider audience, probably because the main competitor (Juniper) actively supported and developed this solution. However, there was an implementation used for limited deployments, recently made public in IOS version 12.2(33)SRE. The support now includes even the low-end platforms such as the 7200VXR, which allows using Dynamips for lab experiments. We will now review a practical example of P2MP LSP setup using the following topology:

p2mp-lsp-primer-lab-scenario

Here, R1 sets up an M-LSP to R2, R3 and R4 and configures the tunnel to send multicast traffic down to those nodes. Notice a few details about this scenario. Every node is configured to enable automatic primary one-hop tunnels and NHOP backup tunnels. This is a common way of protecting links in the SP core, which allows for automatic unicast traffic protection. If you are not familiar with this feature, you may read the following document: MPLS TE Auto-Tunnels.

The diagram above displays some of the detour LSPs that are automatically created for protection of the “peripheral” links. In the topology outlined, every node will originate three primary tunnels and three backup NHOP tunnels. First, look at the template configuration shared by all routers:

Shared Template

!
! Set up auto-tunnels on all routers
!
mpls traffic-eng tunnels
mpls traffic-eng auto-tunnel backup nhop-only
mpls traffic-eng auto-tunnel primary onehop
!
! Enable multicast traffic routing over MPLS TE tunnels
!
ip multicast mpls traffic-eng
!
! Enable fast route withdrawn when an interface goes down
!
ip routing protocol purge interface
!
router ospf 1
network 0.0.0.0 0.0.0.0 area 0
mpls traffic-eng area 0
mpls traffic-eng router-id Loopback0
mpls traffic-eng multicast-intact
!
! The below commands are only needed on leaf/root nodes: R1-R4
!
ip multicast-routing
ip pim ssm default

Notice a few interesting commands in this configuration:

  • The command ip multicast mpls traffic-eng enables multicast traffic to be routed over M-LSP tunnels. This still requires configuring a static IGMP join on the M-LSP tunnel interface as the tunnel is unidirectional.
  • The command ip routing protocol purge interface is needed on the leaf nodes that perform the RPF check. When a connected interface goes down, the IOS code normally runs a special walker process that goes through the RIB and deletes the affected routes. This may take considerable time and affect RPF checks if the sub-LSP was rerouted over a different incoming interface. The demonstrated command allows for fast purging of the RIB by notifying the affected IGPs and allowing the protocol process to quickly clear the associated routes.
  • The command mpls traffic-eng multicast-intact is very important in our context, as we activate MPLS TE tunnels and enable auto-route announcements. By default, this may affect the RPF check information for the M-LSP tunnel source and force the leaf nodes to drop the tunneled multicast packets. This command should be configured along with the static multicast routes as shown below.

Head-End configuration

R1:
hostname R1
!
interface Loopback 0
ip address 10.1.1.1 255.255.255.255
!
interface Serial 2/0
no shutdown
encapsulation frame-relay
no frame-relay inverse-arp
!
interface Serial 2/0.12 point-to-point
frame-relay interface-dlci 102
ip address 10.1.12.1 255.255.255.0
bandwidth 10000
mpls traffic-eng tunnels
ip rsvp bandwidth 10000
!
interface Serial 2/0.15 point-to-point
frame-relay interface-dlci 105
ip address 10.1.15.1 255.255.255.0
bandwidth 10000
mpls traffic-eng tunnels
ip rsvp bandwidth 10000
!
interface Serial 2/0.14 point-to-point
frame-relay interface-dlci 104
ip address 10.1.14.1 255.255.255.0
bandwidth 10000
mpls traffic-eng tunnels
ip rsvp bandwidth 10000
!
interface FastEthernet 0/0
ip address 172.16.1.1 255.255.255.0
ip pim passive
no shutdown
!
mpls traffic-eng destination list name P2MP_TO_R2_R3_R4_DYN
ip 10.1.2.2 path-option 10 dynamic
ip 10.1.3.3 path-option 10 dynamic
ip 10.1.4.4 path-option 10 dynamic
!
interface Tunnel 0
description R1 TO R2, R3, R4
ip unnumbered Loopback0
ip pim passive
ip igmp static-group 232.1.1.1 source 172.16.1.1
tunnel mode mpls traffic-eng point-to-multipoint
tunnel destination list mpls traffic-eng name P2MP_TO_R2_R3_R4_DYN

tunnel mpls traffic-eng priority 7 7
tunnel mpls traffic-eng bandwidth 5000
tunnel mpls traffic-eng fast-reroute
  • First, notice that PIM is not enabled on any of the core-facing interfaces. This is because M-LSP signaling does not require any multicast setup in the core. Secondly, notice the use of the command ip pim passive. This long-awaited command enables multicast forwarding out of the interface but does not allow any PIM adjacencies to be established.
  • The command mpls traffic-eng destination list defines multiple destinations to be used for the M-LSP. Every destination could have one or more path options, and every path option could be either dynamic or explicit; here we use only dynamic path options for every destination (a sketch of an explicit path option follows this list).
  • Notice the M-LSP tunnel configuration. The tunnel mode is set to multipoint and the destination is set to the list created above. Pay attention to the fact that PIM passive mode is enabled on the tunnel interface along with a static IGMPv3 join. This allows the specific multicast flow to be routed down the M-LSP.
  • Lastly, pay attention to the fact that the tunnel is configured with TE attributes, such as bandwidth, and is enabled for FRR protection. The syntax used is the same as with P2P LSPs.
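As an illustration of the second bullet above, the snippet below sketches how the entry for R3 could be pinned to an explicit path instead of a dynamic one. This is a hypothetical variation, not part of the lab as built: the path name EXPLICIT_TO_R3 is made up, the hops simply follow the R1-R2-R3 path seen later in the verification outputs, and the exact path-option syntax inside the destination list should be double-checked against your IOS release.

!
! Hypothetical explicit path for the Sub-LSP to R3 (illustration only)
!
ip explicit-path name EXPLICIT_TO_R3 enable
next-address 10.1.12.2
next-address 10.1.23.3
next-address 10.1.3.3
!
mpls traffic-eng destination list name P2MP_TO_R2_R3_R4_DYN
ip 10.1.3.3 path-option 10 explicit name EXPLICIT_TO_R3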

Midpoint and Tail-End Configuration

R2:
hostname R2
!
interface Loopback 0
ip address 10.1.2.2 255.255.255.255
!
interface Serial 2/0
no shutdown
encapsulation frame-relay
no frame-relay inverse-arp
!
interface Serial 2/0.12 point-to-point
frame-relay interface-dlci 201
ip address 10.1.12.2 255.255.255.0
bandwidth 10000
mpls traffic-eng tunnels
ip rsvp bandwidth 10000
!
interface Serial 2/0.26 point-to-point
frame-relay interface-dlci 206
ip address 10.1.26.2 255.255.255.0
bandwidth 10000
mpls traffic-eng tunnels
ip rsvp bandwidth 10000
!
interface Serial 2/0.23 point-to-point
frame-relay interface-dlci 203
ip address 10.1.23.2 255.255.255.0
bandwidth 10000
mpls traffic-eng tunnels
ip rsvp bandwidth 10000
!
interface FastEthernet 0/0
ip address 172.16.2.1 255.255.255.0
ip pim passive
ip igmp version 3
ip igmp join-group 232.1.1.1 source 172.16.1.1
no shutdown
!
ip mroute 172.16.1.0 255.255.255.0 10.1.1.1

The only special command added here is the static multicast route. It provides RPF information for the multicast sources that will be sending traffic across the M-LSP. Notice that the mroute points toward the respective tunnel source, which is mapped to the automatically created LSPVIF interface.
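If you want to see what the static mroute actually resolves to, a quick check with show ip rpf against one of the sources is illustrative. The output below is only a sketch of what this check is expected to return on this image (the exact field wording differs between releases); the key points are that the RPF interface is the LSPVIF interface and that the lookup is satisfied by the static mroute rather than by the unicast RIB.

R2#show ip rpf 172.16.1.1
RPF information for ? (172.16.1.1)
RPF interface: Lspvif0
RPF neighbor: ? (10.1.1.1)
RPF route/mask: 172.16.1.0/24
RPF type: multicast (static)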

M-LSP RSVP-TE Scenario Verifications

The following is a list of the steps you may apply to verify the scenario configuration.

Step 1

Verify M-LSP setup and labeling

R1#show mpls traffic-eng tunnels dest-mode p2mp brief 
Signalling Summary:
LSP Tunnels Process: running
Passive LSP Listener: running
RSVP Process: running
Forwarding: enabled
auto-tunnel:
backup Enabled (3 ), id-range:65436-65535
onehop Enabled (3 ), id-range:65336-65435
mesh Disabled (0 ), id-range:64336-65335

Periodic reoptimization: every 3600 seconds, next in 515 seconds
Periodic FRR Promotion: Not Running
Periodic auto-tunnel:
primary establish scan: every 10 seconds, next in 3 seconds
primary rm active scan: disabled
backup notinuse scan: every 3600 seconds, next in 640 seconds
Periodic auto-bw collection: every 300 seconds, next in 215 seconds

P2MP TUNNELS:
DEST CURRENT
INTERFACE STATE/PROT UP/CFG TUNID LSPID
Tunnel0 up/up 3/3 0 28
Displayed 1 (of 1) P2MP heads

P2MP SUB-LSPS:
SOURCE TUNID LSPID DESTINATION SUBID STATE UP IF DOWN IF
10.1.1.1 0 28 10.1.2.2 1 Up head Se2/0.12
10.1.1.1 0 28 10.1.3.3 2 Up head Se2/0.12
10.1.1.1 0 28 10.1.4.4 3 Up head Se2/0.14

Displayed 3 P2MP sub-LSPs:
3 (of 3) heads, 0 (of 0) midpoints, 0 (of 0) tails

There are three Sub-LSPs going out of the head-end node, toward R2, R3 and R4. You may now check the detailed information, including the outgoing labels assigned to every Sub-LSP:

R1#show mpls traffic-eng tunnels dest-mode p2mp

P2MP TUNNELS:

Tunnel0 (p2mp), Admin: up, Oper: up
Name: R1 TO R2, R3, R4

Tunnel0 Destinations Information:

Destination State SLSP UpTime
10.1.2.2 Up 00:51:08
10.1.3.3 Up 00:51:08
10.1.4.4 Up 00:51:08

Summary: Destinations: 3 [Up: 3, Proceeding: 0, Down: 0 ]
[destination list name: P2MP_TO_R2_R3_R4_DYN]

History:
Tunnel:
Time since created: 53 minutes, 16 seconds
Time since path change: 51 minutes, 5 seconds
Number of LSP IDs (Tun_Instances) used: 28
Current LSP: [ID: 28]
Uptime: 51 minutes, 8 seconds
Selection: reoptimization
Prior LSP: [ID: 20]
Removal Trigger: re-route path verification failed

Tunnel0 LSP Information:
Configured LSP Parameters:
Bandwidth: 5000 kbps (Global) Priority: 7 7 Affinity: 0x0/0xFFFF
Metric Type: TE (default)

Session Information
Source: 10.1.1.1, TunID: 0

LSPs
ID: 28 (Current), Path-Set ID: 0x90000003
Sub-LSPs: 3, Up: 3, Proceeding: 0, Down: 0

Total LSPs: 1

P2MP SUB-LSPS:

LSP: Source: 10.1.1.1, TunID: 0, LSPID: 28
P2MP ID: 0, Subgroup Originator: 10.1.1.1
Name: R1 TO R2, R3, R4
Bandwidth: 5000, Global Pool

Sub-LSP to 10.1.2.2, P2MP Subgroup ID: 1, Role: head
Path-Set ID: 0x90000003
OutLabel : Serial2/0.12, 19
Next Hop : 10.1.12.2
FRR OutLabel : Tunnel65436, 19
Explicit Route: 10.1.12.2 10.1.2.2
Record Route (Path): NONE
Record Route (Resv): 10.1.2.2(19)

Sub-LSP to 10.1.3.3, P2MP Subgroup ID: 2, Role: head
Path-Set ID: 0x90000003
OutLabel : Serial2/0.12, 19
Next Hop : 10.1.12.2
FRR OutLabel : Tunnel65436, 19
Explicit Route: 10.1.12.2 10.1.23.3 10.1.3.3
Record Route (Path): NONE
Record Route (Resv): 10.1.2.2(19) 10.1.3.3(20)

Sub-LSP to 10.1.4.4, P2MP Subgroup ID: 3, Role: head
Path-Set ID: 0x90000003
OutLabel : Serial2/0.14, 19
Next Hop : 10.1.14.4
FRR OutLabel : Tunnel65437, 19
Explicit Route: 10.1.14.4 10.1.4.4
Record Route (Path): NONE
Record Route (Resv): 10.1.4.4(19)

In the output above, notice that the Sub-LSPs to R2 and R3 share the same label and outgoing interface, meaning that R1 has merged the two Sub-LSPs. The output also shows that every Sub-LSP is protected by means of an FRR NHOP tunnel (we will double-check the protection state at the end of this step). You may also notice the explicit paths for every Sub-LSP that have been calculated at the head-end based on the area topology and constraints. Next, check the multicast routing table at R1 to make sure the M-LSP is used as an outgoing interface for multicast traffic.

R1#show ip mroute 232.1.1.1 
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
V - RD & Vector, v - Vector
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(172.16.1.1, 232.1.1.1), 00:59:57/stopped, flags: sTI
Incoming interface: FastEthernet0/0, RPF nbr 0.0.0.0
Outgoing interface list:
Tunnel0, Forward/Sparse-Dense, 00:58:17/00:01:42

R1#show ip mfib 232.1.1.1
Entry Flags: C - Directly Connected, S - Signal, IA - Inherit A flag,
ET - Data Rate Exceeds Threshold, K - Keepalive
DDE - Data Driven Event, HW - Hardware Installed
I/O Item Flags: IC - Internal Copy, NP - Not platform switched,
NS - Negate Signalling, SP - Signal Present,
A - Accept, F - Forward, RA - MRIB Accept, RF - MRIB Forward
Forwarding Counts: Pkt Count/Pkts per second/Avg Pkt Size/Kbits per second
Other counts: Total/RPF failed/Other drops
I/O Item Counts: FS Pkt Count/PS Pkt Count
Default
(172.16.1.1,232.1.1.1) Flags:
SW Forwarding: 0/0/0/0, Other: 0/0/0
FastEthernet0/0 Flags: A
Tunnel0 Flags: F NS
Pkts: 0/0
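Since every Sub-LSP above reports an FRR OutLabel pointing to one of the auto-created NHOP backup tunnels (Tunnel65436 and Tunnel65437), you may additionally confirm the protection state from the head-end using the standard MPLS TE FRR commands below. Their exact output varies by release and is not reproduced here; what you should look for is the Sub-LSPs being listed as ready/protected by the respective backup tunnels.

R1#show mpls traffic-eng fast-reroute database
R1#show mpls traffic-eng tunnels backup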

Step 2

Look at R2, which is a bud router in this scenario. It acts as a midpoint for the Sub-LSP from R1 to R3 and as a tail-end for the Sub-LSP to R2. The output below demonstrates this behavior:

R2#show mpls traffic-eng tunnels dest-mode p2mp 

P2MP TUNNELS:

P2MP SUB-LSPS:

LSP: Source: 10.1.1.1, TunID: 0, LSPID: 28
P2MP ID: 0, Subgroup Originator: 10.1.1.1
Name: R1 TO R2, R3, R4
Bandwidth: 5000, Global Pool

Sub-LSP to 10.1.2.2, P2MP Subgroup ID: 1, Role: tail
Path-Set ID: 0x8000003
InLabel : Serial2/0.12, 19
Prev Hop : 10.1.12.1
OutLabel : -
Explicit Route: NONE
Record Route (Path): NONE
Record Route (Resv): NONE

Sub-LSP to 10.1.3.3, P2MP Subgroup ID: 2, Role: midpoint
Path-Set ID: 0x8000003
InLabel : Serial2/0.12, 19
Prev Hop : 10.1.12.1
OutLabel : Serial2/0.23, 20
Next Hop : 10.1.23.3
FRR OutLabel : Tunnel65437, 20
Explicit Route: 10.1.23.3 10.1.3.3
Record Route (Path): NONE
Record Route (Resv): 10.1.3.3(20)

The output shows that the Sub-LSP to R2 terminates locally, while the Sub-LSP to R3 is label switched using incoming label 19 and outgoing label 20. In other words, incoming packets are decapsulated locally and label-switched further in parallel. The output below demonstrates that the source 172.16.1.1 is seen on R2 as being connected to LSPVIF0 by virtue of the static multicast route configured. The received packets are replicated out of the connected FastEthernet interface.

R2#show ip mroute 232.1.1.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
V - RD & Vector, v - Vector
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(172.16.1.1, 232.1.1.1), 01:08:31/00:02:37, flags: sLTI
Incoming interface: Lspvif0, RPF nbr 10.1.1.1, Mroute
Outgoing interface list:
FastEthernet0/0, Forward/Sparse-Dense, 01:08:29/00:02:37

R2#show ip mfib 232.1.1.1
Entry Flags: C - Directly Connected, S - Signal, IA - Inherit A flag,
ET - Data Rate Exceeds Threshold, K - Keepalive
DDE - Data Driven Event, HW - Hardware Installed
I/O Item Flags: IC - Internal Copy, NP - Not platform switched,
NS - Negate Signalling, SP - Signal Present,
A - Accept, F - Forward, RA - MRIB Accept, RF - MRIB Forward
Forwarding Counts: Pkt Count/Pkts per second/Avg Pkt Size/Kbits per second
Other counts: Total/RPF failed/Other drops
I/O Item Counts: FS Pkt Count/PS Pkt Count
Default
(172.16.1.1,232.1.1.1) Flags:
SW Forwarding: 0/0/0/0, Other: 0/0/0
Lspvif0, LSM/0 Flags: A
FastEthernet0/0 Flags: F IC NS
Pkts: 0/0
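Before moving on, you can also cross-check the bud behavior in the LFIB. The command below is a standard one, although the exact presentation of P2MP entries varies by release, so the output is not reproduced here; what you should see is local label 19 being both terminated locally and replicated out of Serial2/0.23 with outgoing label 20.

R2#show mpls forwarding-table labels 19 detail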

Step 3

Now it’s time to run a real test. We need a source connected to R1, because ICMP packets sourced from R1 itself are process-switched and would not be MPLS encapsulated. This seems to be a behavior of the newly implemented MFIB-based multicast switching. Notice that the source uses the IP address of 172.16.1.1 for proper SSM multicast flooding:

R7#ping 232.1.1.1 repeat 100

Type escape sequence to abort.
Sending 100, 100-byte ICMP Echos to 232.1.1.1, timeout is 2 seconds:

Reply to request 0 from 172.16.4.1, 80 ms
Reply to request 0 from 172.16.3.1, 108 ms
Reply to request 1 from 172.16.4.1, 32 ms
Reply to request 1 from 172.16.3.1, 44 ms
Reply to request 1 from 172.16.2.1, 40 ms

If you enable the following debugging on R2 prior to running the pings, you will see output similar to the one below:

R2#debug ip mfib ps
MFIB IPv4 ps debugging enabled for default IPv4 table

R2#debug mpls packet
Packet debugging is on

MPLS turbo: Se2/0.12: rx: Len 108 Stack {19 0 254} - ipv4 data
MPLS les: Se2/0.12: rx: Len 108 Stack {19 0 254} - ipv4 data
MPLS les: Se2/0.23: tx: Len 108 Stack {20 0 253} - ipv4 data

MFIBv4(0x0): Receive (172.16.1.1,232.1.1.1) from Lspvif0 (FS): hlen 5 prot 1 len 100 ttl 253 frag 0x0
FIBfwd-proc: Default:224.0.0.0/4 multicast entry
FIBipv4-packet-proc: packet routing failed

Per the output above, the packet labeled with 19 is received and switched out with label 20, which is in accordance with the output we saw when checking the M-LSP tunnel. You can also see MFIB output demonstrating that a multicast packet was received on the LSPVIF interface. The forwarding fails, as there are no outgoing encapsulation entries, but the router itself still responds to the ICMP echo packet.

Summary

We reviewed the concept of M-LSPs – how they evolved from tag-switched multicast to a generic technology suitable for transporting multipoint traffic flows, not limited to multicast only. We briefly outlined the two main signaling methods for establishing M-LSPs – multipoint LDP and the RSVP-TE extensions for P2MP LSPs. Both approaches have their benefits and drawbacks, but only the RSVP-TE method has been widely deployed so far and was recently offered to the public by Cisco. Both methods are still evolving toward support for broader application sets, such as multicast VPNs and VPLS.

Further Reading

Many documents have been mentioned throughout this post. They are summarized below along with some additional reading.

Tag Switching Architecture Overview (Historical)
Partitioning Tag Space among Multicast Routers on a Common Subnet (Historical)
Multipoint Signaling Extensions to LDP
MPLS Upstream Label Assignment and Context Specific Label Space
Multicast Multipath
Multicasting in VPLS
RFC4875 RSVP-TE Extensions for P2MP LSPS
MPLS TE Auto-Tunnels
MPLS Point-to-Multipoint Traffic Engineering
Using Multipoint LSPs for IP-TV Deployments
Inter-AS mVPNs: MDT SAFI, BGP Connector & RPF Proxy Vector

Jan
17

Inter-AS mVPNs: MDT SAFI, BGP Connector & RPF Proxy Vector

Abstract:

Inter-AS Multicast VPN solutions introduce some challenges in cases where the peering autonomous systems implement a BGP-free core. This post illustrates a known solution to this problem, implemented in Cisco IOS software. The solution involves the use of special MP-BGP and PIM extensions. The reader is assumed to have an understanding of Cisco's basic mVPN implementation, the PIM protocol and the Multi-Protocol BGP extensions.

Abbreviations used

mVPN – Multicast VPN
MSDP – Multicast Source Discovery Protocol
PE – Provider Edge
CE – Customer Edge
RPF – Reverse Path Forwarding
MP-BGP – Multi-Protocol BGP
PIM – Protocol Independent Multicast
PIM SM – PIM Sparse Mode
PIM SSM – PIM Source Specific Multicast
LDP – Label Distribution Protocol
MDT – Multicast Distribution Tree
P-PIM – Provider Facing PIM Instance
C-PIM – Customer Facing PIM Instance
NLRI – Network Layer Reachability Information

Inter-AS mVPN Overview

A typical "classic" Inter-AS mVPN solution leverages the following key components:

  • PIM-SM maintaining separate RP for every AS
  • MSDP used to exchange information on active multicast sources
  • (Optionally) MP-BGP multicast extension to propagate information about multicast prefixes for RPF validations

With this solution, PEs participating in the same MDT discover each other by joining the shared tree towards the local RP and listening to the multicast packets sent by other PEs. Those PEs may belong to the same or to different Autonomous Systems; in the latter case, the sources are discovered by virtue of MSDP peering. This scenario assumes that every router in the local AS has complete routing information about the multicast sources (the PEs' loopback addresses) residing in the other system. Such information is necessary for the purpose of the RPF check. In turn, this leads to the requirement of running BGP on all P routers OR redistributing the MP-BGP (BGP) multicast prefix information into the IGP. The redistribution approach clearly has limited scalability, while the other method requires enabling BGP on the P routers, which defeats the idea of a BGP-free core.
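For reference, the building blocks of this classic approach boil down to only a few configuration lines on each AS's RP/ASBR. The snippet below is a minimal, hypothetical sketch – the addresses and peer definitions are made up for illustration and have nothing to do with the case study later in this post.

!
! Hypothetical RP/ASBR in AS 100 peering with its counterpart in AS 200
!
ip pim rp-address 10.0.100.100
ip msdp peer 20.0.200.200 connect-source Loopback0 remote-as 200
!
router bgp 100
neighbor 20.0.200.200 remote-as 200
address-family ipv4 multicast
neighbor 20.0.200.200 activate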

An alternative to running PIM in Sparse Mode is PIM SSM, which relies on out-of-band information for multicast source discovery. For this case, Cisco released a draft proposal defining a new MP-BGP MDT SAFI that is used to propagate MDT information and the associated PE addresses. Let’s make a short detour into MP-BGP to get a better understanding of SAFIs.

MP-BGP Overview

Recall the classic BGP UPDATE message format. It consists of the following sections: [Withdrawn Prefixes (Optional)] + [Path Attributes] + [NLRIs]. The Withdrawn Prefixes and NLRIs are IPv4 prefixes, and their structure does not support any other network protocols. The Path Attributes (e.g. AS_PATH, ORIGIN, LOCAL_PREF, NEXT_HOP) are associated with all NLRIs; prefixes sharing a different set of path attributes must be carried in a separate UPDATE message. Also, notice that NEXT_HOP is an IPv4 address as well.

In order to introduce support for non-IPv4 network protocols into BGP, two new optional non-transitive Path Attributes have been added to BGP. The first attribute, known as MP_REACH_NLRI, has the following structure: [AFI/SAFI] + [NEXT_HOP] + [NLRI]. Both NEXT_HOP and NLRI are formatted according to the protocol identified by the AFI/SAFI pair, where AFI and SAFI stand for Address Family Identifier and Subsequent Address Family Identifier respectively. For example, this could be an IPv6 or CLNS prefix. Thus, all information about non-IPv4 prefixes is encoded in a new BGP Path Attribute. A typical BGP UPDATE message that carries MP_REACH_NLRI attributes has no “classic” NEXT_HOP attribute and none of the “Withdrawn Prefixes” or “NLRIs” found in normal UPDATE messages. For next-hop calculations, a receiving BGP speaker should use the information found in the MP_REACH_NLRI attribute. The multi-protocol UPDATE message may still contain other BGP path attributes such as AS_PATH, ORIGIN, MED, LOCAL_PREF and so on; this time, however, those attributes are associated with the non-IPv4 prefixes found in the attached MP_REACH_NLRI attributes.

The second attribute, MP_UNREACH_NLRI, has a format similar to MP_REACH_NLRI but lists the “multi-protocol” addresses to be withdrawn. No other path attributes need to be associated with it; an UPDATE message may simply contain a list of MP_UNREACH_NLRI attributes.

The list of supported AFIs may be found in RFC1700 (though it’s obsolete now, it is still very informative) – for example, AFI 1 stands for IPv4, AFI 2 stands for IPv6 etc. The subsequent AFI is needed to clarify the purpose of the information found in MP_REACH_NLRI. For example, SAFI value of 1 means the prefixes should be used for unicast forwarding, SAFI 2 means the prefixes are to be used for multicast RPF checks and SAFI 3 means the prefixes could be used for both purposes. Last, but not least – SAFI of 128 means MPLS labeled VPN address.
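In Cisco IOS terms, each of these AFI/SAFI combinations simply maps to an address family activated under the BGP process. The fragment below is only meant to connect the address-family keywords used later in this post to their AFI/SAFI meaning; the SAFI value quoted for the MDT family (66) is the one assigned to Cisco's MDT SAFI proposal and is given here purely for reference.

router bgp 100
! AFI 1 / SAFI 1 - regular unicast forwarding
address-family ipv4 unicast
! AFI 1 / SAFI 2 - prefixes used for multicast RPF checks
address-family ipv4 multicast
! AFI 1 / SAFI 128 - MPLS labeled VPN prefixes
address-family vpnv4 unicast
! AFI 1 / SAFI 66 - MDT group/PE bindings (MDT SAFI)
address-family ipv4 mdt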

Just as a reminder, the BGP process performs best-path selection separately for the “classic” IPv4 prefixes and for the prefixes of every AFI/SAFI pair, based on the path attributes. This allows for independent route propagation for the addresses found in different address families. Since a given BGP speaker may not support particular network protocols, the list of supported AFI/SAFI pairs is advertised using the BGP capabilities feature (another BGP extension), and a particular network protocol's information is only propagated if both speakers support it.

MDT SAFI Overview

Cisco drafted a new SAFI to be used with the regular AFIs such as IPv4 or IPv6. This SAFI is needed to propagate the MDT group address together with the associated PE’s Loopback address. The format for AFI 1 (IPv4) is as follows:

MP_NLRI = [RD:PE’s IPv4 Address]:[MDT Group Address],
MP_NEXT_HOP = [BGP Peer IPv4 Address].

Here, RD is the RD corresponding to the VRF that has the MDT configured, and the “IPv4 address” is the respective PE router’s Loopback address. Normally, per Cisco’s rules, this is the same Loopback interface used for VPNv4 peering, although this could be changed using the VRF-level bgp next-hop command.
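To make the format more concrete, consider a PE with RD 100:1, a Loopback address of 10.0.1.1 and a default MDT group of 232.1.1.1 – the values used by R1 in the case study later in this post. Following the format above, such a PE would originate roughly the following MDT SAFI update (shown before any ASBR next-hop rewrite):

MP_NLRI = [100:1:10.0.1.1]:[232.1.1.1],
MP_NEXT_HOP = [10.0.1.1].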

If all PEs in the local AS exchange this information and pass it to PIM SSM, the P-PIM (Provider PIM, facing the SP core) process will be able to build (S,G) trees for the MDT group address towards the other PEs’ IPv4 addresses. This works by virtue of the fact that the PEs’ IPv4 addresses are known via the IGP, as all PEs are in the same AS. There are no problems using a BGP-free core for intra-AS mVPN with PIM-SSM. It is also worth mentioning that a precursor to the MDT SAFI was a special extended community used along with the VPNv4 address family. MP-BGP would use an RD of type 2 (not applicable to any unicast VRF) to transport the associated PE’s IPv4 address along with an extended community that contains the MDT group address. Since the extended community was non-transitive, this allowed the “bootstrap” information needed for PIM SSM discovery to be distributed only inside a single AS. This temporary solution was replaced by the MDT SAFI draft.

Next, consider the case of Inter-AS VPN where at least one AS uses a BGP-free core. When two peering Autonomous Systems activate the IPv4 MDT SAFI, the ASBRs will advertise all the information learned from their PEs to each other. The information will further propagate down to each AS’s PEs. Next, the P-PIM processes will attempt to build (S,G) trees towards the PE IP addresses in the neighboring systems. Even though the PEs may know the other PEs' addresses (e.g. if Inter-AS VPN Option C is being used), the P-routers don’t have this information. If Inter-AS VPN Option B is in use, even the PE routers will have no proper information to build the (S,G) trees.

RPF Proxy Vector

The solution to this problem uses a modification to the PIM protocol and to the RPF check functionality. Known as RPF Proxy Vector, it defines a new PIM TLV that contains the IPv4 address of the “proxy” router used for RPF checks and as an intermediate destination for PIM Joins. Let’s see how it works in a particular scenario.

The diagram below shows AS 100 and AS 200 using Inter-AS VPN Option B to exchange VPNv4 routes. PEs and ASBRs peer via BGP and exchange VPNv4 and IPv4 MDT SAFI prefixes. For every external prefix relayed to its own PEs, an ASBR changes the next-hop found in MP_REACH_NLRI to its local Loopback address. For VPNv4 prefixes, this achieves the goal of terminating the LSP on the ASBR. For MDT SAFI prefixes, this procedure sets the IPv4 address to be used as the “proxy” in PIM Joins.

mvpn-inter-as-basic

Let’s say that the MDT group used by R1 and R5 is 232.1.1.1. When R1 receives the MDT SAFI update with the MDT value of 200:1:232.1.1.1, PE IPv4 address 20.0.5.5 and the next-hop value of 10.0.3.3 (R3’s Loopback0 interface), it will pass this information down to the PIM process. The PIM process will construct a PIM Join for group 232.1.1.1 towards the IP address 20.0.5.5 (not known in AS 100) and insert a proxy vector value of 10.0.3.3. The PIM process will then use the route to 10.0.3.3 to find the next upstream PIM peer to send the Join message to. Every P-router will process the PIM Join message with the proxy vector and use the proxy IPv4 address to relay the message upstream. As soon as the message reaches the proxy router (in our case R3), the proxy vector is removed and the PIM Joins propagate further using the regular procedure, as the domain behind the proxy is supposed to have visibility of the actual Join target.

In addition to using the proxy vector to relay the PIM Join upstream, every router creates a special mroute state for the (S,G) pair where S is the PE IPv4 address and G is the MDT group. This mroute state has the proxy IPv4 address associated with it. When a matching multicast packet going from the external PE towards the MDT address hits the router, the RPF check is performed based on the upstream interface associated with the proxy IPv4 address, not the actual source IPv4 address found in the packet. For example, in our scenario, R2 would have an mroute state for (20.0.5.5, 232.1.1.1) with the proxy IPv4 address of 10.0.3.3. All packets coming from R5 to 232.1.1.1 will be RPF checked based on the upstream interface towards 10.0.3.3.

Using the above-described “proxy-based” procedure, the P routers may successfully perform RPF checks for packets whose source IPv4 addresses are not found in the local RIB. The tradeoff is the amount of multicast state information that has to be stored in the P-routers’ memory – in the worst case, where every PE participates in every mVPN, it is proportional to the number of PEs multiplied by the number of mVPNs. For example, 100 PEs each participating in 50 mVPNs would translate to roughly 5000 (S,G) entries on a P-router for the default MDTs alone. There could be even more multicast route state in situations where Data MDTs are used in addition to the Default MDT.

BGP Connector Attribute

One more piece of information is needed for “intra-VPN” operations: joining a PIM tree towards a particular IP address inside a VPN and performing an RPF check inside the VPN. Consider the use of Inter-AS VPN Option B, where VPNv4 prefixes have their MP_REACH_NLRI next-hop changed to the local ASBR’s IPv4 address. When a local PE receives a multicast packet on the MDT tunnel interface, it decapsulates it and performs a source IPv4 address lookup inside the VRF’s table. Based on the MP-BGP learned routes, the next-hop would point towards the ASBR (Option B), while the packets might be arriving across a different inter-AS link running the multicast MP-BGP peering. Thus, relying solely on the unicast next-hop may not be sufficient for Inter-AS RPF checks.

For example, look at the figure below, where R3 and R4 run MP-BGP for VPNv4 while R4 and R6 run the multicast MP-BGP extension. R1 peers with both ASBRs and learns VPNv4 prefixes from R3, while it learns the MDT SAFI information and PE IPv4 addresses from R6. PIM is enabled only on the link connecting R6 and R4.

mvpn-inter-as-diverse-paths

In this situation, the RPF lookup would fail, as the MDT SAFI information is exchanged across the link running multicast BGP, while the VPNv4 prefixes’ next-hop points to R3. Thus, a method is required to preserve the information needed for the RPF lookup.

Cisco suggested the use of a new optional transitive attribute named BGP Connector, to be exported along with the VPNv4 prefixes out of a PE hosting an mVPN. This attribute contains the following two components: [AFI/SAFI] + [Connector Information] and, in general, defines the information needed by the network protocol identified by the AFI/SAFI pair to connect to the routing information found in MP_REACH_NLRI. If AFI=IPv4 and SAFI=MDT, the connector attribute contains the IPv4 address of the router originating the prefixes associated with the VRF that has the MDT configured.

The customer-facing PIM process (C-PIM) on the PE routers will use the information found in the BGP Connector attribute to perform the intra-VPN RPF check as well as to find the next-hop to send PIM Joins to. Notice that the C-PIM Joins do not need the RPF proxy vector piggy-backed in the PIM messages, as those Joins are transported inside the MDT tunnel towards the remote PEs.

You may notice that the use of the BGP Connector attribute eliminates the need for a special “VPNv4 multicast” address family that could otherwise be used to transport RPF check information for VPNv4 prefixes. Such an address family is not really needed, as the multicast packets are tunneled through the SP cores using the MDT tunnel, and using the BGP Connector is sufficient for RPF checks at the provider edge. However, the use of mBGP is still needed in situations where diverse unicast and multicast transport paths are used between the Autonomous Systems.

Case Study

Let’s put all the concepts to the test in a sample scenario. Here, two autonomous systems, AS 100 and AS 200, peer using Inter-AS VPN Option B to exchange VPNv4 prefixes across the R3-R4 link. At the same time, multicast prefixes are exchanged across the R6-R4 peering link along with the MDT SAFI information. AS 100 implements a BGP-free core, so R2 does not peer via BGP with any other routers and only uses OSPF for IGP prefix exchange. R1 peers via MP-BGP with both ASBRs, R3 and R6, for the purpose of exchanging VPNv4 prefixes, multicast prefixes and MDT SAFI information.

mvpn-case-study

On the diagram, the links highlighted in orange are enabled for PIM SM and may carry multicast traffic. Notice that MPLS traffic and multicast traffic take different paths between the systems. The following are the main highlights of R1’s configuration:

  • VRF RED configured with MDT of 232.1.1.1. PIM SSM configured for P-PIM instance using the default group range of 232/8.
  • The command ip multicast vrf RED rpf proxy rd vector ensures the use of the RPF proxy vector for the MDT tree built by P-PIM for VRF RED. Notice that the related command ip multicast rpf proxy vector applies to the Joins received in the global routing table and is typically seen on the P-routers.
  • R1 exchanges VPNv4 prefixes with R3 and multicast prefixes with R6 via BGP. At the same time R1 exchanges IPv4 MDT SAFI with R6 to learn the MDT information from AS 200.
  • Connected routes from VRF RED are redistributed into MP-BGP.
R1:
hostname R1
!
interface Serial 2/0
encapsulation frame-relay
no shutdown
!
ip multicast-routing
ip pim ssm default
!
interface Serial 2/0.12 point-to-point
ip address 10.0.12.1 255.255.255.0
frame-relay interface-dlci 102
mpls ip
ip pim sparse-mode
!
interface Loopback0
ip pim sparse-mode
ip address 10.0.1.1 255.255.255.255
!
router ospf 1
network 10.0.12.1 0.0.0.0 area 0
network 10.0.1.1 0.0.0.0 area 0
!
router bgp 100
neighbor 10.0.3.3 remote-as 100
neighbor 10.0.3.3 update-source Loopback 0
neighbor 10.0.6.6 remote-as 100
neighbor 10.0.6.6 update-source Loopback 0
address-family ipv4 unicast
no neighbor 10.0.3.3 activate
no neighbor 10.0.6.6 activate
address-family vpnv4 unicast
neighbor 10.0.3.3 activate
neighbor 10.0.3.3 send-community both
address-family ipv4 mdt
neighbor 10.0.6.6 activate
address-family ipv4 multicast
neighbor 10.0.6.6 activate
network 10.0.1.1 mask 255.255.255.255
address-family ipv4 vrf RED
redistribute connected
!
no ip domain-lookup
!
ip multicast vrf RED rpf proxy rd vector
!
ip vrf RED
rd 100:1
route-target both 200:1
route-target both 100:1
mdt default 232.1.1.1
!
ip multicast-routing vrf RED
!
interface FastEthernet 0/0
ip vrf forwarding RED
ip address 192.168.1.1 255.255.255.0
ip pim dense-mode
no shutdown

Notice that in the configuration above, the Loopback0 interface is used as the source for the MDT tunnel and therefore has to have PIM (multicast routing) enabled on it. Next in turn, R2’s configuration is straightforward – OSPF is used as the IGP, with adjacencies to R1, R3 and R6, and labels are exchanged via LDP. Notice that R2 does NOT run PIM on the uplink to R3, and does NOT run LDP with R6. Effectively, the path via R6 is used only for multicast traffic, while the path across R3 is used only for MPLS LSPs. R2 is configured for PIM-SSM and RPF proxy vector support in the global routing table.

R2:
hostname R2
!
no ip domain-lookup
!
interface Serial 2/0
encapsulation frame-relay
no shut
!
ip multicast-routing
!
interface Serial 2/0.12 point-to-point
ip address 10.0.12.2 255.255.255.0
frame-relay interface-dlci 201
mpls ip
ip pim sparse-mode
!
ip pim ssm default
!
interface Serial 2/0.23 point-to-point
ip address 10.0.23.2 255.255.255.0
frame-relay interface-dlci 203
mpls ip
!
interface Serial 2/0.26 point-to-point
ip address 10.0.26.2 255.255.255.0
frame-relay interface-dlci 206
ip pim sparse-mode
!
interface Loopback0
ip address 10.0.2.2 255.255.255.255
!
ip multicast rpf proxy vector
!
router ospf 1
network 10.0.12.2 0.0.0.0 area 0
network 10.0.2.2 0.0.0.0 area 0
network 10.0.23.2 0.0.0.0 area 0
network 10.0.26.2 0.0.0.0 area 0

R3 is the ASBR used for implementing Inter-AS VPN Option B. It peers via BGP with R1 (the PE) and R4 (the other ASBR). Only the VPNv4 address family is enabled with the BGP peers. Notice that the next-hop for VPNv4 prefixes is changed to self, in order to terminate the transport LSP from the PE on R3. No multicast or MDT SAFI information is exchanged across R3.

R3:
hostname R3
!
no ip domain-lookup
!
interface Serial 2/0
encapsulation frame-relay
no shut
!
interface Serial 2/0.23 point-to-point
ip address 10.0.23.3 255.255.255.0
frame-relay interface-dlci 302
mpls ip
!
interface Serial 2/0.34 point-to-point
ip address 172.16.34.3 255.255.255.0
frame-relay interface-dlci 304
mpls ip
!
interface Loopback0
ip address 10.0.3.3 255.255.255.255
!
router ospf 1
network 10.0.23.3 0.0.0.0 area 0
network 10.0.3.3 0.0.0.0 area 0
!
router bgp 100
no bgp default route-target filter
neighbor 10.0.1.1 remote-as 100
neighbor 10.0.1.1 update-source Loopback 0
neighbor 172.16.34.4 remote-as 200
address-family ipv4 unicast
no neighbor 10.0.1.1 activate
no neighbor 172.16.34.4 activate
address-family vpnv4 unicast
neighbor 10.0.1.1 activate
neighbor 10.0.1.1 next-hop-self
neighbor 10.0.1.1 send-community both
neighbor 172.16.34.4 activate
neighbor 172.16.34.4 send-community both

The second ASBR in AS 100, R6, could be characterized as a multicast-only ASBR. In fact, this ASBR is only used to exchange prefixes in the multicast and MDT SAFI address families with R1 and R4. MPLS is not enabled on this router, and its sole purpose is multicast forwarding between AS 100 and AS 200. There is no need to run MSDP, as PIM SSM is used for multicast tree construction.

R6:
hostname R6
!
no ip domain-lookup
!
interface Serial 2/0
encapsulation frame-relay
no shut
!
ip multicast-routing
ip pim ssm default
ip multicast rpf proxy vector
!
interface Serial 2/0.26 point-to-point
ip address 10.0.26.6 255.255.255.0
frame-relay interface-dlci 602
ip pim sparse-mode
!
interface Serial 2/0.46 point-to-point
ip address 172.16.46.6 255.255.255.0
frame-relay interface-dlci 604
ip pim sparse-mode
!
interface Loopback0
ip pim sparse-mode
ip address 10.0.6.6 255.255.255.255
!
router ospf 1
network 10.0.6.6 0.0.0.0 area 0
network 10.0.26.6 0.0.0.0 area 0
!
router bgp 100
neighbor 10.0.1.1 remote-as 100
neighbor 10.0.1.1 update-source Loopback 0
neighbor 172.16.46.4 remote-as 200
address-family ipv4 unicast
no neighbor 10.0.1.1 activate
no neighbor 172.16.46.4 activate
address-family ipv4 mdt
neighbor 172.16.46.4 activate
neighbor 10.0.1.1 activate
neighbor 10.0.1.1 next-hop-self
address-family ipv4 multicast
neighbor 172.16.46.4 activate
neighbor 10.0.1.1 activate
neighbor 10.0.1.1 next-hop-self

Pay attention to the following. Firstly, R6 is set up for PIM SSM and RPF proxy vector support. Secondly, R6 sets itself as the BGP next hop in the updates sent under the multicast and MDT SAFI address families. This is needed for proper MDT tree construction and correct RPF vector insertion. The next router, R4, is the combined VPNv4 and multicast ASBR for AS 200. It performs the same functions that R3 and R6 perform separately in AS 100. The VPNv4, MDT SAFI and multicast address families are enabled under the BGP process on this router. At the same time, the router supports the RPF Proxy Vector and PIM-SSM for proper multicast forwarding. This router is the most configuration-intensive of all routers in both Autonomous Systems, as it also has to support MPLS label propagation via BGP and LDP. Of course, as a classic Option B ASBR, R4 has to change the BGP next-hop to itself for all address-family updates sent to R5 – the PE in AS 200.

R4:
hostname R4
!
no ip domain-lookup
!
interface Serial 2/0
encapsulation frame-relay
no shut
!
ip pim ssm default
ip multicast rpf proxy vector
ip multicast-routing
!
interface Serial 2/0.34 point-to-point
ip address 172.16.34.4 255.255.255.0
frame-relay interface-dlci 403
mpls ip
!
interface Serial 2/0.45 point-to-point
ip address 20.0.45.4 255.255.255.0
frame-relay interface-dlci 405
mpls ip
ip pim sparse-mode
!
interface Serial 2/0.46 point-to-point
ip address 172.16.46.4 255.255.255.0
frame-relay interface-dlci 406
ip pim sparse-mode
!
interface Loopback0
ip address 20.0.4.4 255.255.255.255
!
router ospf 1
network 20.0.4.4 0.0.0.0 area 0
network 20.0.45.4 0.0.0.0 area 0
!
router bgp 200
no bgp default route-target filter
neighbor 172.16.34.3 remote-as 100
neighbor 172.16.46.6 remote-as 100
neighbor 20.0.5.5 remote-as 200
neighbor 20.0.5.5 update-source Loopback0
address-family ipv4 unicast
no neighbor 172.16.34.3 activate
no neighbor 20.0.5.5 activate
no neighbor 172.16.46.6 activate
address-family vpnv4 unicast
neighbor 172.16.34.3 activate
neighbor 172.16.34.3 send-community both
neighbor 20.0.5.5 activate
neighbor 20.0.5.5 send-community both
neighbor 20.0.5.5 next-hop-self
address-family ipv4 mdt
neighbor 172.16.46.6 activate
neighbor 20.0.5.5 activate
neighbor 20.0.5.5 next-hop-self
address-family ipv4 multicast
neighbor 20.0.5.5 activate
neighbor 20.0.5.5 next-hop-self
neighbor 172.16.46.6 activate

The last router in the diagram is R5. It is a PE in AS 200 configured symmetrically to R1. It has to support the VPNv4, MDT SAFI and multicast address families to learn all the necessary information from the ASBR. Of course, the PIM RPF proxy vector is enabled for VRF RED’s MDT, and PIM-SSM is configured for the default group range in the global routing table. As you may have noticed, there is no router in AS 200 that emulates a BGP-free core.

R5:
hostname R5
!
no ip domain-lookup
!
interface Serial 2/0
encapsulation frame-relay
no shut
!
ip multicast-routing
!
interface Serial 2/0.45 point-to-point
ip address 20.0.45.5 255.255.255.0
frame-relay interface-dlci 504
mpls ip
ip pim sparse-mode
!
interface Loopback0
ip pim sparse-mode
ip address 20.0.5.5 255.255.255.255
!
router ospf 1
network 20.0.5.5 0.0.0.0 area 0
network 20.0.45.5 0.0.0.0 area 0
!
ip vrf RED
rd 200:1
route-target both 200:1
route-target both 100:1
mdt default 232.1.1.1
!
router bgp 200
neighbor 20.0.4.4 remote-as 200
neighbor 20.0.4.4 update-source Loopback0
address-family ipv4 unicast
no neighbor 20.0.4.4 activate
address-family vpnv4 unicast
neighbor 20.0.4.4 activate
neighbor 20.0.4.4 send-community both
address-family ipv4 mdt
neighbor 20.0.4.4 activate
address-family ipv4 multicast
neighbor 20.0.4.4 activate
network 20.0.5.5 mask 255.255.255.255
address-family ipv4 vrf RED
redistribute connected
!
ip multicast vrf RED rpf proxy rd vector
ip pim ssm default
!
ip multicast-routing vrf RED
!
interface FastEthernet 0/0
ip vrf forwarding RED
ip address 192.168.5.1 255.255.255.0
ip pim dense-mode
no shutdown

Validating Unicast Paths

This is the simplest part. Use the show commands to see if the VPNv4 prefixes have propagated between the PEs and test end-to-end connectivity:

R1#sh ip route vrf RED

Routing Table: RED
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
E1 - OSPF external type 1, E2 - OSPF external type 2
i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
ia - IS-IS inter area, * - candidate default, U - per-user static route
o - ODR, P - periodic downloaded static route

Gateway of last resort is not set

B 192.168.5.0/24 [200/0] via 10.0.3.3, 00:50:35
C 192.168.1.0/24 is directly connected, FastEthernet0/0

R1#show bgp vpnv4 unicast vrf RED
BGP table version is 5, local router ID is 10.0.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network Next Hop Metric LocPrf Weight Path
Route Distinguisher: 100:1 (default for vrf RED)
*> 192.168.1.0 0.0.0.0 0 32768 ?
*>i192.168.5.0 10.0.3.3 0 100 0 200 ?

R1#show bgp vpnv4 unicast vrf RED 192.168.5.0
BGP routing table entry for 100:1:192.168.5.0/24, version 5
Paths: (1 available, best #1, table RED)
Not advertised to any peer
200, imported path from 200:1:192.168.5.0/24
10.0.3.3 (metric 129) from 10.0.3.3 (10.0.3.3)
Origin incomplete, metric 0, localpref 100, valid, internal, best
Extended Community: RT:100:1 RT:200:1
Connector Attribute: count=1
type 1 len 12 value 200:1:20.0.5.5
mpls labels in/out nolabel/23

R1#traceroute vrf RED 192.168.5.1

Type escape sequence to abort.
Tracing the route to 192.168.5.1

1 10.0.12.2 [MPLS: Labels 17/23 Exp 0] 432 msec 36 msec 60 msec
2 10.0.23.3 [MPLS: Label 23 Exp 0] 68 msec 8 msec 36 msec
3 172.16.34.4 [MPLS: Label 19 Exp 0] 64 msec 16 msec 48 msec
4 192.168.5.1 12 msec * 8 msec

Notice that in the output above, the prefix 192.168.5.0/24 has a next-hop value of 10.0.3.3 and a BGP Connector attribute value of 200:1:20.0.5.5. This information will be used for RPF checks later, when we start sending multicast traffic.
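You can also ask the router directly how it would RPF toward this source inside the VRF. The output below is only a sketch of what such a check is expected to look like on this image (field names and wording vary by release): the expectation is that the RPF interface is the MDT tunnel interface and that the RPF neighbor is 20.0.5.5, derived from the BGP Connector attribute rather than from the VPNv4 next-hop 10.0.3.3.

R1#show ip rpf vrf RED 192.168.5.1
RPF information for ? (192.168.5.1)
RPF interface: Tunnel0
RPF neighbor: ? (20.0.5.5)
RPF route/mask: 192.168.5.0/24
RPF type: unicast (bgp 100)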

Validating Multicast Paths

Multicast forwarding is a bit more complicated. The first thing we should do is make sure the MDTs have been built from R1 towards R5 and from R5 towards R1. Check the PIM MDT groups on every PE:

R1#show ip pim mdt 
MDT Group Interface Source VRF
* 232.1.1.1 Tunnel0 Loopback0 RED
R1#show ip pim mdt bgp
MDT (Route Distinguisher + IPv4) Router ID Next Hop
MDT group 232.1.1.1
200:1:20.0.5.5 10.0.6.6 10.0.6.6

R5#show ip pim mdt
MDT Group Interface Source VRF
* 232.1.1.1 Tunnel0 Loopback0 RED

R5#show ip pim mdt bgp
MDT (Route Distinguisher + IPv4) Router ID Next Hop
MDT group 232.1.1.1
100:1:10.0.1.1 20.0.4.4 20.0.4.4

In the output above, pay attention to the next-hop values found in the MDT BGP information. In AS 100 it points toward R6 while in AS 200 it points to R4. Those next-hops are to be used as the proxy vectors for PIM Join messages. Check the mroutes for the tree (20.0.5.5, 232.1.1.1) starting from R1 and climbing up across R2, R6, R4 to R5:

R1#show ip mroute 232.1.1.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
V - RD & Vector, v - Vector
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(20.0.5.5, 232.1.1.1), 00:58:49/00:02:59, flags: sTIZV
Incoming interface: Serial2/0.12, RPF nbr 10.0.12.2, vector 10.0.6.6
Outgoing interface list:
MVRF RED, Forward/Sparse, 00:58:49/00:01:17

(10.0.1.1, 232.1.1.1), 00:58:49/00:03:19, flags: sT
Incoming interface: Loopback0, RPF nbr 0.0.0.0
Outgoing interface list:
Serial2/0.12, Forward/Sparse, 00:58:47/00:02:55

R1#show ip mroute 232.1.1.1 proxy
(20.0.5.5, 232.1.1.1)
Proxy Assigner Origin Uptime/Expire
200:1/10.0.6.6 0.0.0.0 BGP MDT 00:58:51/stopped

R1 shows the RPF proxy value of 10.0.6.6 for the source 20.0.5.5. Notice that there is also a tree toward 10.0.1.1, which was originated by R5. This tree has no proxies, as it is now inside its native AS. Next in turn, check R2 and R6 to find the same information (remember that the actual proxy removes the vector when it sees itself in the PIM Join message proxy field):

R2#show ip mroute 232.1.1.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
V - RD & Vector, v - Vector
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(10.0.1.1, 232.1.1.1), 01:01:41/00:03:25, flags: sT
Incoming interface: Serial2/0.12, RPF nbr 10.0.12.1
Outgoing interface list:
Serial2/0.26, Forward/Sparse, 01:01:41/00:02:50

(20.0.5.5, 232.1.1.1), 01:01:43/00:03:25, flags: sTV
Incoming interface: Serial2/0.26, RPF nbr 10.0.26.6, vector 10.0.6.6
Outgoing interface list:
Serial2/0.12, Forward/Sparse, 01:01:43/00:02:56

R2#show ip mroute 232.1.1.1 proxy
(20.0.5.5, 232.1.1.1)
Proxy Assigner Origin Uptime/Expire
200:1/10.0.6.6 10.0.12.1 PIM 01:01:46/00:02:23

Notice the same proxy vector for (20.0.5.5, 232.1.1.1) set by R1. As expected, there is a “contra-directional” tree built toward R1 from R5, which has no RPF proxy vector. Proceed to the outputs from R6. Notice that R6 knows of two proxy vectors, one of which is R6 itself.

R6#show ip mroute 232.1.1.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
V - RD & Vector, v - Vector
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(10.0.1.1, 232.1.1.1), 01:05:40/00:03:21, flags: sT
Incoming interface: Serial2/0.26, RPF nbr 10.0.26.2
Outgoing interface list:
Serial2/0.46, Forward/Sparse, 01:05:40/00:02:56

(20.0.5.5, 232.1.1.1), 01:05:42/00:03:21, flags: sTV
Incoming interface: Serial2/0.46, RPF nbr 172.16.46.4, vector 172.16.46.4
Outgoing interface list:
Serial2/0.26, Forward/Sparse, 01:05:42/00:02:51

R6#show ip mroute proxy
(10.0.1.1, 232.1.1.1)
Proxy Assigner Origin Uptime/Expire
100:1/local 172.16.46.4 PIM 01:05:44/00:02:21

(20.0.5.5, 232.1.1.1)
Proxy Assigner Origin Uptime/Expire
200:1/local 10.0.26.2 PIM 01:05:47/00:02:17

The show command outputs from R4 are similar to R6’s – it is the proxy for both multicast trees:

R4#show ip mroute 232.1.1.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
V - RD & Vector, v - Vector
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(10.0.1.1, 232.1.1.1), 01:08:42/00:03:16, flags: sTV
Incoming interface: Serial2/0.46, RPF nbr 172.16.46.6, vector 172.16.46.6
Outgoing interface list:
Serial2/0.45, Forward/Sparse, 01:08:42/00:02:51

(20.0.5.5, 232.1.1.1), 01:08:44/00:03:16, flags: sT
Incoming interface: Serial2/0.45, RPF nbr 20.0.45.5
Outgoing interface list:
Serial2/0.46, Forward/Sparse, 01:08:44/00:02:46

R4#show ip mroute proxy
(10.0.1.1, 232.1.1.1)
Proxy Assigner Origin Uptime/Expire
100:1/local 20.0.45.5 PIM 01:08:46/00:02:17

(20.0.5.5, 232.1.1.1)
Proxy Assigner Origin Uptime/Expire
200:1/local 172.16.46.6 PIM 01:08:48/00:02:12

Finally, the outputs on R5 mirror the ones we saw on R1. However, this time the multicast trees have swapped roles: the one toward R1 has the proxy vector set:

R5#show ip mroute 232.1.1.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
V - RD & Vector, v - Vector
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(10.0.1.1, 232.1.1.1), 01:12:07/00:02:57, flags: sTIZV
Incoming interface: Serial2/0.45, RPF nbr 20.0.45.4, vector 20.0.4.4
Outgoing interface list:
MVRF RED, Forward/Sparse, 01:12:07/00:00:02

(20.0.5.5, 232.1.1.1), 01:13:40/00:03:27, flags: sT
Incoming interface: Loopback0, RPF nbr 0.0.0.0
Outgoing interface list:
Serial2/0.45, Forward/Sparse, 01:12:09/00:03:12

R5#show ip mroute proxy
(10.0.1.1, 232.1.1.1)
Proxy Assigner Origin Uptime/Expire
100:1/20.0.4.4 0.0.0.0 BGP MDT 01:12:10/stopped

R5’s BGP table also has the connector attribute information for the prefix of R1’s Ethernet interface:

R5#show bgp vpnv4 unicast vrf RED 192.168.1.0
BGP routing table entry for 200:1:192.168.1.0/24, version 6
Paths: (1 available, best #1, table RED)
Not advertised to any peer
100, imported path from 100:1:192.168.1.0/24
20.0.4.4 (metric 65) from 20.0.4.4 (20.0.4.4)
Origin incomplete, metric 0, localpref 100, valid, internal, best
Extended Community: RT:100:1 RT:200:1
Connector Attribute: count=1
type 1 len 12 value 100:1:10.0.1.1
mpls labels in/out nolabel/18

In addition to the connector attributes, the last piece of information needed is the multicast source information propagated via the IPv4 multicast address family. Both R1 and R5 should have this information in their BGP tables:

R5#show bgp ipv4 multicast
BGP table version is 3, local router ID is 20.0.5.5
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network Next Hop Metric LocPrf Weight Path
*>i10.0.1.1/32 20.0.4.4 0 100 0 100 i
*> 20.0.5.5/32 0.0.0.0 0 32768 i

R1#show bgp ipv4 multicast
BGP table version is 3, local router ID is 10.0.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network Next Hop Metric LocPrf Weight Path
*> 10.0.1.1/32 0.0.0.0 0 32768 i
*>i20.0.5.5/32 10.0.6.6 0 100 0 200 i

Now it’s time to verify multicast connectivity. Make sure R1 and R5 see each other as PIM neighbors over the MDT and then do a multicast ping toward the group joined by both R1 and R5:

R1#show ip pim vrf RED neighbor 
PIM Neighbor Table
Mode: B - Bidir Capable, DR - Designated Router, N - Default DR Priority,
S - State Refresh Capable
Neighbor Interface Uptime/Expires Ver DR
Address Prio/Mode
20.0.5.5 Tunnel0 01:31:38/00:01:15 v2 1 / DR S P

R5#show ip pim vrf RED neighbor
PIM Neighbor Table
Mode: B - Bidir Capable, DR - Designated Router, N - Default DR Priority,
S - State Refresh Capable
Neighbor Interface Uptime/Expires Ver DR
Address Prio/Mode
10.0.1.1 Tunnel0 01:31:17/00:01:26 v2 1 / S P

R1#ping vrf RED 239.1.1.1 repeat 100

Type escape sequence to abort.
Sending 100, 100-byte ICMP Echos to 239.1.1.1, timeout is 2 seconds:

Reply to request 0 from 192.168.1.1, 12 ms
Reply to request 0 from 20.0.5.5, 64 ms
Reply to request 1 from 192.168.1.1, 8 ms
Reply to request 1 from 20.0.5.5, 56 ms
Reply to request 2 from 192.168.1.1, 8 ms
Reply to request 2 from 20.0.5.5, 100 ms
Reply to request 3 from 192.168.1.1, 16 ms
Reply to request 3 from 20.0.5.5, 56 ms

This concludes our testbed verification.

Summary

In this blog post we demonstrated how MP-BGP and PIM extensions could be used to effectively implement Inter-AS multicast VPN between autonomous systems with BGP-free cores. PIM SSM is used to build the inter-AS trees, and the MDT SAFI is used to discover the MDT group addresses along with the PEs associated with them. The PIM RPF proxy vector allows for successful RPF checks in a core that lacks routes to the external multicast sources, by virtue of the proxy IPv4 address. Finally, the BGP Connector attribute allows for successful RPF checks inside a particular VRF.

Further Reading

Multicast VPN (draft-rosen-vpn-mcast)
Multicast Tunnel Discovery (draft-wijnands-mt-discovery)
PIM RPF Vector (draft-ietf-pim-rpf-vector-08)
MDT SAFI (draft-nalawade-idr-mdt-safi)

MDT SAFI Configuration
PIM RPF Proxy Vector Configuration
