Oct
28

INE's long awaited CCIE Service Provider Advanced Technologies Class is now available! But first, congratulations to Tedhi Achdiana who just passed the CCIE Service Provider Lab Exam! Here's what Tedhi had to say about his preparation:

Finally I passed my CCIE Service Provider Lab exam in Hong Kong on Oct 17, 2011. I used your CCIE Service Provider Printed Materials Bundle. This product gave me a deep understanding of how Service Provider technology works, so it doesn't matter that Cisco has changed the SP Blueprint. You just need to practice with IOS XR and find the similar commands on the IOS platform.

Thanks to INE, and keep up the good work!

Tedhi Achdiana
CCIE#30949 - Service Provider

The CCIE Service Provider Advanced Technologies Class covers the newest CCIE SP Version 3.0 Blueprint, including the addition of IOS XR hardware. Class topics include Catalyst ME3400 switching, IS-IS, OSPF, BGP, MPLS Layer 3 VPNs (L3VPN), Inter-AS MPLS L3VPNs, IPv6 over MPLS with 6PE and 6VPE, AToM and VPLS based MPLS Layer 2 VPNs (L2VPN), MPLS Traffic Engineering, Service Provider Multicast, and Service Provider QoS. Understanding the topics covered in this class will ensure that students are ready to tackle the next step in their CCIE preparation, applying the technologies themselves with INE's CCIE Service Provider Lab Workbook, and then finally taking and passing the CCIE Service Provider Lab Exam!

Streaming access is available for All Access Pass subscribers for as low as $65/month! Download access can be purchased here for $299. AAP members can additionally upgrade to the download version for $149.

Sample videos from the class can be found after the break:

 

The detailed outline of the class is as follows:

  • Introduction
  • Catalyst ME3400 Switching
  • Frame Relay / HDLC / PPP & PPPoE
  • IS-IS Overview / Level 1 & Level 2 Routing
  • IS-IS Network Types / Path Selection & Route Leaking
  • IS-IS Route Leaking on IOS XR / IOS XR Routing Policy Language (RPL)
  • IS-IS IPv6 Routing / IS-IS Multi Topology
  • MPLS Overview / LDP Overview
  • Basic MPLS Configuration
  • MPLS Tunnels
  • MPLS Layer 3 VPN (L3VPN) Overview / VRF Overview
  • VPNv4 BGP Overview / Route Distinguishers vs. Route Targets
  • Basic MPLS L3VPN Configuration
  • MPLS L3VPN Verification & Troubleshooting
  • VPNv4 Route Reflectors
  • BGP PE-CE Routing / BGP AS Override
  • RIP PE-CE Routing
  • EIGRP PE-CE Routing
  • OSPF PE-CE Routing / OSPF Domain IDs / Domain Tags & Sham Links
  • OSPF Multi VRF CE Routing
  • MPLS Central Services L3VPNs
  • IPv6 over MPLS with 6PE & 6VPE
  • Inter AS MPLS L3VPN Overview
  • Inter AS MPLS L3VPN Option A - Back-to-Back VRF Exchange Part 1
  • Inter AS MPLS L3VPN Option A - Back-to-Back VRF Exchange Part 2
  • Inter AS MPLS L3VPN Option B - ASBRs Peering VPNv4
  • Inter AS MPLS L3VPN Option C - ASBRs Peering BGP+Label Part 1
  • Inter AS MPLS L3VPN Option C - ASBRs Peering BGP+Label Part 2
  • Carrier Supporting Carrier (CSC) MPLS L3VPN
  • MPLS Layer 2 VPN (L2VPN) Overview
  • Ethernet over MPLS L2VPN AToM
  • PPP & Frame Relay over MPLS L2VPN AToM
  • MPLS L2VPN AToM Interworking
  • Virtual Private LAN Services (VPLS)
  • MPLS Traffic Engineering (TE) Overview
  • MPLS TE Configuration
  • MPLS TE on IOS XR / LDP over MPLS TE Tunnels
  • MPLS TE Fast Reroute (FRR) Link and Node Protection
  • Service Provider Multicast
  • Service Provider QoS

Additionally, completely new versions of INE's CCIE Service Provider Lab Workbook Volumes I & II are on their way and should be released before the end of the year. Stay tuned for more information on workbook and rack rental availability!

Oct
18

One of our most anticipated products of the year - INE's CCIE Service Provider v3.0 Advanced Technologies Class - is now complete!  The videos from class are in the final stages of post production and will be available for streaming and download access later this week.  Download access can be purchased here for $299.  Streaming access is available for All Access Pass subscribers for as low as $65/month!  AAP members can additionally upgrade to the download version for $149.

At roughly 40 hours, the CCIE SPv3 ATC covers the newly released CCIE Service Provider version 3 blueprint, which includes the addition of IOS XR hardware. This class includes both technology lectures and hands on configuration, verification, and troubleshooting on both regular IOS and IOS XR. Class topics include Catalyst ME3400 switching, IS-IS, OSPF, BGP, MPLS Layer 3 VPNs (L3VPN), Inter-AS MPLS L3VPNs, IPv6 over MPLS with 6PE and 6VPE, AToM and VPLS based MPLS Layer 2 VPNs (L2VPN), MPLS Traffic Engineering, Service Provider Multicast, and Service Provider QoS.

Below you can see a sample video from the class, which covers IS-IS Route Leaking and its implementation on IOS XR with the Routing Policy Language (RPL).

Nov
26

In our CCDP bootcamp, we examined Cisco’s implementation of Virtual Private LAN Services (VPLS) in some detail. One blog post that I promised our students was more information on how large enterprises or Internet Service Providers can enhance the scalability of this solution.

First, let us review the issues that influence its scalability. We covered these in the course, but they are certainly worth repeating here.

Remember that VPLS looks just like an Ethernet switch to the customers. As such, this solution can suffer from the same issues that could hinder a Layer 2 core infrastructure. These are:

  • Control-plane scalability - classic VPLS calls for a full mesh of pseudo-wires connecting the edge sites. This certainly does not scale as the number of edge sites grows - from both operational and control-plane viewpoints.
  • Network stability as the network grows – Spanning Tree Protocol-based (STP) infrastructures tend not to scale as well as Multiprotocol Label Switching (MPLS) solutions.
  • Ability to recover from outages – as the VPLS network grows, it could become much more susceptible to major customer-connectivity issues in the event of a failure.
  • Multicast and broadcast radiation to all sites – remembering that the VPLS network acts as a Layer 2 switch reminds us that multicast and broadcast traffic can be flooded to all customers across the network.
  • Multicast scalability - multicast traffic has to be replicated on ingress PE devices, which significantly reduces forwarding efficiency.
  • IGP peering scalability issues – all routers attached to the cloud tend to be in the same broadcast domain and thus form IGP peerings with each other, which results in a full mesh of adjacencies and excessive flooding when using link-state routing protocols.
  • STP loops – it is certainly possible that a customer creating an STP loop could impact other customers of the ISP. STP may be blocked across the MPLS cloud, but it is normally still needed in multi-homed deployments to prevent forwarding loops.
  • Load-balancing - the use of MPLS encapsulation hides the VPLS encapsulated flows from the core network and thus prevents the effective use of ECMP flow-based load-balancing.

The solution to some of these issues is Hierarchical VPLS (H-VPLS). H-VPLS interconnects only the core MPLS network provider edge devices with a full mesh of pseudo-wires, thus reducing the complexity of the full mesh. The customer-facing provider edge equipment is then connected hierarchically, using pseudo-wires, to these core devices. The topology now looks like a reduced full mesh with tree-like connections at the edge of the network. Notice that the edge devices no longer connect directly to each other.

Using H-VPLS, the problem of control-plane scalability can be significantly alleviated. The network is partitioned into as many edge domains as required. These edge domains are then very efficiently connected using an MPLS core. The provider's MPLS core allows for the simultaneous use of L3 MPLS VPNs for those additional customers that desire them. In addition, the MPLS core may limit the extent of the STP domain in non-multi-homed scenarios, and therefore improve performance and limit instabilities.
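To make the hierarchy concrete, here is a minimal IOS sketch of the core-device side of an H-VPLS instance: the first two pseudo-wires belong to the full mesh between core PEs (split horizon applies), while the spoke pseudo-wire toward the edge device carries the `no-split-horizon` keyword so that frames can be relayed between the hierarchy levels. All addresses, the VPN ID, and the VFI name are hypothetical:

```
! Core PE: full mesh of core pseudo-wires plus one spoke PW to an edge PE
l2 vfi CUST-A manual
 vpn id 100
 neighbor 10.0.0.2 encapsulation mpls
 neighbor 10.0.0.3 encapsulation mpls
 neighbor 10.1.1.1 encapsulation mpls no-split-horizon
!
! Attach the VFI to the customer-facing VLAN
interface Vlan100
 xconnect vfi CUST-A
```

Because the spoke pseudo-wire is exempt from the split-horizon rule, the core device replicates frames between the edge domain and the core mesh, which is exactly what creates the tree-like topology described above.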

It is worth mentioning that the main problem with VPLS deployments is not the control-plane but rather data-plane scalability - mainly because of the MAC address table explosion and excessive broadcast/multicast traffic flooding. In addition to improving control-plane scalability by means of H-VPLS pseudo-wire hierarchies, additional techniques can be used to alleviate the issues of the data-plane. The first technique is MAC-in-MAC address stacking, which reduces the number of MAC addresses to be exposed in the core. The second is known as multicast forwarding optimization. This allows for the use of special point-to-multipoint pseudo-wires in the SP core to improve multicast traffic replication.

To summarize, H-VPLS offers the following benefits compared to traditional VPLS solutions:

  • Lowers the number of pseudo-wire connections that must be full-meshed and thus improves control-plane scalability
  • Reduces the burden on core devices presented by frame replication and forwarding by adding a hierarchical "aggregation" layer
  • Reduces the size of MAC address tables on the provider equipment when combined with MAC-in-MAC address stacking

To complete this blog post, we should mention that there is an alternative to H-VPLS control plane scaling that has been championed by Cisco's rival, Juniper Networks. Juniper's approach utilizes BGP for VPLS pseudo-wire signaling, as opposed to using point-to-point LDP sessions. Scaling BGP by means of route-reflectors is well-known and thus Juniper's approach automatically has a scalable control-plane.

I hope you enjoyed this post. Be sure to watch the blog for more exciting Design-related presentations.

Feb
15

Introduction

Recently, there have been discussions going around about Cisco’s new datacenter technology – Overlay Transport Virtualization (OTV), implemented in the Nexus 7000 data-center switches (limited demo deployments only). The purpose of this technology is to connect separated data-center islands over a conventional packet-switched network. It is said that OTV is a better solution than the well-known VPLS, or any other Layer 2 VPN technology. In this post we are going to give a brief comparison of the two technologies and see what benefits OTV may actually bring to data-centers.

VPLS Overview

We are going to give a rather condensed overview of VPLS functionality here, just to have a baseline to compare OTV against. The reader is assumed to have a solid understanding of MPLS and Layer 2 VPNs, as technology fundamentals are not described here.

[Figure: VPLS topology overview]

  • VPLS provides multipoint LAN services by extending a LAN cloud over a packet-switched network. MPLS is used as the primary transport for tunneling Ethernet frames; however, it could be replaced with any suitable tunneling solution, such as GRE or L2TPv3, running over a conventional packet-switched network. That is to say, VPLS could be transport-agnostic if required. The main benefit of label-switched paths is the ability to leverage MPLS-TE, which is very important to service providers.
  • The core of VPLS functionality is the process of establishing a full mesh of pseudowires between all participating PEs. If a VPLS deployment has N PEs, every PE needs to allocate and advertise N-1 different labels to the N-1 remote PEs, which they should use when encapsulating L2 packets sent to the advertising PE. This is required in order for the advertising PE to be able to distinguish frames coming from different remote PEs and properly perform MAC address learning.
  • There are two main methods for MPLS tunnel signaling: a full mesh of LDP sessions, or a full mesh of BGP peerings. Notice that the LDP-based standard does not specify any auto-discovery technique, so one could be selected at the vendor’s discretion (e.g. RADIUS). The BGP-based signaling technique allows for auto-discovery and automatic label allocation using the BGP multiprotocol extensions.
  • VPLS utilizes the same data-plane-based learning that "classic" Ethernet uses. When a frame is received over a pseudowire, its source MAC address is associated with the “virtual port” and the corresponding MPLS label. When a multicast frame, or a frame with an unknown unicast destination address, is received, it is flooded out of all ports, including the pseudowires (virtual ports).
  • The full mesh of pseudowires allows for simplified forwarding in the VPLS core network. Instead of running STP to block redundant connections, a split-horizon data-plane forwarding rule is used: a frame received on a pseudowire is forwarded only out of physical ports, and never out of any other pseudowire.
  • VPLS does not perform control-plane address learning, but may use special signaling to explicitly withdraw MAC addresses from remote PEs when a topology change is detected on the local site. This is especially important in multi-homed scenarios, where MAC address mobility could be a reality.
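The full mesh described above can be sketched on IOS with a manually configured VFI, one `neighbor` statement per remote PE; each statement triggers a targeted LDP session and label exchange with that peer. The peer addresses, VPN ID, and VFI name below are hypothetical:

```
! One PE in a three-PE VPLS deployment: pseudowires to both remote PEs
l2 vfi CUSTOMER manual
 vpn id 200
 neighbor 10.0.0.2 encapsulation mpls
 neighbor 10.0.0.3 encapsulation mpls
!
! Bind the customer-facing VLAN to the VFI
interface Vlan200
 xconnect vfi CUSTOMER
```

Each of the N PEs carries N-1 such `neighbor` lines, which is exactly the operational burden that auto-discovery mechanisms and H-VPLS try to reduce.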

VPLS Limitations

The following is a list of VPLS limitations, which are inherently rooted in Ethernet technology:

  • Data-plane learning and flooding are still in use. This fundamental property makes Ethernet a plug-and-play technology, but results in poor scalability. Essentially, a single LAN segment is one fault domain, as any change in the topology requires re-flooding of frames and generates sudden bursts of traffic. A topology change in a single site may propagate to other sites across the VPLS core (through signaling extensions) and cause MAC address re-learning and flooding.
  • Ethernet addressing is inherently flat and non-hierarchical. MAC addresses serve the purpose of identifying endpoints, not pointing to their locations. This makes troubleshooting large Ethernet topologies very challenging. In addition, MAC addresses are non-summarizable, which, coupled with the learning-and-flooding behavior, results in uncontrolled address-table growth for all devices in the Ethernet domain.
  • Spanning Tree is still used in the CE domains, which results in slower convergence and suboptimal bandwidth usage. Since STP blocks all redundant links, their bandwidth is effectively wasted. Plus, the links closer to the root bridge are more congested than the STP leaves. The use of MSTP or PVST could alleviate this problem, but traffic engineering becomes complicated due to the multiple logical topologies.
  • If a customer site is multihomed to the VPLS core network, STP needs to run across the core to ensure blocking of redundant paths, as VPLS does not detect them. This further worsens convergence times and affects stability. There is an alternative, not yet standardized, approach for selecting a designated forwarder for multi-homed sites, which does not rely on STP.
  • VPLS does not have native multicast replication support. Multicast frames are flooded out of all pseudowires, even though IGMP snooping may reduce the amount of flooding on local physical ports. Ineffective PE-based multicast replication limits the use of heavyweight multicast applications in VPLS.

Of course, the community came up with some solutions to the above problems. Firstly, the problem of address-table growth can be alleviated using MAC-in-MAC (IEEE 802.1ah) encapsulation for customer frames. In this solution, PE devices only have to learn the other PEs’ MAC addresses, or the MAC-in-MAC stacking devices could be pushed down into the customer network. Even simpler, CE devices could be replaced with routers, thus exposing only a single CE MAC address to the VPLS cloud. Next, there is a lot of work being done on multicast optimization in VPLS. The root of the problem lies in adapting the main underlying transport – MPLS label switching – to effectively handle multicast traffic. The solution uses point-to-multipoint LSPs and either M-LDP or RSVP-TE extensions to signal them. It is not yet completely standardized and widely adopted, but the work is definitely in progress. However, the main scaling factor of Ethernet – a topology-agnostic control plane with uncontrolled data-plane learning – is not addressed by the VPLS extensions so far.

As an alternative to using sophisticated M-LSPs, the multicast problem could be “resolved” by co-locating an additional Layer 3 VPN topology and running Layer 3 mVPNs. A “proxy” PIM router deployed at every site processes the IGMP messages and signals the multicast trees in the core network. While this solution is not as elegant as using P2MP LSPs and introduces additional operational expense, it nonetheless offers a working alternative.

Overlay Transport Virtualization

Regardless of the loud name, from a technical standpoint OTV looks like nothing more than VPLS stripped of its MPLS transport, with optimized multicast handling similar to that used in Draft Rosen's Layer 3 mVPNs. There are some improvements to Ethernet flooding behavior, but those are questionable. Let’s see how OTV works. Notice that this time we use the notion of CE devices, not PE, as OTV is a technology to be deployed at the customer’s edge.

[Figure: OTV topology overview]

  • CE devices connect to customer Ethernet clouds and face the provider packet-switched network (or enterprise network core) using a Layer 3 interface. Every CE device creates an overlay tunnel interface that is very much similar to the MDT Tunnel interface specified in Draft Rosen. Only IP transport is required in the core, and thus the technology does not depend on MPLS.
  • Just like with Rosen’s mVPNs, all CEs need to join a P (provider) multicast group to discover other CEs over the overlay tunnel. The CEs then establish a full mesh of IS-IS adjacencies with each other on the overlay tunnel. This could be thought of as the equivalent of the full mesh of control-plane links in VPLS. The provider multicast group is normally an ASM or Bidir group.
  • IS-IS nodes (CEs) flood LSP information, including the MAC addresses known via attached physical Ethernet ports, to all other CEs. This is possible due to the flexible TLV structure of IS-IS LSPs. The same LSP flooding could be used to remove, or unlearn, a MAC address if the local CE finds it unavailable.
  • Ethernet forwarding follows the normal rules, but GRE encapsulation (or any other IP tunneling) is used when sending Ethernet frames over the provider IP cloud to a remote CE. Notice that GRE packets received on overlay tunnel interfaces do NOT result in MAC-address learning for the encapsulated frames. Furthermore, unknown unicast frames are NOT flooded out of the overlay tunnel interface – it is assumed that all remote MAC addresses are signaled via the control plane and should be known.
  • Multicast Ethernet frames are encapsulated in multicast IP packets and forwarded using provider multicast services. Furthermore, it is possible to specify a set of “selective” or “data” multicast groups that are used for flooding specific multicast flows, just like in mVPNs. Upon reception of an IGMP join, a CE snoops on it (similar to classic IGMP snooping) and translates it into a PIM join for a core multicast group. All CEs receiving IGMP reports for the same group will join the same core multicast tree and form an optimal multicast distribution structure in the core. The actual multicast flow frames will then get flooded down the selective multicast tree to the participating nodes only. Notice one important difference from mVPNs – the CE devices are not multicast routers; they are effectively virtual switches performing IGMP snooping and extended signaling into the provider core.
  • OTV handles multi-homed scenarios properly, without running STP on top of the overlay tunnel. If two CEs share the same logical backdoor link (i.e. they hear each other's IS-IS hello packets over the link), one of the devices is elected as the appointed (aka authoritative) forwarder for the given link (e.g. VLAN). Only this device actually floods and forwards the frames on the given segment, thus eliminating the Layer 2 forwarding loop. This concept is very similar to electing a PIM Assert winner on a shared link. Notice that this approach is similar to the VPLS draft proposal for multihoming, but uses IGP signaling instead of BGP.
  • OTV supports ARP optimization in order to reduce the amount of broadcast traffic flooded across the overlay tunnel. Every CE may snoop on local ARP replies and use IS-IS extensions to signal IP-to-MAC bindings to remote nodes. Every CE will then attempt to respond to an ARP request using its local cache, instead of forwarding the ARP packet over the core network. This does not eliminate ARP; it just reduces the amount of broadcast flooding.
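On the Nexus platform, the building blocks above map onto an overlay interface: the join interface faces the provider IP core, the control group carries the IS-IS adjacencies, and the data groups carry the selective multicast trees. A minimal NX-OS sketch follows; the group addresses, interface, site VLAN, and extended VLAN range are all hypothetical:

```
feature otv
!
! VLAN used by CEs at the same site to detect each other
otv site-vlan 99
!
interface Overlay1
  otv join-interface Ethernet1/1
  otv control-group 239.1.1.1
  otv data-group 232.1.1.0/28
  otv extend-vlan 100-110
  no shutdown
```

Note how little of this touches the provider: only IP reachability and multicast support on the join interface are required, which is the "CE-centric" property discussed below.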

Now for some conclusions. OTV claims to be better than VPLS, but this could be argued. To begin with, VPLS is positioned as a provider-edge technology, while OTV is a customer-edge technology. Next, the following list captures the similarities and differences between the two technologies:

  • The same logical full mesh of signaling is used in the core. IS-IS is outlined in the patent document, but any other protocol could obviously be used here, e.g. LDP or BGP – even the patent document mentions that. What was the reason for re-inventing the wheel? The answer could be “TRILL”, as we see in the following section. But so far, switching to new signaling makes little sense in terms of benefits.
  • OTV runs over native IP and does not require underlying MPLS. Like we said before, it would have been possible to simply change the VPLS transport to any IP tunneling technique instead of coming up with a new technology. By omitting MPLS, OTV loses the important ability to signal optimal path selection in the provider network at the PE edge.
  • Control-plane MAC-address learning is said to reduce broadcast traffic in the core network. This is indeed accomplished, but at a significant price. Here is the problem: if a topology change in one site is to be propagated to other sites, the control plane must signal the removal of locally learned MAC addresses to the remote sites. Effectively, this translates into data-plane “black-holing” until the MAC addresses are re-learned and signaled again, as OTV does not flood over the IP core. Things are even worse in the control plane. A topology change will flush all MAC addresses known to a CE and result in LSP flooding to all adjacent nodes. The amount of LSP replication could be optimized using IS-IS mesh groups, but at least N copies of the LSP must be sent, where N is the number of adjacencies. As soon as new MACs are learned, additional LSPs will be flooded out to all neighbors! Properly controlling LSP generation, i.e. delaying LSP origination, may help reduce flooding, but will again result in convergence issues in the data plane.

    To summarize, the price paid for flooding reduction is slower convergence in the presence of topology changes, along with control-plane scalability challenges. The main problem – topology unawareness that leads to the need to re-learn MAC addresses – is not addressed in OTV (yet). However, considering that data-plane flooding in data-centers can be very intensive, the amount of control-plane flooding introduced may be acceptable.

  • Optimized multicast support seems to be the only big benefit of OTV that does not come with significant trade-offs. Introducing native multicast is probably due to the fact that VPLS multicasting is still not standardized, while datacenters need it now. The multicast solution is a copy of the mVPN model, not something new and exciting like M-LSPs. Like we said before, the same idea could be deployed in VPLS scenarios by means of a co-located mVPN. Also, when deployed over SP networks, this feature requires SP multicast support for auto-discovery and optimized forwarding. This is not a major problem, though, and OTV also supports static unicast discovery.

To summarize, it looks like OTV is an attempt to fast-track a slightly better “customer” VPLS into datacenters, while the IETF folks struggle with actual VPLS standardization. The technology is “CE-centric”, in the sense that it does not require any provider intervention, with the exception of providing multicast and L3 services. It is most likely that the OTV and VPLS projects are being carried out by different teams that are time-pressed and thus don’t have the resources to coordinate their efforts and come up with a unified solution. There are no huge improvements so far in terms of Ethernet optimization, except for reduced flooding in the network core, traded for control-plane complexity. In its current form, OTV might look a bit disappointing, unless it is a first step toward implementing TRILL (Transparent Interconnection of Lots of Links) – the new IETF standard for routing bridges.

Meet TRILL

Before we begin, it is worth noting that the IETF TRILL project somewhat parallels the development of the IEEE 802.1aq standard. Both propose replacing STP with a link-state routing protocol. We’ll be discussing mainly RBridges in this post, due to the fact that more open information is available on TRILL. Plus, IETF papers are much easier to read than IEEE documents!

As mentioned above, RBridges is the short name for Routing Bridges, a project pioneered by Radia Perlman of Sun Microsystems. If you remember, she is the person who invented the original (DEC) STP protocol. In the new proposal, all bridges become aware of each other and of the whole topology by using the IS-IS routing protocol. Every bridge has a “nickname” – a 16-bit identifier which addresses the bridge in the global topology (similar to an OSPF router-id). Once the topology is built, the switches operate as follows:

[Figure: TRILL/RBridges topology overview]

  • RBridges dynamically learn MAC addresses and associate them with the respective VLANs on the customer-facing ports. When a frame needs to be switched out, the RBridge looks up the destination MAC address and finds the remote bridge nickname associated with this MAC. The frame is then encapsulated using three headers. The intermediate RBridges then use the TRILL header for actual hop-by-hop routing to the egress RBridge. Notice that the outer header contains the MAC addresses of two directly connected RBridges, just as it would in the case of routing. The TRILL header has a hop-count field that is similar to the IP TTL. Every bridge in the path decrements this field and drops the frame if the time to live falls to zero, just like conventional routers do.
  • If the destination address is not known to the ingress RBridge, the frame is flooded in a controlled manner, using a special destination TRILL address. The shortest-path trees constructed from the link-state information are used for flooding, and an RPF check is performed on every RBridge to eliminate transient routing loops. It is important to point out that a transit RBridge on the path of the flooded frame does not learn the inner MAC address unless it has the inner VLAN locally configured. Thus, transit RBridges limit their address-table sizes to only the addresses of other RBridges. Every egress RBridge receiving the TRILL-encapsulated frame will learn the inner source MAC address and associate it with the sending RBridge's nickname. This allows the egress RBridges to properly route the response frames.
  • Any link failure will result in SPF topology re-computation but will not cause MAC address table flushes, unless the particular part of the network is completely isolated by the failure. This is due to the fact that destination MAC addresses are not associated with a local port but with the remote RBridge that has the destination MAC connected. The RBridges simply recalculate the routes to the destination. This behavior is a direct result of the topology-aware learning process. The net result is that topology changes do not have the devastating effect they have on STP-based networks.
  • On a segment that has multiple RBridges connected, one is elected as the “appointed forwarder” for every VLAN. Only the appointed forwarder is allowed to send/receive frames on the shared link, while all others remain standby. This is required to ensure loop-free forwarding and avoid excessive flooding. This is very similar to the OTV behavior that we described above.
  • In addition to data-plane MAC address learning, RBridges may explicitly propagate MAC addresses found on a locally connected VLAN to all other RBridges. This is an optional feature, but it allows for faster convergence and less flooding in the topology. The protocol is known as ESADI (End Station Address Distribution Information) and looks very similar to the control-plane learning found in OTV.
  • The use of shortest-path routing allows for equal-cost multipath load balancing and the traffic engineering features found in routed IP networks, thus resolving the problem of STP bottlenecks. Additionally, the use of appointed forwarders for every VLAN ensures better load balancing for multi-homed segments – again, a feature copied by OTV.

To summarize, TRILL keeps intact Ethernet’s dynamic data-plane learning, the feature that made the technology so “plug-and-play”. However, the amount of flooding is now controlled by the use of distribution trees and hop counting. The net effect of flooding is significantly reduced, due to the fact that topology changes do not flush the MAC address tables. Load balancing is much more effective and deterministic in TRILL networks. TRILL networks are also easier to troubleshoot, as every RBridge associates the “flat” MAC address with a “location” in the network, defined via the remote bridge nickname. Lastly, the problem of address-table growth is somewhat resolved, due to the fact that MAC addresses need not be known on every switch in the domain, but only on the switches that actually connect to the end equipment.

Even though TRILL offers some benefits, MAC address learning and frame flooding remain. Furthermore, the problem of address-table growth is not fully resolved, as TRILL turns it into a “core-edge” address-table asymmetry. If you are looking for a complete solution to all of Ethernet's issues, it is recommended to read the paper “Floodless in SEATTLE” (see Further Reading below), which offers a significantly re-engineered Ethernet technology utilizing the DHTs (Distributed Hash Tables) found in peer-to-peer networks, together with link-state routing. SEATTLE offers truly floodless Ethernet and resolves the address-table growth problem by making the global MAC address table work like a distributed database. Thanks to Daniel Ginsburg aka dbg for referring me to this wonderful reading!

Conclusions

Right now, OTV looks like an attempt to rapidly deploy VPLS functionality without relying on MPLS transport. This is probably driven by the growing need to deploy large data-centers and interconnect them across conventional packet-switched networks at the customer edge. OTV reduces flooding in the network core but makes convergence slower in the presence of topology changes. Multicast traffic is forwarded in an optimal manner using the core network’s multicast services. If OTV is a first step toward TRILL, then it looks like a very promising technology. Otherwise, it is just a VPLS replica with some optimizations. Still, I hope that one day the OTV and VPLS branches will be merged and TRILL will be implemented in one common VPLS framework!

Further Reading

OTV Patent Paper
RBridges Draft Document
SEATTLE Technology
VPLS using LDP Signaling
VPLS using BGP Signaling
Multicasting in VPLS
Multihoming in VPLS
Multicast VPNs using Draft Rosen
Introduction to M-LSPs and Practical Examples

Feb
08

Overview

The purpose of this blog post is to give you a brief overview of M-LSPs, the motivation behind this technology, and some practical examples. We start with Multicast VPNs and then give an overview of the M-LSP implementations based on the M-LDP and RSVP-TE extensions. The practical example is based on the recently added RSVP-TE signaling for establishing P2MP LSPs. All demonstrations could be replicated using the widely available Dynamips hardware emulator. The reader is assumed to have a solid understanding of MPLS technologies, including LDP, RSVP-TE, MPLS/BGP VPNs, and Multicast VPNs.

Multicast Domains Model

For years, the most popular solution for transporting multicast VPN traffic has been native IP multicast in the SP core, coupled with dynamic multipoint IP tunneling to isolate customer traffic. Every VPN is allocated a separate MDT (Multicast Distribution Tree), or domain, mapped to a core multicast group. Regular IP multicast routing is then used for forwarding, with PIM signaling in the core. Notice that the core network has to be multicast-enabled in the Multicast Domains model. This model is quite mature today and allows for effective and stable deployments, even in Inter-AS scenarios. However, for a long time the open question with MVPNs was: could we replace IP tunneling with MPLS label encapsulation? The main motivation for moving to MPLS label switching would be leveraging MPLS traffic engineering and protection features.

Tag Switched Multicast


In the early days of MPLS deployment (mid-to-late 90s) you may have seen Cisco IOS supporting tag switching for multicast traffic. Indeed, you may still find references to these models in some older drafts:

draft-rfced-info-rekhter-00.txt
draft-farinacci-multicast-tag-part-00

There were no ideas about MPLS TE and FRR back then, just some considerations for using tag encapsulation to fast-switch multicast packets. Older IOS versions had support for multicast tag switching, which was rather quickly removed due to compatibility issues (PIM extensions were used for label bindings) and the experimental status of the project. Plus, as time passed it became obvious that a more generic concept was required, as opposed to pure tag-switched multicast.

Multipoint LSP Overview

Instead of trying to create label-switched multicast, the community came up with the idea of Point-to-Multipoint (P2MP) LSPs. Regular, point-to-point (P2P) LSPs are mapped to a single FEC element, which represents a single “destination” (e.g. an egress router, IP prefix, or virtual circuit). On the contrary, a P2MP LSP has multiple destinations but a single root: essentially, a P2MP LSP represents a tree, with a known root and some opaque “selector” value shared by all destinations. This “selector” could be a multicast group address if you map multicast groups to LSPs, but it is quite generic and is not limited to multicast groups. P2MP LSPs can be thought of as a sort of shortest-path tree built toward the root, using label switching for traffic forwarding. The use of P2MP LSPs could be much wider than simple multicast forwarding, as those two concepts are now completely decoupled. One good example of using P2MP LSPs is a VPLS deployment, where the full mesh of P2P pseudowires is replaced with P2MP LSPs rooted at every PE and targeted at all other PEs.
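As a rough mental model, a P2MP LSP can be pictured as a root plus an opaque selector shared by a set of leaves. The sketch below is purely illustrative (the class names and fields are invented, not any vendor's API):

```python
from dataclasses import dataclass, field

# Illustrative model only: a P2P LSP maps to a FEC naming one destination,
# while a P2MP LSP is identified by its root plus an opaque "selector"
# value shared by all leaves.
@dataclass(frozen=True)
class P2PFec:
    destination: str              # e.g. an egress router or IP prefix

@dataclass
class P2mpLsp:
    root: str                     # the single, known root of the tree
    selector: bytes               # opaque value, e.g. an encoded group address
    leaves: set = field(default_factory=set)

    def add_leaf(self, leaf: str) -> None:
        # Any node sharing the (root, selector) pair joins the same tree
        self.leaves.add(leaf)

lsp = P2mpLsp(root="R1", selector=b"232.1.1.1")
for leaf in ("R2", "R3", "R4"):
    lsp.add_leaf(leaf)
# One root, many destinations: the defining property of a P2MP LSP
```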

In addition to P2MP, the concept of the multipoint-to-multipoint (MP2MP) LSP was introduced. An MP2MP LSP has a single "shared" root just like a P2MP LSP, but traffic can flow upstream to the root and downstream from the root. Every leaf node may send traffic to every other leaf node, using the upstream LSP toward the root and the downstream P2MP LSP to the other leaves. This is in contrast to P2MP LSPs, where only the root sends traffic downstream. MP2MP LSPs can be thought of as equivalents of the bidirectional trees found in PIM. Collectively, P2MP and MP2MP LSPs are named multipoint LSPs.

To complete this overview, we list several specific terms used with M-LSPs to describe the tree components:

  • Headend (root) router – the router that originates the P2MP LSP
  • Tailend (leaf) router – the router that terminates the P2MP LSP
  • Midpoint router – the router that switches the P2MP LSP but does not terminate it
  • Bud router – the router that is a leaf and a midpoint at the same time
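These roles can be distinguished mechanically from a node's position on the tree. A hedged Python sketch (the function and its inputs are invented for illustration; real devices track roles per sub-LSP):

```python
def lsp_role(node: str, root: str, downstream: set, leaves: set) -> str:
    """Classify a router's role on a P2MP LSP.

    downstream: this node's child routers on the tree;
    leaves: all routers that terminate the LSP.
    """
    if node == root:
        return "headend"
    if node in leaves and downstream:
        return "bud"          # leaf and midpoint at the same time
    if node in leaves:
        return "tailend"
    return "midpoint"

# In the lab scenario later in this post, R2 terminates the LSP locally
# AND switches it toward R3, making it a bud node:
leaves = {"R2", "R3", "R4"}
assert lsp_role("R2", "R1", {"R3"}, leaves) == "bud"
assert lsp_role("R3", "R1", set(), leaves) == "tailend"
```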

M-LDP Solution

Multipoint Extensions to LDP, defined in draft-ietf-mpls-ldp-p2mp-04, add the notion of P2MP and MP2MP FEC elements to LDP (defined in RFC 3036), along with the respective capability advertisements, label mapping and signaling procedures. The new FEC elements carry the LSP root, which is an IPv4 address, and a special “opaque” value (the “selector” mentioned above), which groups together the leaf nodes sharing the same opaque value. The opaque value is "transparent" to the intermediate nodes, but has meaning for the root and possibly the leaves. Every LDP node advertises its local incoming label binding to the upstream LDP node on the shortest path to the root IPv4 address found in the FEC. The upstream node receiving the label bindings creates its own local label(s) and outgoing interfaces. This label allocation process may result in packet replication if there are multiple outgoing branches. It is important to notice that an LDP node will merge the label bindings for the same opaque value if it finds downstream nodes sharing the same upstream node. This allows for effective building of P2MP LSPs and label conservation.

p2mp-lsp-primer-mldp-signaling

Pay attention to the fact that leaf nodes must NOT allocate an implicit-null label for the P2MP FEC element, i.e. they must disable PHP behavior. This is important to allow the leaf node to learn the context of an incoming packet and properly recognize it as coming from a P2MP LSP. The leaf node may then perform an RPF check or properly select the outgoing processing for the packet based on the opaque information associated with the context. This means that the LDP implementation must support context-specific label spaces, as defined in draft-ietf-mpls-upstream-label-02. The need for context-specific processing arises from the fact that the M-LSP type is generic, and therefore the application is not directly known from the LSP itself. Notice that RFC 3036 previously defined two "contexts", known as label spaces: the "interface" and "platform" label spaces.
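The M-LDP merge rule described in this section can be condensed into a toy model (the data structures are invented for illustration, not the draft's actual TLV processing): a node allocates one local label per (root, opaque) FEC and simply adds a replication branch when another downstream neighbor binds to the same FEC.

```python
from itertools import count

_labels = count(100)                    # toy label allocator
bindings = {}                           # (root, opaque) -> per-FEC state

def receive_mapping(root, opaque, downstream_iface, downstream_label):
    """Process a downstream P2MP label mapping; return the label this
    node advertises upstream toward the root."""
    fec = (root, opaque)
    if fec not in bindings:
        bindings[fec] = {"in_label": next(_labels), "branches": []}
    entry = bindings[fec]
    # A second downstream neighbor for the same FEC adds a replication
    # branch, NOT a new upstream label: this is the merge.
    entry["branches"].append((downstream_iface, downstream_label))
    return entry["in_label"]

# Two leaves behind different interfaces join the same tree:
l1 = receive_mapping("10.1.1.1", b"232.1.1.1", "Se2/0.23", 20)
l2 = receive_mapping("10.1.1.1", b"232.1.1.1", "Se2/0.26", 21)
assert l1 == l2                         # single label advertised upstream
```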

As we mentioned, MP2MP LSPs are also defined within the scope of the M-LDP draft. Signaling for MP2MP LSPs is a bit more complicated, as it involves building a downstream P2MP LSP from the shared root and upstream LSPs from every leaf node toward the same root. You may find the detailed procedure description in the draft document mentioned above.

Just like the multicast multipath procedure, M-LDP supports equal-cost load splitting in situations where the root node is reachable via multiple upstream paths. In this case, the upstream LDP node is selected using a hash function, resulting in load splitting similar to multicast multipath. You may want to read the following document to get a better idea of multicast multipath load splitting: Multicast Load Splitting
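A minimal sketch of such hash-based selection follows (the hash inputs are an assumption; the actual IOS function is not documented). Hashing the FEC over the candidate upstream neighbors keeps each tree on a single path while spreading different trees across the available paths:

```python
import hashlib

def pick_upstream(root: str, opaque: bytes, candidates: list) -> str:
    """Deterministically pick one upstream LDP neighbor for a FEC."""
    key = root.encode() + b"/" + opaque
    h = int.from_bytes(hashlib.sha256(key).digest()[:4], "big")
    return sorted(candidates)[h % len(candidates)]

upstreams = ["10.1.12.1", "10.1.15.1"]
# The same (root, opaque) FEC always maps to the same neighbor, so a
# given tree never splits; different trees may land on different paths.
choice = pick_upstream("10.1.1.1", b"232.1.1.1", upstreams)
assert choice == pick_upstream("10.1.1.1", b"232.1.1.1", upstreams)
assert choice in upstreams
```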

M-LDP Features

It is important to notice that LDP is the only protocol required to build M-LSPs (along with the respective IGP, of course). Even if you deploy multicast applications on top of M-LSPs, you don’t need to run PIM in the network core, like you did with the original tag-switched multicast or the Multicast Domains model. M-LDP allows for dynamic signaling of M-LSPs, irrespective of their nature. The biggest benefit is the reduction of specific signaling protocols in the network core. However, the main problem is that traffic engineering is not native to M-LDP (CR-LDP is R.I.P.). That said, it seems that Cisco now supports MPLS TE for M-LDP signaled LSPs, at least judging from the newly introduced command set. Also, IP Fast Reroute is supported for LDP-signaled LSPs, but this feature is not as mature and widely deployed as the RSVP-TE implementations. If you are interested in IP FRR procedures, you may consult the documents found under the respective IETF working group. As for implementations, later versions of Cisco IOS XR do support IP FRR.

M-LDP Applications

First and foremost, M-LDP could be used for multicast VPNs, replacing multipoint GRE encapsulation and eliminating the need for multicast routing and PIM in the SP network core. Taking the classic MDT domains model, the default MDT could be replaced with a single MP2MP LSP connecting all participating PEs, and data MDTs could be mapped to P2MP LSPs signaled toward the participating PEs. The opaque value used could be the group IP address coupled with the respective RD, allowing for effective M-LSP separation in the SP core.

There is another, more direct approach to multicast VPNs with M-LDP. Instead of using the MDT trees, a customer tree could be signaled explicitly across the SP core. When a CE node sends a PIM Join toward an (S,G) where S is a VPN address, the PE router will instruct M-LDP to construct the M-LSP toward the root that is the VPNv4 next hop for S, using the opaque value of (S,G,RD), which is passed across the core. The PE node that roots the M-LSP interprets the opaque value and generates a PIM Join toward the actual source S.

p2mp-lsp-primer-direct-mdt-model

This model is known as Direct MDT and is more straightforward than the Domains model with respect to the signaling used. However, the drawback is excessive generation of MPLS forwarding state in the network core. A solution to overcome this limitation would be to use hierarchical LSPs, another extension to LDP that is being developed. We may expect to see Inter-AS M-LSP extensions in the near future too, modeled after the BGP MDT SAFI (e.g. opaque information propagation) and RPF Proxy Vector features (for finding the correct next hop toward an M-LSP root in another AS). If you want to find out more about those technologies, you may read the following document: Inter-AS mVPNs: MDT SAFI, BGP Connector and RPF Proxy Vector.
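For illustration, the (S, G, RD) opaque value carried across the core in the Direct MDT model might be packed as follows. The field layout here is invented for the sketch; the draft defines its own TLV encoding:

```python
import ipaddress
import struct

def encode_opaque(source: str, group: str, rd: str) -> bytes:
    """Pack (S, G, RD) into an opaque blob. Only the root and the leaves
    parse it; intermediate nodes treat it as transparent bytes.
    Assumes the 2-byte-ASN form of the RD, e.g. "100:1"."""
    s = int(ipaddress.IPv4Address(source))
    g = int(ipaddress.IPv4Address(group))
    asn, idx = (int(x) for x in rd.split(":"))
    return struct.pack("!IIHI", s, g, asn, idx)

opaque = encode_opaque("192.168.1.10", "232.1.1.1", "100:1")
# The rooting PE decodes the blob to generate a PIM Join toward S:
s, g, asn, idx = struct.unpack("!IIHI", opaque)
assert str(ipaddress.IPv4Address(g)) == "232.1.1.1"
```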

As mentioned above, multicast is not the only application that could benefit from M-LSPs. Any application that requires traffic flooding or multipoint connectivity is a good candidate for using M-LSPs. As we mentioned, VPLS is the next obvious candidate, and using M-LSPs there may result in optimal broadcast and multicast forwarding in VPLS deployments. You may see the VPLS multicast draft for more details: draft-ietf-l2vpn-vpls-mcast-05

Summary

M-LDP is not yet completely standardized, even though the work has been in progress for many years. Experimental Cisco IOS trains support this feature, but they are not yet available to the general public. The main feature of M-LDP signaled M-LSPs is that they don’t require any protocol other than LDP, are dynamically initiated by leaf nodes, and are very generic with respect to the applications that can use them.

RSVP-TE Point-to-Multipoint LSPs

Along with the still-in-progress M-LDP draft, there is another standard technique for signaling P2MP LSPs, using the RSVP protocol. This technique has long been employed in Juniper's SP multicast solution and thus is quite mature compared to M-LDP. Recently, RSVP-signaled P2MP LSPs have been officially added to many Cisco IOS trains, including the IOS version for the 7200 platform. The RSVP extensions are officially standardized in RFC 4875.

Signaling Procedures

This standard employs the concept of P2MP LSPs statically signaled at the head-end router using a fixed list of destinations (leaves). Every destination defines a so-called “sub-LSP” component of the P2MP LSP. A sub-LSP is an LSP originating at the root and terminating at a particular leaf. RSVP signaling for P2MP sub-LSPs is based on sending RSVP PATH messages to all destinations (sub-LSPs) independently, though multiple sub-LSP descriptions could be carried in a single PATH message. Every leaf node then sends RSVP RESV messages back to the root node of the P2MP LSP, just like in regular RSVP-TE. However, the nodes where multiple RSVP RESV messages “merge” should allocate only a single label for those “merging” sub-LSPs and advertise it to the upstream node, using either multiple RESV messages or a single one. Effectively, this “branches” the sub-LSPs out of the “trunk”, even though signaling for the sub-LSPs is performed separately. The whole procedure is somewhat similar to the situation where a source uses RSVP for bandwidth reservation for a multicast traffic flow, with flow state merging at the branching point. This is no surprise, as the P2MP LSP idea was modeled after multicast distribution shortest-path trees.
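The branching behavior can be sketched as follows (a toy model, not RFC 4875 message processing): sub-LSPs of the same P2MP session whose RESV messages arrive over the same downstream hop get a single label at the branch point.

```python
def merge_resv(sub_lsps):
    """sub_lsps: list of (destination, downstream_next_hop) pairs for one
    P2MP session. Returns the labels allocated per downstream hop."""
    labels = {}
    next_label = iter(range(19, 1000))   # toy label allocator
    for dest, hop in sub_lsps:
        # A second sub-LSP over an already-labeled hop reuses that label:
        # this is the merge that branches sub-LSPs out of the "trunk".
        if hop not in labels:
            labels[hop] = next(next_label)
    return labels

# R1's three sub-LSPs in the lab scenario below: R2 and R3 are reached
# via the same first hop, R4 via a different one.
out = merge_resv([("10.1.2.2", "10.1.12.2"),
                  ("10.1.3.3", "10.1.12.2"),
                  ("10.1.4.4", "10.1.14.4")])
assert len(out) == 2          # two labels cover three sub-LSPs
```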

p2mp-lsp-primer-rsvp-te-signaling

The only “edge” (that is, PE-CE) multicast tree construction protocol supported with RSVP-TE M-LSPs is PIM SSM, as static M-LSPs do not support dynamic source discovery by means of an RP. Plus, as we mentioned above, every leaf node allocates a non-null label for the incoming sub-LSP tunnel. This allows the leaf router to associate incoming packets with a particular M-LSP and properly perform the RPF check using the tunnel source address. Cisco IOS creates a special virtual interface known as LSPVIF and associates the multicast RPF route toward the M-LSP tunnel source with this interface. Every new M-LSP will create another LSPVIF interface and another static mroute. In addition to this, you need to manually configure static multicast routes for the multicast sources that will be using this tunnel and associate them with the tunnel’s source address. An example will be shown later.

Notice that RSVP-signaled M-LSPs are currently supported only within a single IGP area and there is no support for Inter-AS M-LSPs. However, we may expect those extensions in the future, as they exist for P2P LSPs.

Benefits, Limitations and Applications

The huge benefit of RSVP-signaled P2MP LSPs is that every sub-LSP can be constructed using CSPF, honoring bandwidth, affinity and other constraints. You may also manually specify explicit paths in situations where you use an offline traffic engineering strategy. The actual merging of the sub-LSPs allows for accurate bandwidth accounting, avoiding duplicate reservations for the same flow. The second biggest benefit is that since every sub-LSP is signaled explicitly via RSVP, the whole P2MP LSP construct can be natively protected using RSVP FRR mechanics per RFC 4090. Finally, just like M-LDP, RSVP-signaled M-LSPs do not require running any other signaling protocol, such as PIM, in the SP network core.

In general, RSVP signaled P2MP LSPs are an excellent solution for multicast IP-TV deployments, as they allow for accurate bandwidth reservation and admission control along with FRR protection. Plus, this solution has been around for quite a while, meaning it has some maturity, as compared to the developing M-LDP. However, the main drawback of RSVP-based solution is the static nature of RSVP-TE signaled P2MP LSPs, which limits their use for Multicast VPN deployments. You may combine P2MP LSPs with Domain-based Multicast VPN solution, but this limits your M-VPNs to a single Default MDT tunneled inside the P2MP LSP.

In addition to the IP-TV application, RSVP-TE M-LSPs form a solution to improve multicast handling in VPLS. Effectively, since VPLS by its nature does not involve dynamic setup of pseudowires, the static M-LSP tunnels could be used for optimal traffic forwarding across the SP core. There is an IETF draft outlining this solution: draft-raggarwa-l2vpn-vpls-mcast-01.

M-LSP RSVP-TE Lab Scenario

For a long time, Cisco held off on releasing support for RSVP-TE signaled M-LSPs to the wider audience, probably because the main competitor (Juniper) actively supported and developed this solution. However, there was an implementation used for limited deployments, recently made public with IOS version 12.2(33)SRE. The support now includes even the low-end platforms, such as the 7200VXR, which allows using Dynamips for lab experiments. We will now review a practical example of P2MP LSP setup using the following topology:

p2mp-lsp-primer-lab-scenario

Here R1 sets up an M-LSP to R2, R3 and R4 and configures the tunnel to send multicast traffic down to those nodes. Notice a few details about this scenario. Every node is configured to enable automatic primary one-hop tunnels and NHOP backup tunnels. This is a common way of protecting links in the SP core, which allows for automatic unicast traffic protection. If you are not familiar with this feature, you may read the following document: MPLS TE Auto-Tunnels.

The diagram above displays some of the detour LSPs that are automatically created to protect the “peripheral” links. In the topology outlined, every node will originate three primary tunnels and three backup NHOP tunnels. First, look at the template configuration shared by all routers:

Shared Template

!
! Set up auto-tunnels on all routers
!
mpls traffic-eng tunnels
mpls traffic-eng auto-tunnel backup nhop-only
mpls traffic-eng auto-tunnel primary onehop
!
! Enable multicast traffic routing over MPLS TE tunnels
!
ip multicast mpls traffic-eng
!
! Enable fast route withdrawal when an interface goes down
!
ip routing protocol purge interface
!
router ospf 1
network 0.0.0.0 0.0.0.0 area 0
mpls traffic-eng area 0
mpls traffic-eng router-id Loopback0
mpls traffic-eng multicast-intact
!
! The below commands are only needed on leaf/root nodes: R1-R4
!
ip multicast-routing
ip pim ssm default

Notice a few interesting commands in this configuration:

  • The command ip multicast mpls traffic-eng enables multicast traffic to be routed over M-LSP tunnels. This still requires configuring a static IGMP join on the M-LSP tunnel interface, as the tunnel is unidirectional.
  • The command ip routing protocol purge interface is needed on leaf nodes that perform the RPF check. When a connected interface goes down, IOS normally runs a special walker process that goes through the RIB and deletes the affected routes. This may take considerable time and affect RPF checks if the sub-LSP was rerouted over a different incoming interface. This command allows for fast purging of the RIB by notifying the affected IGPs and letting the protocol processes quickly clear the associated routes.
  • The command mpls traffic-eng multicast-intact is very important in our context, as we activate MPLS TE tunnels and enable auto-route announcement. By default, this may affect the RPF information for the M-LSP tunnel source and force the leaf nodes to drop the tunneled multicast packets. This command should be configured along with the static multicast routes, as shown below.

Head-End configuration

R1:
hostname R1
!
interface Loopback 0
ip address 10.1.1.1 255.255.255.255
!
interface Serial 2/0
no shutdown
encapsulation frame-relay
no frame-relay inverse-arp
!
interface Serial 2/0.12 point-to-point
frame-relay interface-dlci 102
ip address 10.1.12.1 255.255.255.0
bandwidth 10000
mpls traffic-eng tunnels
ip rsvp bandwidth 10000
!
interface Serial 2/0.15 point-to-point
frame-relay interface-dlci 105
ip address 10.1.15.1 255.255.255.0
bandwidth 10000
mpls traffic-eng tunnels
ip rsvp bandwidth 10000
!
interface Serial 2/0.14 point-to-point
frame-relay interface-dlci 104
ip address 10.1.14.1 255.255.255.0
bandwidth 10000
mpls traffic-eng tunnels
ip rsvp bandwidth 10000
!
interface FastEthernet 0/0
ip address 172.16.1.1 255.255.255.0
ip pim passive
no shutdown
!
mpls traffic-eng destination list name P2MP_TO_R2_R3_R4_DYN
ip 10.1.2.2 path-option 10 dynamic
ip 10.1.3.3 path-option 10 dynamic
ip 10.1.4.4 path-option 10 dynamic
!
interface Tunnel 0
description R1 TO R2, R3, R4
ip unnumbered Loopback0
ip pim passive
ip igmp static-group 232.1.1.1 source 172.16.1.1
tunnel mode mpls traffic-eng point-to-multipoint
tunnel destination list mpls traffic-eng name P2MP_TO_R2_R3_R4_DYN
tunnel mpls traffic-eng priority 7 7
tunnel mpls traffic-eng bandwidth 5000
tunnel mpls traffic-eng fast-reroute
  • First, notice that PIM is not enabled on any of the core-facing interfaces. This is because M-LSP signaling does not require any multicast setup in the core. Secondly, notice the use of the command ip pim passive. This long-awaited command enables multicast forwarding on an interface but does not allow any PIM adjacencies to be established.
  • The command mpls traffic-eng destination list defines multiple destinations to be used for the M-LSP. Every destination can have one or more path options. Every path option can be either dynamic or explicit; here we use only dynamic path options for every destination.
  • Notice the M-LSP tunnel configuration. The tunnel mode is set to multipoint and the destination is set to the list created above. Pay attention that PIM passive mode is enabled on the tunnel interface, along with a static IGMPv3 join. This allows the specific multicast flow to be routed down the M-LSP.
  • Lastly, pay attention to the fact that the tunnel is configured with TE attributes, such as bandwidth, and enabled for FRR protection. The syntax used is the same as for P2P LSPs.

Midpoint and Tail-End Configuration

R2:
hostname R2
!
interface Loopback 0
ip address 10.1.2.2 255.255.255.255
!
interface Serial 2/0
no shutdown
encapsulation frame-relay
no frame-relay inverse-arp
!
interface Serial 2/0.12 point-to-point
frame-relay interface-dlci 201
ip address 10.1.12.2 255.255.255.0
bandwidth 10000
mpls traffic-eng tunnels
ip rsvp bandwidth 10000
!
interface Serial 2/0.26 point-to-point
frame-relay interface-dlci 206
ip address 10.1.26.2 255.255.255.0
bandwidth 10000
mpls traffic-eng tunnels
ip rsvp bandwidth 10000
!
interface Serial 2/0.23 point-to-point
frame-relay interface-dlci 203
ip address 10.1.23.2 255.255.255.0
bandwidth 10000
mpls traffic-eng tunnels
ip rsvp bandwidth 10000
!
interface FastEthernet 0/0
ip address 172.16.2.1 255.255.255.0
ip pim passive
ip igmp version 3
ip igmp join-group 232.1.1.1 source 172.16.1.1
no shutdown
!
ip mroute 172.16.1.0 255.255.255.0 10.1.1.1

The only special command added here is the static multicast route. It provides RPF information for the multicast sources that will be sending traffic across the M-LSP. Notice that the mroute points toward the respective tunnel source, which is mapped to the automatically created LSPVIF interface.
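A hedged sketch of what the static mroute achieves (the data structures are illustrative, not IOS internals): the source prefix resolves to the tunnel source 10.1.1.1, which IOS in turn maps to the auto-created Lspvif0 interface, so tunneled packets pass the RPF check.

```python
import ipaddress

# Static mroute from the R2 config: sources in 172.16.1.0/24 have RPF
# neighbor 10.1.1.1 (the M-LSP tunnel source).
static_mroutes = [(ipaddress.ip_network("172.16.1.0/24"), "10.1.1.1")]
lspvif_by_tunnel_source = {"10.1.1.1": "Lspvif0"}   # created automatically

def rpf_interface(source: str):
    """Return the expected incoming interface for a multicast source,
    or None (meaning an RPF failure and a dropped packet)."""
    addr = ipaddress.ip_address(source)
    for net, rpf_neighbor in static_mroutes:
        if addr in net:
            return lspvif_by_tunnel_source.get(rpf_neighbor)
    return None

assert rpf_interface("172.16.1.1") == "Lspvif0"     # passes RPF
assert rpf_interface("10.99.0.1") is None           # would be dropped
```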

M-LSP RSVP-TE Scenario Verifications

The following is a list of the steps you may apply to verify the scenario configuration.

Step 1

Verify M-LSP setup and labeling

R1#show mpls traffic-eng tunnels dest-mode p2mp brief 
Signalling Summary:
LSP Tunnels Process: running
Passive LSP Listener: running
RSVP Process: running
Forwarding: enabled
auto-tunnel:
backup Enabled (3 ), id-range:65436-65535
onehop Enabled (3 ), id-range:65336-65435
mesh Disabled (0 ), id-range:64336-65335

Periodic reoptimization: every 3600 seconds, next in 515 seconds
Periodic FRR Promotion: Not Running
Periodic auto-tunnel:
primary establish scan: every 10 seconds, next in 3 seconds
primary rm active scan: disabled
backup notinuse scan: every 3600 seconds, next in 640 seconds
Periodic auto-bw collection: every 300 seconds, next in 215 seconds

P2MP TUNNELS:
DEST CURRENT
INTERFACE STATE/PROT UP/CFG TUNID LSPID
Tunnel0 up/up 3/3 0 28
Displayed 1 (of 1) P2MP heads

P2MP SUB-LSPS:
SOURCE TUNID LSPID DESTINATION SUBID STATE UP IF DOWN IF
10.1.1.1 0 28 10.1.2.2 1 Up head Se2/0.12
10.1.1.1 0 28 10.1.3.3 2 Up head Se2/0.12
10.1.1.1 0 28 10.1.4.4 3 Up head Se2/0.14

Displayed 3 P2MP sub-LSPs:
3 (of 3) heads, 0 (of 0) midpoints, 0 (of 0) tails

There are three sub-LSPs going out of the head-end node, toward R2, R3 and R4. You may now check the detailed information, including the outgoing labels assigned to every sub-LSP:

R1#show mpls traffic-eng tunnels dest-mode p2mp

P2MP TUNNELS:

Tunnel0 (p2mp), Admin: up, Oper: up
Name: R1 TO R2, R3, R4

Tunnel0 Destinations Information:

Destination State SLSP UpTime
10.1.2.2 Up 00:51:08
10.1.3.3 Up 00:51:08
10.1.4.4 Up 00:51:08

Summary: Destinations: 3 [Up: 3, Proceeding: 0, Down: 0 ]
[destination list name: P2MP_TO_R2_R3_R4_DYN]

History:
Tunnel:
Time since created: 53 minutes, 16 seconds
Time since path change: 51 minutes, 5 seconds
Number of LSP IDs (Tun_Instances) used: 28
Current LSP: [ID: 28]
Uptime: 51 minutes, 8 seconds
Selection: reoptimization
Prior LSP: [ID: 20]
Removal Trigger: re-route path verification failed

Tunnel0 LSP Information:
Configured LSP Parameters:
Bandwidth: 5000 kbps (Global) Priority: 7 7 Affinity: 0x0/0xFFFF
Metric Type: TE (default)

Session Information
Source: 10.1.1.1, TunID: 0

LSPs
ID: 28 (Current), Path-Set ID: 0x90000003
Sub-LSPs: 3, Up: 3, Proceeding: 0, Down: 0

Total LSPs: 1

P2MP SUB-LSPS:

LSP: Source: 10.1.1.1, TunID: 0, LSPID: 28
P2MP ID: 0, Subgroup Originator: 10.1.1.1
Name: R1 TO R2, R3, R4
Bandwidth: 5000, Global Pool

Sub-LSP to 10.1.2.2, P2MP Subgroup ID: 1, Role: head
Path-Set ID: 0x90000003
OutLabel : Serial2/0.12, 19
Next Hop : 10.1.12.2
FRR OutLabel : Tunnel65436, 19
Explicit Route: 10.1.12.2 10.1.2.2
Record Route (Path): NONE
Record Route (Resv): 10.1.2.2(19)

Sub-LSP to 10.1.3.3, P2MP Subgroup ID: 2, Role: head
Path-Set ID: 0x90000003
OutLabel : Serial2/0.12, 19
Next Hop : 10.1.12.2
FRR OutLabel : Tunnel65436, 19
Explicit Route: 10.1.12.2 10.1.23.3 10.1.3.3
Record Route (Path): NONE
Record Route (Resv): 10.1.2.2(19) 10.1.3.3(20)

Sub-LSP to 10.1.4.4, P2MP Subgroup ID: 3, Role: head
Path-Set ID: 0x90000003
OutLabel : Serial2/0.14, 19
Next Hop : 10.1.14.4
FRR OutLabel : Tunnel65437, 19
Explicit Route: 10.1.14.4 10.1.4.4
Record Route (Path): NONE
Record Route (Resv): 10.1.4.4(19)

In the output above, notice that the sub-LSPs to R2 and R3 share the same label and outgoing interface, meaning that R1 has merged the two sub-LSPs. The output also shows that every sub-LSP is protected by means of an FRR NHOP tunnel. You may also notice the explicit paths for every sub-LSP, which have been calculated at the head-end based on the area topology and constraints. Next, check the multicast routing table on R1 to make sure the M-LSP is used as an outgoing interface for multicast traffic.

R1#show ip mroute 232.1.1.1 
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
V - RD & Vector, v - Vector
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(172.16.1.1, 232.1.1.1), 00:59:57/stopped, flags: sTI
Incoming interface: FastEthernet0/0, RPF nbr 0.0.0.0
Outgoing interface list:
Tunnel0, Forward/Sparse-Dense, 00:58:17/00:01:42

R1#show ip mfib 232.1.1.1
Entry Flags: C - Directly Connected, S - Signal, IA - Inherit A flag,
ET - Data Rate Exceeds Threshold, K - Keepalive
DDE - Data Driven Event, HW - Hardware Installed
I/O Item Flags: IC - Internal Copy, NP - Not platform switched,
NS - Negate Signalling, SP - Signal Present,
A - Accept, F - Forward, RA - MRIB Accept, RF - MRIB Forward
Forwarding Counts: Pkt Count/Pkts per second/Avg Pkt Size/Kbits per second
Other counts: Total/RPF failed/Other drops
I/O Item Counts: FS Pkt Count/PS Pkt Count
Default
(172.16.1.1,232.1.1.1) Flags:
SW Forwarding: 0/0/0/0, Other: 0/0/0
FastEthernet0/0 Flags: A
Tunnel0 Flags: F NS
Pkts: 0/0

Step 2

Look at R2, which is a bud router in this scenario. It acts as a midpoint for the sub-LSP from R1 to R3 and as a tail-end for the sub-LSP to R2. The output below demonstrates this behavior:

R2#show mpls traffic-eng tunnels dest-mode p2mp 

P2MP TUNNELS:

P2MP SUB-LSPS:

LSP: Source: 10.1.1.1, TunID: 0, LSPID: 28
P2MP ID: 0, Subgroup Originator: 10.1.1.1
Name: R1 TO R2, R3, R4
Bandwidth: 5000, Global Pool

Sub-LSP to 10.1.2.2, P2MP Subgroup ID: 1, Role: tail
Path-Set ID: 0x8000003
InLabel : Serial2/0.12, 19
Prev Hop : 10.1.12.1
OutLabel : -
Explicit Route: NONE
Record Route (Path): NONE
Record Route (Resv): NONE

Sub-LSP to 10.1.3.3, P2MP Subgroup ID: 2, Role: midpoint
Path-Set ID: 0x8000003
InLabel : Serial2/0.12, 19
Prev Hop : 10.1.12.1
OutLabel : Serial2/0.23, 20
Next Hop : 10.1.23.3
FRR OutLabel : Tunnel65437, 20
Explicit Route: 10.1.23.3 10.1.3.3
Record Route (Path): NONE
Record Route (Resv): 10.1.3.3(20)

The output shows that the sub-LSP to R2 terminates locally, while the sub-LSP to R3 is label-switched using incoming label 19 and outgoing label 20. This means incoming packets are decapsulated locally and label-switched in parallel. The output below demonstrates that the source 172.16.1.1 is seen on R2 as connected to LSPVIF0, by virtue of the static multicast route configured. The received packets are replicated out of the connected FastEthernet interface.
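R2's bud behavior can be condensed into one function (a toy LFIB model, not IOS code): a single incoming labeled packet yields both a locally decapsulated copy and a label-swapped copy toward R3.

```python
def bud_forward(in_label, packet, lfib):
    """Replicate a labeled packet per the LFIB entry: terminate locally
    (tail role) and/or swap-and-forward on each branch (midpoint role)."""
    entry = lfib[in_label]
    copies = []
    if entry.get("terminate"):
        copies.append(("local", packet))             # decapsulate to MFIB
    for iface, out_label in entry.get("branches", []):
        copies.append((iface, (out_label, packet)))  # swap and replicate
    return copies

# R2's state from the output above: label 19 terminates locally AND is
# switched out Se2/0.23 with label 20.
lfib = {19: {"terminate": True, "branches": [("Se2/0.23", 20)]}}
copies = bud_forward(19, b"mcast-data", lfib)
assert ("local", b"mcast-data") in copies
assert ("Se2/0.23", (20, b"mcast-data")) in copies
```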

R2#show ip mroute 232.1.1.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
L - Local, P - Pruned, R - RP-bit set, F - Register flag,
T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
U - URD, I - Received Source Specific Host Report,
Z - Multicast Tunnel, z - MDT-data group sender,
Y - Joined MDT-data group, y - Sending to MDT-data group,
V - RD & Vector, v - Vector
Outgoing interface flags: H - Hardware switched, A - Assert winner
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(172.16.1.1, 232.1.1.1), 01:08:31/00:02:37, flags: sLTI
Incoming interface: Lspvif0, RPF nbr 10.1.1.1, Mroute
Outgoing interface list:
FastEthernet0/0, Forward/Sparse-Dense, 01:08:29/00:02:37

R2#show ip mfib 232.1.1.1
Entry Flags: C - Directly Connected, S - Signal, IA - Inherit A flag,
ET - Data Rate Exceeds Threshold, K - Keepalive
DDE - Data Driven Event, HW - Hardware Installed
I/O Item Flags: IC - Internal Copy, NP - Not platform switched,
NS - Negate Signalling, SP - Signal Present,
A - Accept, F - Forward, RA - MRIB Accept, RF - MRIB Forward
Forwarding Counts: Pkt Count/Pkts per second/Avg Pkt Size/Kbits per second
Other counts: Total/RPF failed/Other drops
I/O Item Counts: FS Pkt Count/PS Pkt Count
Default
(172.16.1.1,232.1.1.1) Flags:
SW Forwarding: 0/0/0/0, Other: 0/0/0
Lspvif0, LSM/0 Flags: A
FastEthernet0/0 Flags: F IC NS
Pkts: 0/0

Step 3

Now it’s time to run a real test. We need a source connected to R1, since ICMP packets process-switched off R1 itself would not be MPLS encapsulated. This seems to be a behavior of the newly implemented MFIB-based multicast switching. Notice that the source uses the IP address 172.16.1.1 for proper SSM multicast forwarding:

R7#ping 232.1.1.1 repeat 100

Type escape sequence to abort.
Sending 100, 100-byte ICMP Echos to 232.1.1.1, timeout is 2 seconds:

Reply to request 0 from 172.16.4.1, 80 ms
Reply to request 0 from 172.16.3.1, 108 ms
Reply to request 1 from 172.16.4.1, 32 ms
Reply to request 1 from 172.16.3.1, 44 ms
Reply to request 1 from 172.16.2.1, 40 ms

If you enable the following debugging on R2 prior to running the pings, you will see output similar to the one below:

R2#debug ip mfib ps
MFIB IPv4 ps debugging enabled for default IPv4 table

R2#debug mpls packet
Packet debugging is on

MPLS turbo: Se2/0.12: rx: Len 108 Stack {19 0 254} - ipv4 data
MPLS les: Se2/0.12: rx: Len 108 Stack {19 0 254} - ipv4 data
MPLS les: Se2/0.23: tx: Len 108 Stack {20 0 253} - ipv4 data

MFIBv4(0x0): Receive (172.16.1.1,232.1.1.1) from Lspvif0 (FS): hlen 5 prot 1 len 100 ttl 253 frag 0x0
FIBfwd-proc: Default:224.0.0.0/4 multicast entry
FIBipv4-packet-proc: packet routing failed

Per the output above, a packet labeled 19 is received and switched out with label 20, in accordance with the output we saw when checking the M-LSP tunnel. You can also see MFIB output demonstrating that a multicast packet was received on the LSPVIF interface. The forwarding fails, as there are no outgoing encapsulation entries, but the router itself responds to the ICMP echo packet.

Summary

We reviewed the concept of M-LSPs: how they evolved from tag-switched multicast into a generic technology suitable for transporting multipoint traffic flows, not limited to multicast only. We briefly outlined the two main signaling methods for establishing M-LSPs: multipoint LDP and the RSVP-TE extensions for P2MP LSPs. Both approaches have their benefits and drawbacks, but only the RSVP-TE method has been widely deployed so far, and it was recently offered to the public by Cisco. Both methods are still evolving toward support for broader application sets, such as multicast VPNs and VPLS.

Further Reading

Many documents have been mentioned throughout this post. They are summarized below, along with some additional reading.

Tag Switching Architecture Overview (Historical)
Partitioning Tag Space among Multicast Routers on a Common Subnet (Historical)
Multipoint Signaling Extensions to LDP
MPLS Upstream Label Assignment and Context Specific Label Space
Multicast Multipath
Multicasting in VPLS
RFC 4875: RSVP-TE Extensions for P2MP LSPs
MPLS TE Auto-Tunnels
MPLS Point-to-Multipoint Traffic Engineering
Using Multipoint LSPs for IP-TV Deployments
Inter-AS mVPNs: MDT SAFI, BGP Connector & RPF Proxy Vector
