Posts Tagged ‘stp’

Mar
14

We are going to discuss Cisco’s proprietary extensions to the STP algorithm, namely UplinkFast and BackboneFast. These two features aim to reduce the time it takes STP to restore the topology after a link failure. While UplinkFast is fairly intuitive, BackboneFast is more complicated.

See more detailed overview at: http://blog.ine.com/wp-content/uploads/2010/04/understanding-stp-rstp-convergence.pdf

UplinkFast



Mar
07

In this post we are going to look into the STP convergence process. Many people understand STP well, yet still struggle with questions like “How many seconds will it take for STP to recover connectivity if a given link fails?”. The post follows the outline below:

1) General overview of STP convergence process
2) How STP converges if a directly connected link fails
3) How STP converges when it detects indirect link failure
4) Topology changes and their effect
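
The kind of timing question mentioned above can be approached with simple timer arithmetic. The sketch below is a rough illustration only: it assumes classic IEEE 802.1D STP with default timers and no Cisco extensions, and the function names are made up for this example.

```python
# Default IEEE 802.1D timers in seconds (assumed here; they are configurable).
FORWARD_DELAY = 15
MAX_AGE = 20

def direct_failure_recovery(forward_delay=FORWARD_DELAY):
    """Root port fails locally: a blocked port walks through listening and learning."""
    return 2 * forward_delay

def indirect_failure_recovery(max_age=MAX_AGE, forward_delay=FORWARD_DELAY):
    """Failure is not local: the stored BPDU has to age out first (worst case),
    then the port walks through listening and learning."""
    return max_age + 2 * forward_delay

print(direct_failure_recovery())    # 30
print(indirect_failure_recovery())  # 50
```

These are worst-case figures; the actual delay depends on the topology and on how old the stored BPDU already is when the failure happens.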

See more detailed overview at: http://blog.ine.com/wp-content/uploads/2010/04/understanding-stp-rstp-convergence.pdf


Sep
24

Note: You may want to read a newer blog post on MSTP here Understanding MSTP

This post continues the previous article dedicated to MSTP operations inside a single region. Before reading any further, make sure you read and fully understand the first part of MSTP overview: MSTP Tutorial Part I: Inside a Region. The information there is critical to understand the new post.

The concept of CIST

As you remember, every MSTP region runs a special instance of spanning tree known as the IST or Internal Spanning Tree (=MSTI0). This instance is active on all links inside a region and serves the purpose of disseminating STP topology information for the other STP instances. As usual, the IST has a root bridge, elected based on the lowest bridge ID (priority/MAC address). However, the situation changes when you have different regions in the network (e.g. switches with different region names, different revisions, etc.). When a switch detects BPDU messages sourced from another region on any link, it marks the link as an MSTP boundary. On the figure below you can see two MSTP regions connected via two separate links (disregard the RSTP region for now).

Now the two regions should build a common spanning tree known as the CIST, the Common and Internal Spanning Tree. This tree is the result of joining the ISTs of each region in a special manner. Here is a detailed description of the process.

1) In addition to sending IST configuration BPDUs, every switch initially declares itself the root of the CIST. The switches pass CIST configuration information along with IST information in additional BPDU fields. Switches inside a region never change the path cost to the CIST Root, known as the CIST External Root Path Cost; this cost changes only on boundary ports. Thus, the external cost accounts only for the cost of boundary links, not for the cost of paths inside a region. Essentially, the CIST External Root Path Cost “tunnels” across a region.

2) On boundary ports, switches exchange their CIST BPDU information ONLY. That is, switches hide IST information from other regions but pass CIST metrics. The usual RSTP synchronization process takes place between switches on a border link, and eventually ONE switch with the lowest BID (Bridge ID = Priority + MAC Address) among all regions is elected as the CIST Root. Note that each region still elects a local IST root, known as the CIST Regional Root, as described below.

3) The region that contains the CIST Root declares this switch the root of its local IST as well. Things differ for regions that do not contain the CIST Root. Each of them elects one of its border switches (switches connected to other regions) as the IST Root (aka CIST Regional Root), based on the lowest CIST External Root Path Cost. Note that this procedure differs from elections based purely on BID, which take place inside a single region. Here BIDs serve only as a tiebreaker when two or more switches have the same CIST External Root Path Cost. MSTP blocks all redundant boundary “uplinks”, marking them as “alternate” paths to the CIST Root. The boundary switches do so by receiving “extra” CIST BPDUs, on top of IST BPDUs, with external root path cost values and comparing them with the ones received on local boundary ports.

4) Inside a region, switches build a regular IST, using the CIST Regional Root as the root of the IST. Note that this tree uses the so-called Internal Root Path Cost stored in the local IST BPDUs. This cost increments along all the links inside a region, but it never leaks out of the region. Between regions, switches exchange information about the CIST External Root Path Cost only.

Note that a switch with a non-optimal priority value may become the CIST Regional Root (local IST Root). For example, a switch configured with the lowest BID inside a region does not necessarily become the CIST Regional Root. Only a switch with the lowest BID among all regions is elected as the CIST Root.
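
The election rule from step 3 can be sketched in a few lines of Python. This is a simplified model, not switch code: the class, the switch names and the cost values are hypothetical, and it assumes the region does not contain the CIST Root.

```python
from typing import NamedTuple

class BoundarySwitch(NamedTuple):
    name: str
    external_cost: int  # CIST External Root Path Cost seen via the boundary port
    priority: int       # bridge priority
    mac: str            # bridge MAC address

def bid(sw):
    """Bridge ID as a comparable tuple: priority first, then MAC address."""
    return (sw.priority, sw.mac.replace(".", "").lower())

def elect_regional_root(candidates):
    """Lowest external cost wins; the BID is only a tiebreaker."""
    return min(candidates, key=lambda sw: (sw.external_cost, bid(sw)))

# Hypothetical region: edge-b has the better (lower) priority,
# but edge-a has the cheaper boundary link towards the CIST Root.
edge_a = BoundarySwitch("edge-a", 100, 32768, "0000.0000.00aa")
edge_b = BoundarySwitch("edge-b", 200000, 16384, "0000.0000.00bb")

print(elect_regional_root([edge_a, edge_b]).name)  # edge-a
```

If both switches saw the same external cost, the comparison would fall through to the BID and edge-b would win.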

The concept of CST and STP interoperation

From the above information, we conclude that the CIST is essentially organized as a two-level hierarchy. The first level treats all regions as “pseudo-bridges” and operates with the External Root Path Cost. The first-level spanning tree is rooted at the CIST Root Bridge and encompasses the pseudo-bridges. It is called the CST or Common Spanning Tree. Effectively, the CST has no idea of the internal structure of MSTP regions and sees each region as a virtual bridge.

This is the point where MSTP interoperates with legacy IEEE STP/RSTP regions as well. The legacy switch regions have no concept of IST, so they simply join their STP instance with the CST and perceive MSTP regions as “transparent” pseudo-bridges, staying unaware of their internal topology. (Note that the switch with the lowest BID may belong to an RSTP/STP region. In this situation all MSTP regions elect local CIST Regional Roots and consider the CIST Root to be located outside the MSTP “domain”.) Naturally, MSTP detects the appropriate STP version on a boundary link and switches to the respective mode of operation (e.g. RSTP/STP).

The second level of the CIST hierarchy consists of the various regional ISTs. Every MSTP region builds its IST instance using the internal path costs and following the optimal “internal” topology. As you remember, this “internal topology” transparently transports the “external” BPDU information to the border switches, so that they can elect the CIST Regional Root. The first-level topology (CST) is somewhat independent of the second-level topology (IST) and is based on the CIST Root and the cost of the boundary links. The second-level topology (“internal”, IST) may change if a boundary link cost changes or something happens to the CIST Root, since this affects the election of the CIST Regional Root.

MSTP pseudo-bridges do not strictly emulate real bridges. For example, different boundary switches send their own BID in BPDUs, so the pseudo-bridge may appear to have many BIDs on various boundary links. However, this has no impact on the transparent tunneling of CIST Root information across the pseudo-bridge. Other things that make the pseudo-bridge less than fully transparent include the MSTP hop count and the MaxAge timer value, for they may change asynchronously as information travels along different paths inside a region.

MSTIs and CIST

Now, what can be said about the MSTIs, the individual STP instances used inside regions? From what we have learned so far, it is easy to conclude that the only logical solution is to map all MSTP instances to the CIST on the boundary links. This implies that you cannot load-balance VLAN traffic on the boundary links by mapping VLANs to different instances. All VLANs use the same non-blocking uplink that the CIST elected as the optimal path to the CIST Root. But this only applies to the “CST” paths connecting the regional virtual bridges; inside any region, VLANs follow the internal topology paths, based on the respective MSTI configurations.
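
As a rough illustration of this rule, the sketch below picks the instance whose port state governs a VLAN on a given port. The function name and the mapping are assumptions made for this example, not part of any real implementation.

```python
CIST = 0  # instance 0 carries every VLAN that is not explicitly mapped

def governing_instance(vlan, vlan_to_msti, port_is_boundary):
    """Return the instance whose port state applies to this VLAN on this port."""
    if port_is_boundary:
        # On a boundary link every MSTI collapses onto the CIST.
        return CIST
    return vlan_to_msti.get(vlan, CIST)

mapping = {10: 1, 20: 1}  # hypothetical: VLANs 10 and 20 mapped to MSTI 1

print(governing_instance(10, mapping, port_is_boundary=False))  # 1
print(governing_instance(10, mapping, port_is_boundary=True))   # 0
```

Because the lookup ignores the mapping on a boundary port, no choice of VLAN-to-instance mapping can spread traffic over several boundary uplinks.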

It is important to note that MSTIs have no idea of the CIST Root whatsoever; they only use internal paths and the internal MSTI root to build the spanning tree. However, all MSTP instances see the root port (towards the CIST Root) of the CIST Regional Root bridge as the special “Master Port” connecting them to the “outside” world. This port serves as the “gateway” linking MSTIs to other regions. Notice that switches do not send M-records (MSTI information) out of boundary ports, only CIST information. Thus, the CIST and MSTIs may converge independently and in parallel. The master port will only begin forwarding when all respective MSTI ports are in sync and forwarding, to avoid temporary bridging loops.

MSTP and Fault Isolation

Ethernet is known for its broadcast nature, which tends to propagate issues across the whole Layer 2 domain. There are three main problems with Ethernet that affect MSTP designs:

  • Unknown unicast flooding results in traffic surges under topology changes. Every topology change may cause massive invalidation of MAC address tables and flooding of unicast traffic. This is a result of Ethernet’s topology unawareness: bridges do not know where MAC addresses are located.
  • Broadcast and multicast flooding. This is a separate problem, as many core protocols (ARP, IGP, PIM) rely on multicasting or broadcasting. Those packets must be delivered to every node in a broadcast domain, and under intense load the network could be congested at every point.
  • Spanning-tree convergence. MSTP uses the RSTP procedure for STP re-negotiation. Since it is based on distance-vector behavior, it is prone to convergence issues such as counting to infinity (circulation of old information). This is especially noticeable in larger topologies of 10 or more switches and under special conditions, such as failure of the root bridge.

The concept of MSTP region allows for bounding STP re-computations. Since MSTIs in every region are independent, any change affecting MSTI in one region will not affect MSTIs in other regions. This is a direct result of the fact that M-record information is not exchanged between the regions. However, CIST recalculations affect every region and might be slow converging. This is why it is a good idea not to map any VLAN to CIST.

Topology changes in MSTP are treated the same way as in RSTP. That is, only non-edge links going to forwarding state will cause a topology change. A single physical link may be forwarding for one MSTI and blocking for another. Thus, a single physical change may have different effect on MSTIs and the CIST. Topology changes in MSTIs are bounded to a single region, while topology changes to the CIST propagate through all regions. Every region treats the TC notification from another region as “external” and applies them to CIST-associated ports only.

A topology change to CST (the tree connecting the virtual bridges) will affect all MSTIs in all regions and the CIST. This is due to the fact that new link becoming forwarding between the virtual bridges may change all paths in the topology and thus require massive MAC address re-learning. Thus, from the standpoint of topology change, something happening to the CST will have most massive impact of flooding in the set of interconnected MSTP regions.

The above observations suggest a good design rule for MSTP networks: separate “meshy” topologies into their own regions and interconnect the regions using a “sparse” mesh, keeping in mind the balance between redundancy and the effect of topology changes. This is an adaptation of a well-known design principle: separate complexity from complexity to keep networks more stable and isolate fault domains.

Interoperating with PVST+

This task poses a tough issue. We know that PVST+ runs an STP instance for every VLAN. MSTP, on the contrary, maps VLANs to MSTIs, so the one-to-one mapping between VLAN and STP instance no longer holds. How should an MSTP switch operate on a border link connected to the PVST+ domain? As we remember, on the border with an IEEE STP domain, PVST+ simply joins the VLAN 1 STP with IEEE STP and tunnels SSTP BPDUs across the IEEE STP domain (refer to the PVST+ Explained article for more information).

However, MSTP runs multiple MSTIs inside a region and maps them all to the CIST on the border link. That means we need to make sure that the internal MSTIs are aware of changes in the PVST+ trees. It is hard to automatically map VLAN-based STPs to MSTIs, so the simplest way to accomplish the desired behavior is to join all PVST+ trees with the CIST. This way, changes in any of the PVST+ STP instances propagate to the CIST/IST and affect all MSTIs as a result. While not the optimal solution, it ensures that no changes go unnoticed and no black holes occur in a single VLAN due to topology changes.

The MSTP implementation simulates PVST+ by replicating CIST BPDUs on the link facing the PVST+ domain and sending the BPDUs on ALL VLANs active on the trunk. The MSTP switch consumes all BPDUs received from the PVST+ domain and processes them using the CIST/IST instance. The PVST+ side sees the MSTP domain as a special PVST+ domain with all per-VLAN instances claiming the CIST Root as the root of their STP. Note that PVST+ also interprets the whole region as a single pseudo-bridge, but one operating in PVST+ mode. Two options are possible here:

1) MSTP domain (either a single region or multiple regions) contains the root bridge for ALL VLANs. This is only true if CIST Root BID is better than any PVST+ STP root BID. This is the preferred design, for you can manipulate uplink costs on the PVST+ side and obtain optimal traffic engineering results.

2) PVST+ contains the root bridges for ALL VLANs, including VLAN 1, which maps to the CST of STP. This is only true if all PVST+ root bridge BIDs for all VLANs are better than the CIST Root BID. This is not the preferred design, since all MSTIs map to the CIST on the border link, and you cannot load-balance the MSTIs as they enter the PVST+ domain.

The Cisco implementation does not support the second option. The MSTP domain should contain the bridge with the best BID, to ensure that the CIST Root is also the root for all PVST+ trees. In any other case, the MSTP border switch will complain and place the ports that receive superior BPDUs from the PVST+ region in root-inconsistent state. To fix this issue, ensure that the PVST+ domain does not have any bridges with BIDs better than the CIST Root Bridge ID.
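
The rule above boils down to a comparison of bridge IDs. The following sketch models only that comparison, under the simplifying assumption that a boundary port is healthy when no per-VLAN PVST+ BPDU advertises a root better than the CIST Root; the function names and all priority and MAC values are hypothetical, and the real implementation has more cases than this.

```python
def bridge_id(priority, mac):
    """Comparable bridge ID: lower priority wins, then lower MAC address."""
    return (priority, mac.replace(".", "").lower())

def pvst_simulation_check(cist_root, pvst_roots_by_vlan):
    """Return (consistent, offending_vlans) for one boundary port."""
    offending = sorted(
        vlan for vlan, root in pvst_roots_by_vlan.items() if root < cist_root
    )
    return (not offending, offending)

# Hypothetical PVST+ roots; the priority includes the VLAN number (8192 + VLAN).
pvst_roots = {
    1: bridge_id(8193, "0000.0000.0001"),
    10: bridge_id(8202, "0000.0000.0001"),
}

# CIST Root at the default priority: the PVST+ side is superior on both VLANs.
print(pvst_simulation_check(bridge_id(32768, "0000.0000.0002"), pvst_roots))
# (False, [1, 10])

# CIST Root priority lowered to 4096: the MSTP domain wins on every VLAN.
print(pvst_simulation_check(bridge_id(4096, "0000.0000.0002"), pvst_roots))
# (True, [])
```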

Scenario 1: CIST Root and CIST Regional Root

In this scenario, we configure four switches in two regions. The first region consists of one switch (SW1) and the second region consists of three switches: SW2, SW3 and SW4. SW1 is the CIST root thanks to its lowest priority. SW2 is the CIST Regional Root. We also demonstrate some path manipulation options.

SW1:
spanning-tree mode mst
spanning-tree mst 0 priority 4096
!
spanning-tree mst configuration
 name REGION1
 exit

SW2:
spanning-tree mode mst
spanning-tree mst configuration
 name REGION234
 exit
!
interface range FastEthernet 0/13 - 14
 spanning-tree mst 0 cost 100

SW3:
spanning-tree mode mst
spanning-tree mst configuration
 name REGION234
 exit
!
interface FastEthernet 0/19
 spanning-tree mst 0 cost 100

SW4:
spanning-tree mode mst
spanning-tree mst configuration
 name REGION234
 exit
spanning-tree mst 0 priority 16384
!
interface FastEthernet 0/16
 spanning-tree mst 0 cost 100

In the above configuration we adjusted some link costs to ensure that:

1) MSTP elects SW2 as the CIST Regional root due to its shortest path to the CIST Root. SW2 is the CIST Regional Root for REGION234, even though SW4 has better switch priority.
2) SW3 selects the path through SW4 to reach internal root bridge due to the shortest path cost via SW4.

Verifications:

SW2#show spanning-tree mst 0

##### MST0    vlans mapped:   1-4094
Bridge        address 0019.564c.c580  priority      32768 (32768 sysid 0)
Root          address 0019.55e6.d380  priority      4096  (4096 sysid 0)
              port    Fa0/13          path cost     100
Regional Root this switch
Operational   hello time 2 , forward delay 15, max age 20, txholdcount 6
Configured    hello time 2 , forward delay 15, max age 20, max hops    20

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Root FWD 100       128.15   P2p Bound(RSTP)
Fa0/14           Altn BLK 100       128.16   P2p Bound(RSTP)
Fa0/16           Desg FWD 200000    128.18   P2p
Fa0/19           Desg FWD 200000    128.21   P2p

Note the following: SW1 is the Root Bridge (CIST Root) with priority 4096. The root port is Fa 0/13, elected based on regular STP rules (shortest path, lowest upstream port priority). The other uplink is blocking. Both ports show up as “Boundary” since they face the other MSTP domain. Note that SW2 is the Regional Root because it has the shortest path to the CIST Root.

Now look at SW4. This switch has a lower priority value than SW2, but it is not elected as the CIST Regional Root, for SW4’s path cost to the CIST Root is worse:

SW4#show spanning-tree mst 0

##### MST0    vlans mapped:   1-4094
Bridge        address 000b.5f02.1100  priority      16384 (16384 sysid 0)
Root          address 0019.55e6.d380  priority      4096  (4096 sysid 0)
              port    Fa0/16          path cost     100
Regional Root address 0019.564c.c580  priority      32768 (32768 sysid 0)
                                      internal cost 100       rem hops 19
Operational   hello time 2 , forward delay 15, max age 20, txholdcount 6
Configured    hello time 2 , forward delay 15, max age 20, max hops    20

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Altn BLK 200000    128.13   P2p Bound(RSTP)
Fa0/16           Root FWD 100       128.16   P2p
Fa0/19           Desg FWD 200000    128.19   P2p

Note the following in the above output. First, the local switch has a priority value of 16384 and the CIST Root has a priority value of 4096. Next, the path cost to the CIST Root is 100, the same value that SW2 (the CIST Regional Root) has. This is a direct consequence of the fact that the CIST External Root Path Cost never increments along the internal paths. The CIST Regional Root is SW2 with the default priority, and the internal path cost is 100, just as we set it. The boundary port is Fa 0/13 and it is blocking (we are not the CIST Regional Root). The port connected to SW2 is our root port (internal root port). Finally, inspect the show output on SW3:

SW3#show spanning-tree mst 0

##### MST0    vlans mapped:   1-4094
Bridge        address 000d.bd27.de00  priority      32768 (32768 sysid 0)
Root          address 0019.55e6.d380  priority      4096  (4096 sysid 0)
              port    Fa0/19          path cost     100
Regional Root address 0019.564c.c580  priority      32768 (32768 sysid 0)
                                      internal cost 200       rem hops 18
Operational   hello time 2 , forward delay 15, max age 20, txholdcount 6
Configured    hello time 2 , forward delay 15, max age 20, max hops    20

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/16           Altn BLK 200000    128.16   P2p
Fa0/19           Root FWD 100       128.19   P2p

Again, we see that SW1 is the CIST Root with the External Root Path Cost of 100 (it never changes inside the region). SW2 is the CIST Regional Root with the internal path cost of 200 (across SW4). The link to SW4 (Fa 0/19) is the root port of SW3.
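
The cost values in the three outputs can be reproduced with a small shortest-path calculation. This is only a sketch: the link table is inferred from the port costs shown above (it is an assumption about the lab cabling, not something stated explicitly), and the function name is made up for this example.

```python
import heapq

# (upstream, downstream) -> cost of the downstream switch's receiving port.
# Inferred from the show output above; treat it as an assumption.
LINKS = {
    ("SW2", "SW4"): 100,     # SW4 Fa0/16
    ("SW4", "SW3"): 100,     # SW3 Fa0/19
    ("SW2", "SW3"): 200000,  # SW3 Fa0/16
}
EXTERNAL_COST = 100  # set on SW2's boundary port, unchanged inside the region

def internal_costs(links, regional_root):
    """Dijkstra from the CIST Regional Root over intra-region links."""
    best = {regional_root: 0}
    queue = [(0, regional_root)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for (up, down), port_cost in links.items():
            if up != node:
                continue
            new_cost = cost + port_cost
            if new_cost < best.get(down, float("inf")):
                best[down] = new_cost
                heapq.heappush(queue, (new_cost, down))
    return best

costs = internal_costs(LINKS, "SW2")
for name in ("SW2", "SW4", "SW3"):
    print(name, "external", EXTERNAL_COST, "internal", costs[name])
# SW2 external 100 internal 0
# SW4 external 100 internal 100
# SW3 external 100 internal 200
```

The external cost is a constant for the whole region, while the internal cost accumulates hop by hop, which is why SW3 prefers the two-hop path through SW4 (200) over its direct link to SW2 (200000).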

Scenario 2: MSTIs and the Master Port

In this scenario we add another STP instance to REGION234. We also add two VLANs, 10 and 20, to all switches and map them to MSTI 1 in REGION234.

SW1:
vlan 10,20

SW2, SW3 & SW4:
vlan 10,20
!
spanning-tree mst configuration
 instance 1 vlan 10, 20

Let’s look at the show command output on SW2.

SW2#show spanning-tree mst 1

##### MST1    vlans mapped:   10,20
Bridge        address 0019.564c.c580  priority      32769 (32768 sysid 1)
Root          address 000b.5f02.1100  priority      32769 (32768 sysid 1)
              port    Fa0/19          cost          200000    rem hops 19

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Mstr FWD 200000    128.15   P2p Bound(RSTP)
Fa0/14           Altn BLK 200000    128.16   P2p Bound(RSTP)
Fa0/16           Altn BLK 200000    128.18   P2p
Fa0/19           Root FWD 200000    128.21   P2p

There is no Regional Root here, just a regular root, which is the root of the MSTI, elected using the regular STP rules. In our case, the root is SW4 and the root port is Fa 0/19. Next, note the special “Master Port”. This is the uplink port of the CIST Regional Root. All MSTIs map to the CIST here and follow the single path. This port is also forwarding and provides the path upstream to the CIST Root for all MSTIs and their VLANs.

Scenario 3: PVST+ and MSTP interoperation

In this scenario, we configure SW1 to use PVST+ and see how it interworks with MSTP. First, configure SW1 so that it is the root bridge for all VLANs and beats all bridges in the MSTP domain.

SW1:
spanning-tree mode pvst
spanning-tree vlan 1-4094 priority 8192

Let’s see what happens on SW2:

%SPANTREE-2-PVSTSIM_FAIL: Superior PVST BPDU received on VLAN 

SW2#show spanning-tree mst 0

##### MST0    vlans mapped:   1-9,11-19,21-4094
Bridge        address 0019.564c.c580  priority      32768 (32768 sysid 0)
Root          address 0019.55e6.d380  priority      4097  (4096 sysid 1)
              port    Fa0/13          path cost     100
Regional Root this switch
Operational   hello time 2 , forward delay 15, max age 20, txholdcount 6
Configured    hello time 2 , forward delay 15, max age 20, max hops    20

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Root BKN*100       128.15   P2p Bound(PVST) *PVST_Inc 
Fa0/14           Altn BLK 100       128.16   P2p Bound(PVST)
Fa0/16           Desg FWD 200000    128.18   P2p
Fa0/19           Desg FWD 200000    128.21   P2p

First, note the syslog message at the beginning. It says that while emulating PVST+, the MSTP code encountered a situation where the PVST+ domain claims the root role for one or more VLANs. This implies that a PVST+ bridge has a better BID than the current CIST Root. As a result, even though MSTP considers the new root the “legitimate” new CIST Root, it blocks the uplink port as PVST Simulation Inconsistent. This results in an interesting situation: even though the PVST+ domain contains the CIST Root, the MSTP domain cannot reach it. In effect, potential loops are eliminated, but the MSTP domain loses connectivity to the PVST+ domain. The same situation can be observed with MSTI 1:

SW2#show spanning-tree mst 1

##### MST1    vlans mapped:   10,20
Bridge        address 0019.564c.c580  priority      32769 (32768 sysid 1)
Root          address 000b.5f02.1100  priority      32769 (32768 sysid 1)
              port    Fa0/19          cost          200000    rem hops 19

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Mstr BKN*200000    128.15   P2p Bound(PVST) *PVST_Inc 
Fa0/14           Altn BLK 200000    128.16   P2p Bound(PVST)
Fa0/16           Altn BLK 200000    128.18   P2p
Fa0/19           Root FWD 200000    128.21   P2p

Here we see that the “Master Port” is blocked due to the PVST simulation inconsistency. To resolve this issue you need to ensure that the MSTP domain contains the root bridge for all PVST+ trees. This is accomplished by tuning the priority value for the IST to a number lower than any PVST+ bridge priority.

SW1:
spanning-tree mode pvst
spanning-tree vlan 1-4094 priority 8192

SW2:
spanning-tree mst 0 priority 4096

Now SW2 is the new CIST Root. Look at the show command output now:

SW2#show spanning-tree mst 0

##### MST0    vlans mapped:   1-9,11-19,21-4094
Bridge        address 0019.564c.c580  priority      4096  (4096 sysid 0)
Root          this switch for the CIST
Operational   hello time 2 , forward delay 15, max age 20, txholdcount 6
Configured    hello time 2 , forward delay 15, max age 20, max hops    20

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Desg LRN 100       128.15   P2p Bound(PVST)
Fa0/14           Desg BLK 100       128.16   P2p Bound(PVST)
Fa0/16           Desg FWD 200000    128.18   P2p
Fa0/19           Desg FWD 200000    128.21   P2p 

SW2#show spanning-tree mst 1

##### MST1    vlans mapped:   10,20
Bridge        address 0019.564c.c580  priority      32769 (32768 sysid 1)
Root          address 000b.5f02.1100  priority      32769 (32768 sysid 1)
              port    Fa0/19          cost          200000    rem hops 19

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Desg FWD 200000    128.15   P2p Bound(PVST)
Fa0/14           Desg FWD 200000    128.16   P2p Bound(PVST)
Fa0/16           Altn BLK 200000    128.18   P2p
Fa0/19           Root FWD 200000    128.21   P2p

All SW2 ports are now designated. SW2 correctly emulates the PVST+ interactions, and SW1 sees SW2 as the root of all PVST+ instances. SW1 will then block one of its redundant uplinks based on the regular STP rules. In this situation, traffic may flow between the PVST+ and MSTP domains and you can achieve optimal load-balancing using PVST+ cost tuning on SW1.

Aug
07

For the sake of simplicity and to reach a wider audience we decided to post our regular CCIE brainteasers to the blog. The winner will get a coupon worth 10% off the price of any of our training packages for R&S, Security, Voice or Service Provider or a $250 Amazon.com gift card! Note that the 10% discount cannot be combined with any other discount code you may already have. Please post your solution in the comments for this blog entry; the first person to post the correct solution is the winner. Make sure you provide the correct email address in your response so we can contact you if you win. On Tuesday (August 12th) we will post the solution and announce the winner.

For today the task is an easy one or at least appears to be ;-) Imagine a simple topology made of 3 switches:

STP topology

All switches are running STP for VLAN123 with SW3 being the root.  Your task is to configure the network in such a way so that SW1 port fa0/13 is the root port and SW1 port fa0/16 is the alternate port for VLAN 123.  Sound easy?  Here are the requirements:

1) Do not change any STP link cost

2) SW3 must remain the root for VLAN 123

3) The port types must be access

4) Do not use the switchport backup interface command

5) Do not try to use SPAN or RSPAN

6) Do not disable STP

Good luck!

The correct solution is:

1) Configure SW2 to tunnel STP BPDUs between SW1 and SW3. This will make SW1 think that SW3 is directly connected with cost 19. STP is still active on SW2, but SW2 considers itself the root.

SW2:
interface FastEthernet 0/13
 l2protocol-tunnel stp
!
interface FastEthernet 0/16
 l2protocol-tunnel stp

2) Configure SW3 port Fa 0/16 with a lower STP priority than SW3 Fa 0/13. This will make SW1 select its connection to SW2 as the root port and the other uplink as the alternate port: both uplinks have equal costs, so the upstream port priority is the tiebreaker.

SW3:
interface FastEthernet 0/16
 spanning-tree port-priority 64
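
The tiebreak used in step 2 can be sketched as a tuple comparison. This is a simplified model that assumes all BPDUs advertise the same root; the class, the SW3 bridge ID and the port numbers are hypothetical values chosen for illustration.

```python
from typing import NamedTuple

class ReceivedBpdu(NamedTuple):
    local_port: str        # port on SW1 where the BPDU arrived
    root_path_cost: int    # advertised cost plus the local port cost
    sender_bid: tuple      # (priority, MAC) of the sending bridge
    sender_port_id: tuple  # (port priority, port number) of the sending port

def select_root_port(bpdus):
    """Lowest cost, then lowest sender BID, then lowest sender port ID."""
    best = min(bpdus, key=lambda b: (b.root_path_cost, b.sender_bid, b.sender_port_id))
    return best.local_port

SW3_BID = (24699, "000000000003")  # hypothetical root bridge ID for VLAN 123

# With tunneling on SW2, both BPDUs appear to come straight from SW3 at cost 19.
tuned = [
    ReceivedBpdu("Fa0/13", 19, SW3_BID, (64, 16)),   # from SW3 Fa0/16 via SW2
    ReceivedBpdu("Fa0/16", 19, SW3_BID, (128, 13)),  # from SW3 Fa0/13
]
default = [
    ReceivedBpdu("Fa0/13", 19, SW3_BID, (128, 16)),
    ReceivedBpdu("Fa0/16", 19, SW3_BID, (128, 13)),
]

print(select_root_port(tuned))    # Fa0/13
print(select_root_port(default))  # Fa0/16
```

With default port priorities the lower port number on SW3 would win, so SW1 would pick Fa0/16; lowering the priority on SW3 Fa 0/16 flips the result. SW1’s own port priority never enters the comparison, which is why approach 5 in the list below does not work.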

Below is a summary of some of the close but not quite correct approaches people submitted:

1) Change interface bandwidth/speeds. This is not allowed, since the requirement was not to change spanning-tree costs.

2) Use a dot1q tunnel on SW2. This was prohibited by the requirement to set port modes to access.

3) Filter spanning-tree BPDUs coming to SW1 from SW3. This would break the requirement for port Fa 0/16 to be the alternate path to the root. Aside from that, it would result in an STP loop, since this is a circular topology.

4) Disable STP on SW2 explicitly, which is prohibited by the requirements.

5) Incorrectly assume that port-priority on SW1 may influence root port selection.

6) One complicated MSTP solution submitted by two people actually works but was submitted after the above solution was posted. The solution is based on the distinction between the regional root and the CIST root. Not the simplest solution, but it works. The two people who posted this solution also deserve credit for their MSTP knowledge. We’ll do a post on MSTP inter-region operations here on the blog in the next few days.

The winner is: “Roman”
 roman.aprias@[snip].com


Jul
27

Note: You may want to read a newer blog post on MSTP here Understanding MSTP

Before we begin with MSTP (Multiple Spanning Trees Protocol), I would like to note that this tutorial is divided into two parts. The first part describes how MSTP works inside a single region (the definition of the term will follow later). The second part is dedicated to MSTP region interaction with other regions and different STP protocols (IEEE STP, RSTP and Cisco PVST+).

Historical Review

First, a short history tour. In the beginning there was the IEEE STP protocol (there were also the original DEC variant, invented by Radia Perlman, and an IBM STP protocol, but those are fossils now), which was adapted for use with multiple VLANs and 802.1q trunks. A single shared tree, sometimes called Mono Spanning Tree by Cisco or, more often, Common Spanning Tree, is shared by all VLANs. The obvious drawback of this design is that it is impossible to perform VLAN traffic engineering across redundant links: if a link is blocked, it is blocked for all VLANs. To overcome this, Cisco suggested its proprietary PVST/PVST+ solution, running a separate STP instance for each VLAN. This solution permits using a different logical topology for each VLAN, effectively allowing for L2 traffic engineering. However, as the number of VLANs grows, PVST becomes a waste of switch resources and a management burden, for the number of logical topologies is usually much smaller than the number of active VLANs.

As time passed, STP evolved into RSTP and Cisco answered with Rapid-PVST+: the fast STP, but with the same per-VLAN instance concept. The single spanning-tree instance used by IEEE and the per-VLAN STP implemented by Cisco represent two poles in the space of possible solutions. Seeing the limitations of the PVST approach, Cisco came up with the idea of decoupling the STP instance from a VLAN (they were bound together in PVST). The initial implementation was called MISTP (Multiple Instances Spanning Tree) and later evolved into the new IEEE 802.1s standard called MSTP (Multiple Spanning Trees Protocol). As we will see later, this evolution led to some terminology confusion and small feature mismatches between IEEE MSTP and the Cisco MSTP implementation.

Logical and Physical Topologies

Look at the diagrams below:

Physical and Logical Topologies

Cisco’s original proposal was as follows. Instead of running an STP instance for each VLAN, let’s run a number of VLAN-independent STP instances (representing logical topologies) and then map each VLAN to the most appropriate logical topology (instance). Thus, the number of STP instances is kept to a minimum (saving switch resources), but the network capacity is utilized optimally, by using all possible paths for VLAN traffic. The switch forwarding logic for VLAN traffic was changed a little. In order for a frame to be forwarded out of a port, two conditions must be met: first, the VLAN must be active on this port (e.g. not filtered) and second, the STP instance the VLAN maps to must be in a non-discarding state for this port. Obviously, due to multiple logical topologies a single port could be blocking for one instance and forwarding for another (note that in (R)PVST+ a port is either forwarding or discarding for a VLAN).
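
The two forwarding conditions can be written down directly. The sketch below is illustrative only: the data structures, port name and state strings are assumptions for this example, not a model of any particular switch.

```python
def may_forward(vlan, port, allowed_vlans, vlan_to_instance, port_state):
    """A frame leaves a port only if both conditions hold."""
    # Condition 1: the VLAN is active (not filtered) on the port.
    if vlan not in allowed_vlans.get(port, set()):
        return False
    # Condition 2: the instance the VLAN maps to is not discarding on the port.
    instance = vlan_to_instance.get(vlan, 0)  # unmapped VLANs fall into instance 0
    return port_state[(port, instance)] != "discarding"

# Hypothetical switch state: one trunk port, two instances.
allowed = {"Fa0/16": {10, 20, 30}}
mapping = {10: 1, 20: 2}
state = {
    ("Fa0/16", 0): "forwarding",
    ("Fa0/16", 1): "forwarding",
    ("Fa0/16", 2): "discarding",
}

print(may_forward(10, "Fa0/16", allowed, mapping, state))  # True
print(may_forward(20, "Fa0/16", allowed, mapping, state))  # False
print(may_forward(40, "Fa0/16", allowed, mapping, state))  # False
```

The same physical port forwards VLAN 10 and blocks VLAN 20 because the two VLANs belong to different instances; VLAN 40 is dropped simply because it is not allowed on the port.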

Mapping VLANs to Instances

Implementing MSTP

Now that the basic idea is understood, let’s think how it could be implemented. The following questions need to be answered:

  • Topology Calculation. How do we build multiple STP instances (logical topologies) in a single physical topology? Should we run multiple STP instances, each with its own BPDUs? If yes, how would we tell the BPDUs of each instance apart? PVST+ uses VLAN tags for that, but now STP instances are independent of VLANs.
  • Information Distribution. How do we make all switches aware of VLAN to instance mappings? Should we distribute the instance ID along with the VLAN number? If yes, how could we ensure all switches use consistent numbering?
  • Consistency Check. How do we ensure the above mapping is consistent across all switches? That is, how would switch1 know that switch2 maps VLAN2 to the same instance 1 and not to instance 2?

The original Cisco MISTP pre-standard implementation sends separate BPDUs for each instance, which allows for separate STP calculations. Each BPDU contains the instance number and a list of the VLANs mapped to this particular instance on the sending switch, which allows for a consistency check. However, the table mapping VLANs to instance numbers has to be configured on each switch separately; that is, there is no automated mechanism to distribute VLAN to instance mappings. This is what Cisco did originally, but the IEEE 802.1s standard made the mechanics more elegant. Before we continue discussing IEEE’s implementation, let’s define an MSTP region as a collection of switches sharing the same view of how the physical topology is partitioned into a set of logical topologies. For two switches to become members of the same region, the following attributes must match:

  • Configuration name
  • Configuration revision number (16 bits)
  • The table of 4096 elements that maps each VLAN to an STP instance number

The IEEE 802.1s implementation does not send a BPDU for each active STP instance, nor does it encapsulate a VLAN list in each configuration message. Instead, a special STP instance number 0 called the Internal Spanning Tree (IST or MSTI0) is designated to carry all “signaling” information. The BPDUs for IST contain all standard RSTP information for IST itself and also carry additional fields. Among other fields, there are the configuration name, the revision number and a hash value computed over the contents of the VLAN to STP instance mapping table. Using just this compact information, two neighboring switches can easily detect a configuration mismatch.
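To make the region check concrete, here is a minimal Python sketch of comparing the three attributes. The helper names (`mapping_digest`, `region_id`) and the sample mappings are invented for illustration. Note also that the standard defines a keyed digest over the mapping table; plain MD5 is used here purely as a stand-in, so the values will not match what real switches put on the wire.

```python
import hashlib
import struct

def mapping_digest(vlan_to_instance):
    """Hash a 4096-entry VLAN-to-instance table (stand-in for the real keyed digest)."""
    table = [0] * 4096                      # every VLAN maps to instance 0 (IST) by default
    for vlan, instance in vlan_to_instance.items():
        table[vlan] = instance
    packed = struct.pack(">4096H", *table)  # two bytes per table element
    return hashlib.md5(packed).hexdigest()

def region_id(name, revision, vlan_to_instance):
    """The tuple two neighbors would have to agree on to be in the same region."""
    return (name, revision, mapping_digest(vlan_to_instance))

# Hypothetical switches: sw1 and sw2 agree, sw3 maps VLAN 30 to a different instance.
sw1 = region_id("REGION1", 0, {10: 1, 20: 1, 30: 1, 40: 2, 50: 2, 60: 2})
sw2 = region_id("REGION1", 0, {10: 1, 20: 1, 30: 1, 40: 2, 50: 2, 60: 2})
sw3 = region_id("REGION1", 0, {10: 1, 20: 1, 30: 2, 40: 2, 50: 2, 60: 2})

print(sw1 == sw2)   # same name, revision and table: same region
print(sw1 == sw3)   # one VLAN mapped differently: different region
```

The point of the digest is that a single fixed-size value stands in for the whole table, so a one-VLAN difference is detected without ever sending the VLAN list itself.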

What about the other instances, besides the IST? Well, obviously, all VLANs could be mapped to IST, and this is the default configuration. Effectively, this represents the case of classic IEEE RSTP with all VLANs sharing the same spanning tree. Of course, other instances also exist, and they are called MSTIs (Multiple Spanning Tree Instances). Each MSTI may assign different priorities to switches and may have different link costs and port priorities, and thus end up with its own logical topology. Now, if the 802.1s standard implementation does not send separate BPDUs for each MSTI, how does it build separate topologies? The MSTI information is piggybacked onto IST BPDUs in special MRecord fields (one for every active MSTI), each of which carries the root priority, designated bridge priority, port priority and root path cost, among others. Let’s see how this whole thing works.

First of all, since the MSTP convergence mechanism stems from RSTP, there is no BPDU relaying process downstream from the root bridge. Every switch emits configuration BPDUs on its own, once every Hello interval. Every BPDU has full information about IST, and also an MRecord for every MSTI. Using the RSTP convergence mechanics, separate STP instances are built for IST and every MSTI, using the information from the IST BPDU and the MRecords (root/designated bridge priorities, port priority, root path cost etc). Note that STP timers such as Hello, ForwardTime and MaxAge can only be tuned for IST, the instance 0. All other instances (MSTIs) inherit the timers from IST, which is the natural result of all MSTI information being piggybacked in IST BPDUs. Just as a side note, MSTP does not use the MaxAge timer to age out old information, like RSTP/STP do. Instead, IST BPDUs have a special field called MaxHops. The IST root sends BPDUs with the hop count equal to MaxHops, and every downstream switch decrements the hop count field on reception of an IST BPDU. As soon as the hop count becomes zero, the information in the BPDU is ignored, and the switch may start declaring itself the new IST root. The old MaxAge/ForwardDelay timers are still used when MSTP interacts with RSTP, STP or (R)PVST+ bridges.
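The hop-count aging described above can be sketched as follows. This is a simplified model of the description in this post rather than of the standard’s exact state machines, and the function name `remaining_hops_along_path` is made up for the example.

```python
def remaining_hops_along_path(max_hops, path_length):
    """Remaining-hops value held by each switch downstream of the IST root.

    The root originates BPDUs with hop count == max_hops; every switch on the
    path decrements the count on reception. Once it reaches zero the
    information is ignored, so nothing further downstream learns it.
    """
    hops = max_hops
    seen = []
    for _ in range(path_length):
        hops -= 1                 # decrement on reception
        if hops <= 0:
            break                 # information is ignored from this switch onwards
        seen.append(hops)
    return seen

# With the default MaxHops of 20, the first switch below the root holds 19.
print(remaining_hops_along_path(20, 3))   # [19, 18, 17]
# With a tiny MaxHops the information dies out before the end of the path.
print(remaining_hops_along_path(3, 5))    # [2, 1]
```

In effect, MaxHops bounds the diameter of the region as counted from the IST root.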

Caveats arising from VLAN/STP decoupling

Before we jump to configuration examples, let’s consider some issues that may arise from the fact that spanning-tree instances are no longer directly tied to VLANs. The general rule is as follows: “If a VLAN is active on a particular primary link (e.g. this link is non-backup in your logical topology), ensure the STP instance it maps to is forwarding on this link”. Consider the following example:

MSTP Broken Config 1

In this topology, VLANs are manually pruned on trunks. Since the filtering is not consistent with the respective MSTI blocking decisions, VLAN2 traffic is blocked between SW1 and SW2. To avoid this situation, do not use the static “VLAN pruning” method of distributing VLANs across trunks when you have MSTP enabled.

Another classic example, which may arise when you map VLANs to default IST:

MSTP Broken Config 2

Since there is just one STP instance, it blocks one of the ports. Unfortunately, this is the only port that VLAN3 can use. To avoid such situations, use a separate STP instance for each logical topology (e.g. MSTI1 and MSTI2 in this case for VLAN2/VLAN3) and avoid mapping VLANs to IST. Keep IST only for information distribution, and load-balance traffic using MSTIs.
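Both broken configurations come down to the same check: for every VLAN, is there at least one link that both carries the VLAN and is forwarding for the instance the VLAN maps to? Here is a small Python sketch of that check. The function name `stranded_vlans`, the link names and the data are hypothetical and only loosely modeled on the second example above.

```python
def stranded_vlans(links, vlan_to_instance):
    """Return VLANs that have no usable link between a pair of switches.

    links: list of dicts with 'name', 'allowed_vlans' (set of VLANs carried)
           and 'blocked_instances' (set of instance IDs blocking on that link).
    """
    all_vlans = set()
    for link in links:
        all_vlans |= link["allowed_vlans"]
    stranded = []
    for vlan in sorted(all_vlans):
        instance = vlan_to_instance.get(vlan, 0)   # assumption: unmapped VLANs use IST (0)
        usable = [l for l in links
                  if vlan in l["allowed_vlans"] and instance not in l["blocked_instances"]]
        if not usable:
            stranded.append(vlan)
    return stranded

# Hypothetical setup: VLAN2 only on link A, VLAN3 only on link B, all VLANs in IST,
# and IST blocks link B.
links_ist = [
    {"name": "A", "allowed_vlans": {2}, "blocked_instances": set()},
    {"name": "B", "allowed_vlans": {3}, "blocked_instances": {0}},
]
print(stranded_vlans(links_ist, {}))               # [3]

# Same links with VLAN2 -> MSTI1 and VLAN3 -> MSTI2, each instance blocking the
# link its VLAN does not use.
links_msti = [
    {"name": "A", "allowed_vlans": {2}, "blocked_instances": {2}},
    {"name": "B", "allowed_vlans": {3}, "blocked_instances": {1}},
]
print(stranded_vlans(links_msti, {2: 1, 3: 2}))    # []
```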

Configuration Example

Now that we have a basic understanding of how MSTP works inside a region, let’s jump to the configuration stage. Consider the following physical topology already mentioned above:

MSTP Config Sample

The topology has VLANs 1, 10, 20, 30, 40, 50 and 60. We want to achieve the following:

1) VLANs 10,20,30 should follow uplink from SW3 to SW1
2) VLANs 40,50,60 should follow uplink from SW3 to SW2
3) If any of the uplinks fail, the respective VLANs should use the other uplink

To accomplish this, we need to create two MSTIs – let’s give them numbers 1 and 2. SW1 will be the root for instance 1 and SW2 will be the root for instance 2. As for IST (MSTI0), let’s make SW3 the root switch for it (though it’s not recommended to assign root roles to access switches). VLAN 1 will remain mapped to IST, VLANs 10,20,30 to MSTI1, and VLANs 40,50,60 to MSTI2. Here is the configuration (pretty simple, inside a region):

SW1:
spanning-tree mode mst
!
spanning-tree mst configuration
 name REGION1
 instance 1 vlan 10, 20, 30
 instance 2 vlan 40, 50, 60
!
spanning-tree mst 1 priority 8192
!
interface FastEthernet0/13
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface FastEthernet0/16
 switchport trunk encapsulation dot1q
 switchport mode trunk

SW2:
spanning-tree mode mst
!
spanning-tree mst configuration
 name REGION1
 instance 1 vlan 10, 20, 30
 instance 2 vlan 40, 50, 60
!
spanning-tree mst 2 priority 8192
!
interface FastEthernet0/13
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface FastEthernet0/16
 switchport trunk encapsulation dot1q
 switchport mode trunk

SW3:
spanning-tree mode mst
!
spanning-tree mst configuration
 name REGION1
 instance 1 vlan 10, 20, 30
 instance 2 vlan 40, 50, 60
!
spanning-tree mst 0 priority 8192
!
interface FastEthernet0/13
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface FastEthernet0/16
 switchport trunk encapsulation dot1q
 switchport mode trunk

Let’s review the effect of our configuration.

SW1#show spanning-tree mst configuration
Name      [REGION1]
Revision  0     Instances configured 3

Instance  Vlans mapped
--------  ---------------------------------------------------------------------
0         1-9,11-19,21-29,31-39,41-49,51-59,61-4094
1         10,20,30
2         40,50,60
-------------------------------------------------------------------------------
SW1#show spanning-tree mst               

##### MST0    vlans mapped:   1-9,11-19,21-29,31-39,41-49,51-59,61-4094
Bridge        address 0019.5684.3700  priority      32768 (32768 sysid 0)
Root          address 0012.d939.3700  priority      8192  (8192 sysid 0)
              port    Fa0/16          path cost     0
Regional Root address 0012.d939.3700  priority      8192  (8192 sysid 0)
                                      internal cost 200000    rem hops 19
Operational   hello time 2 , forward delay 15, max age 20, txholdcount 6
Configured    hello time 2 , forward delay 15, max age 20, max hops    20

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Desg FWD 200000    128.15   P2p
Fa0/16           Root FWD 200000    128.18   P2p 

##### MST1    vlans mapped:   10,20,30
Bridge        address 0019.5684.3700  priority      8193  (8192 sysid 1)
Root          this switch for MST1

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Desg FWD 200000    128.15   P2p
Fa0/16           Desg FWD 200000    128.18   P2p 

##### MST2    vlans mapped:   40,50,60
Bridge        address 0019.5684.3700  priority      32770 (32768 sysid 2)
Root          address 001e.bdaa.ba80  priority      8194  (8192 sysid 2)
              port    Fa0/13          cost          200000    rem hops 19

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Root FWD 200000    128.15   P2p
Fa0/16           Altn BLK 200000    128.18   P2p 

SW1#show spanning-tree mst interface fastEthernet 0/13

FastEthernet0/13 of MST0 is designated forwarding
Edge port: no             (default)        port guard : none        (default)
Link type: point-to-point (auto)           bpdu filter: disable     (default)
Boundary : internal                        bpdu guard : disable     (default)
Bpdus sent 561, received 544

Instance Role Sts Cost      Prio.Nbr Vlans mapped
-------- ---- --- --------- -------- -------------------------------
0        Desg FWD 200000    128.15   1-9,11-19,21-29,31-39,41-49,51-59
                                     61-4094
1        Desg FWD 200000    128.15   10,20,30
2        Root FWD 200000    128.15   40,50,60

SW1#show spanning-tree mst interface fastEthernet 0/16

FastEthernet0/16 of MST0 is root forwarding
Edge port: no             (default)        port guard : none        (default)
Link type: point-to-point (auto)           bpdu filter: disable     (default)
Boundary : internal                        bpdu guard : disable     (default)
Bpdus sent 550, received 1099

Instance Role Sts Cost      Prio.Nbr Vlans mapped
-------- ---- --- --------- -------- -------------------------------
0        Root FWD 200000    128.18   1-9,11-19,21-29,31-39,41-49,51-59
                                     61-4094
1        Desg FWD 200000    128.18   10,20,30
2        Altn BLK 200000    128.18   40,50,60

Note a few things here. The cost values are much higher than the default STP costs, and MSTIx is called MSTx (e.g. IST is MST0). Aside from that, note the term “Regional Root”, which is explained in detail in Part 2. As for this part, you can see that configuring MSTP inside a region is pretty simple. You just need to exercise some caution when filtering and mapping VLANs, but if you plan the logical topologies in advance this should not cause any problems.
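Two numbers in the output above are easy to reproduce by hand. The bridge priority shown is the configured priority plus the instance number (the “sysid”), and the port cost of 200000 for FastEthernet matches the 32-bit “long” path cost scale, where the cost is 20,000,000 divided by the link speed in Mbps. A quick Python sketch (the function names are made up for illustration):

```python
def bridge_priority(configured_priority, instance_id):
    """Priority as displayed: configured value (a multiple of 4096) plus the instance ID."""
    if configured_priority % 4096 != 0:
        raise ValueError("configured priority must be a multiple of 4096")
    return configured_priority + instance_id

def long_path_cost(speed_mbps):
    """Long (32-bit) path cost for a link of the given speed in Mbps."""
    return 20_000_000 // speed_mbps

# Classic short-scale defaults for comparison (19 is the FastEthernet cost seen in PVST+ output).
SHORT_PATH_COST = {10: 100, 100: 19, 1000: 4, 10000: 2}

print(bridge_priority(8192, 1))    # 8193, SW1 in MST1
print(bridge_priority(32768, 2))   # 32770, SW1 in MST2
print(long_path_cost(100))         # 200000, FastEthernet
print(long_path_cost(1000))        # 20000, GigabitEthernet
```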


Jul
17

Cisco switches run different types of STP protocol, depending on whether the connected port is an access port, an ISL trunk or an 802.1q trunk. Natively, a Cisco switch runs a separate STP instance for each configured and active VLAN (this is called Per-VLAN Spanning Tree or PVST), while standard IEEE compliant switches run just one instance of STP shared by all VLANs. For that reason, a group of switches running the IEEE compatible STP protocol is called an MST (Mono Spanning Tree) region, not to be confused with MSTP (Multiple Spanning Tree Protocol).

Access Ports. Cisco switches run the IEEE version of the STP protocol on access ports. The IEEE STP BPDUs are sent to the IEEE reserved multicast MAC address “0180.C200.0000” using IEEE 802.2 LLC SAP encapsulation with both SSAP and DSAP fields equal to “0x42”. (By the way, for the purpose of Layer 2 filtering, IEEE BPDUs can be matched using a MAC ACL with an LSAP value of “0x4242”.) Note that you can plug any standard IEEE compliant switch into a Cisco switch access port and they will interoperate perfectly, joining the respective access VLAN’s STP instance with the IEEE STP instance (MST).

ISL Trunks. Cisco switches run PVST (Per-VLAN Spanning Tree) on ISL trunk links. The same IEEE STP BPDUs are sent for each VLAN, encapsulated with an additional ISL header, which also carries the VLAN number. The magic part is that the ISL header has a special flag to distinguish frames carrying STP BPDUs, and this is how PVST can re-use the regular IEEE BPDUs to simulate multiple spanning trees. Since PVST BPDUs have the same format as IEEE BPDUs (that is, IEEE 802.2 LLC SAP), they can be matched using the same SSAP/DSAP values of “0x42” for the purpose of Layer 2 filtering. A group of Cisco switches connected using ISL trunks only is called a PVST region.

802.1q Trunks. Cisco switches run PVST+ (Per VLAN Spanning Tree Plus) on 802.1q trunks. This is where things get a bit complicated. The goal of PVST+ is to interoperate with standard IEEE STP (MST) and to allow transparent tunneling of PVST BPDUs across an MST region, so that they reach other Cisco switches on the far side of the IEEE region. From here on, we’ll call a group of Cisco switches connected using 802.1q trunks a PVST+ region. Note that a PVST+ region may connect to a PVST region using an ISL trunk and to an MST region using an 802.1q trunk. The STP instances in PVST and PVST+ regions map directly to each other, so no special interoperability solution is required. However, on the IEEE side only one STP instance exists, as opposed to the multiple STP instances of the PVST+ region. The first question is: if we want to interoperate with the IEEE single tree, which PVST VLAN’s STP instance should be joined with MST? Cisco has chosen VLAN 1 for this purpose. Joined together, the Cisco VLAN 1 STP instance and MST are called the “Common Spanning Tree” or CST (naturally, CST spans PVST, PVST+ and MST regions). As for the detailed PVST+ operations on an 802.1q port, consider the following two cases:

Case 1: Cisco switch connects to an IEEE switch using a 802.1q trunk with default native VLAN (VLAN 1)

IEEE side sends IEEE STP BPDUs to the IEEE multicast MAC address. These BPDUs are consumed and processed by VLAN 1 STP instance on Cisco switch (PVST+ region). The Cisco switch simply recognizes untagged IEEE BPDUs as belonging to VLAN 1 STP instance.

The PVST+ side (Cisco switch) sends IEEE STP BPDUs corresponding to the local VLAN 1 STP instance to the IEEE MAC address as untagged frames across the link. At the same time, special new SSTP (Shared Spanning Tree, a synonym for PVST+) BPDUs are sent to the SSTP multicast MAC address “0100.0ccc.cccd”, also untagged. These SSTP BPDUs are encapsulated using an IEEE 802.2 LLC SNAP header (SSAP=DSAP=“0xAA” and SNAP PID=“0x010B”). These BPDUs carry the same information as the parallel IEEE STP BPDUs for VLAN 1, but have additional fields, notably a special TLV with the source VLAN number. Note that IEEE switches do not interpret the SSTP BPDUs, but simply flood them through the respective VLAN topology. This facilitates BPDU tunneling in case there are other Cisco switches connected to the IEEE cloud.

As for non-native VLANs (VLANs 2-4094), the Cisco switch sends only SSTP BPDUs, tagged with the respective VLAN number and destined to the SSTP MAC address. (Please remember that every SSTP BPDU carries the number of the VLAN it belongs to.) The respective VLAN STP instances are “transparently expanded” across the MST region, treating it as a “virtual hub”. (Note that this may have some traffic engineering implications, since for non-CST VLANs the cost of traversing the MST region equals the cost of the link used to connect to the first MST switch.)

Now the question is: why would a Cisco switch send the same VLAN 1 BPDU twice, destined to both the IEEE and the SSTP multicast MAC addresses? Isn’t the Cisco switch supposed to simply join its VLAN 1 STP instance with the MST? The reason for sending additional SSTP BPDUs on VLAN 1 is purely informational: they are used for consistency checking. The idea is to inform all other Cisco switches potentially attached to the MST cloud about our native VLAN. The receiving switch will only use IEEE BPDUs for VLAN 1 (CST) computations and will ignore SSTP BPDUs sent on VLAN 1.

For the purpose of Layer 2 filtering, remember that you can match SSTP BPDUs using an ethertype value of “0x010B”. This works with multilayer switches even though SSTP BPDUs are SNAP encapsulated, and the actual field is not an “ethertype” but rather a SNAP Protocol ID.
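To tie the addresses and encapsulation values together, here is a rough Python sketch that tells IEEE BPDUs from SSTP BPDUs by looking at the frame header. It assumes an untagged 802.3 frame (any 802.1Q tag already stripped), the `classify_bpdu` name is made up, and the two sample headers are the “enc” bytes from the debug capture later in this post.

```python
IEEE_STP_MAC = bytes.fromhex("0180c2000000")   # IEEE reserved multicast address
SSTP_MAC = bytes.fromhex("01000ccccccd")       # Cisco SSTP multicast address

def classify_bpdu(frame: bytes) -> str:
    """Classify an untagged 802.3 frame as 'ieee', 'sstp' or 'other'."""
    dst = frame[0:6]
    dsap_ssap = frame[14:16]                   # LLC header follows dst(6) + src(6) + length(2)
    if dst == IEEE_STP_MAC and dsap_ssap == b"\x42\x42":
        return "ieee"                          # LLC SAP encapsulation, DSAP = SSAP = 0x42
    # SNAP: DSAP = SSAP = 0xAA, control (1 byte), OUI (3 bytes), then the 2-byte Protocol ID
    if dst == SSTP_MAC and dsap_ssap == b"\xaa\xaa" and frame[20:22] == b"\x01\x0b":
        return "sstp"
    return "other"

ieee_hdr = bytes.fromhex("01 80 C2 00 00 00 CC 02 00 C0 00 01 00 26 42 42 03")
sstp_hdr = bytes.fromhex("01 00 0C CC CC CD CC 06 00 C0 F1 03 00 32 AA AA 03 00 00 0C 01 0B")
print(classify_bpdu(ieee_hdr))
print(classify_bpdu(sstp_hdr))
```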

Case 2: Cisco switch connects to an IEEE switch using an 802.1q trunk with a non-default native VLAN (e.g. VLAN 100).

IEEE switch sends IEEE STP BPDUs to IEEE multicast MAC address and those BPDUs are processed by VLAN 1 (CST) STP instance in the Cisco switch.

The PVST+ side (Cisco switch) sends untagged IEEE STP BPDUs corresponding to the VLAN 1 (CST) STP instance to the IEEE MAC address across the link. This is done to join the local VLAN 1 instance and the IEEE instance into CST. At the same time, VLAN 1 BPDUs are replicated to the SSTP multicast address, tagged with the VLAN 1 number (to inform other Cisco switches that VLAN 1 is non-native on our switch). Finally, BPDUs of the native VLAN instance (VLAN 100 in our case) are sent untagged using the SSTP encapsulation and destination address. Of course, the native VLAN 100 BPDUs, even though they are untagged, carry the VLAN number inside the special SSTP TLV.

As in Case 1, for the remaining non-native VLANs (VLANs 2-4094) the Cisco switch sends SSTP BPDUs only, tagged with the respective VLAN tag and destined to the SSTP MAC address. The other Cisco switches connected to the MST cloud receive the SSTP BPDUs and process them using the respective VLAN STP instances.
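The sending rules from both cases can be summarized in one short Python function. This is a sketch of the behavior as described in this post, not of Cisco’s actual code, and the function name and tuple layout are invented for the example.

```python
def pvst_plus_bpdus(vlans, native_vlan=1):
    """List the BPDUs a PVST+ switch sends on an 802.1q trunk.

    Returns tuples of (vlan, encapsulation, tagged), where encapsulation is
    'ieee' (sent to 0180.C200.0000) or 'sstp' (sent to 0100.0ccc.cccd).
    """
    plan = []
    for vlan in sorted(vlans):
        if vlan == 1:
            plan.append((1, "ieee", False))              # always untagged, joins CST
            plan.append((1, "sstp", native_vlan != 1))   # copy; tagged only if VLAN 1 is non-native
        elif vlan == native_vlan:
            plan.append((vlan, "sstp", False))           # native VLAN: untagged SSTP
        else:
            plan.append((vlan, "sstp", True))            # everything else: tagged SSTP
    return plan

# Case 1: default native VLAN.
for entry in pvst_plus_bpdus({1, 2, 3}):
    print(entry)

# Case 2: native VLAN 100.
for entry in pvst_plus_bpdus({1, 2, 100}, native_vlan=100):
    print(entry)
```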

Consider the diagram below. SW1 and SW2 are Cisco switches running STP instances for VLANs 1, 2 and 3. Router R3 is converted into a bridge (CRB mode and the same bridge-group on the E0/0 and E0/1 interfaces, using the IEEE STP protocol) to simulate an IEEE compatible switch. The VLAN 1 STP instances of SW1 and SW2 are joined with the STP instance running on R3. VLANs 2 and 3 treat the path across R3 as just another segment linking SW1 and SW2, and their SSTP information is passed transparently across R3.

pvst-plus-topology

The current situation in the topology is as follows:

Rack1SW2#show spanning-tree vlan 1

 VLAN1 is executing the ieee compatible Spanning Tree protocol
  Bridge Identifier has priority 32768, address cc07.00c1.0000
  Configured hello time 2, max age 20, forward delay 15
  Current root has priority 32768, address cc02.00c0.0000
  Root port is 44 (FastEthernet1/3), cost of root path is 19
  Topology change flag set, detected flag not set
  Number of topology changes 12 last change occurred 00:00:39 ago
  Times:  hold 1, topology change 35, notification 2
          hello 2, max age 20, forward delay 15
  Timers: hello 0, topology change 0, notification 0, aging 300

 Port 44 (FastEthernet1/3) of VLAN1 is forwarding
   Port path cost 19, Port priority 128, Port Identifier 128.44.
   Designated root has priority 32768, address cc02.00c0.0000
   Designated bridge has priority 32768, address cc02.00c0.0000
   Designated port id is 128.5, designated path cost 0
   Timers: message age 2, forward delay 0, hold 0
   Number of transitions to forwarding state: 2
   BPDU: sent 7650, received 2445

 Port 56 (FastEthernet1/15) of VLAN1 is blocking
   Port path cost 19, Port priority 128, Port Identifier 128.56.
   Designated root has priority 32768, address cc02.00c0.0000
   Designated bridge has priority 32768, address cc06.00c0.0000
   Designated port id is 128.56, designated path cost 19
   Timers: message age 3, forward delay 0, hold 0
   Number of transitions to forwarding state: 2
   BPDU: sent 7, received 3827

Rack1SW2#show spanning-tree vlan 2

 VLAN2 is executing the ieee compatible Spanning Tree protocol
  Bridge Identifier has priority 8192, address cc07.00c1.0006
  Configured hello time 2, max age 20, forward delay 15
  Current root has priority 8192, address cc06.00c0.0006
  Root port is 44 (FastEthernet1/3), cost of root path is 19
  Topology change flag set, detected flag not set
  Number of topology changes 7 last change occurred 00:00:14 ago
          from FastEthernet1/15
  Times:  hold 1, topology change 35, notification 2
          hello 2, max age 20, forward delay 15
  Timers: hello 0, topology change 0, notification 0, aging 300

 Port 44 (FastEthernet1/3) of VLAN2 is forwarding
   Port path cost 19, Port priority 128, Port Identifier 128.44.
   Designated root has priority 8192, address cc06.00c0.0006
   Designated bridge has priority 8192, address cc06.00c0.0006
   Designated port id is 128.44, designated path cost 0
   Timers: message age 2, forward delay 0, hold 0
   Number of transitions to forwarding state: 2
   BPDU: sent 5889, received 372

 Port 56 (FastEthernet1/15) of VLAN2 is blocking
   Port path cost 19, Port priority 128, Port Identifier 128.56.
   Designated root has priority 8192, address cc06.00c0.0006
   Designated bridge has priority 8192, address cc06.00c0.0006
   Designated port id is 128.56, designated path cost 0
   Timers: message age 2, forward delay 0, hold 0
   Number of transitions to forwarding state: 1
   BPDU: sent 2, received 3821

Rack1SW2#show spanning-tree vlan 3

 VLAN3 is executing the ieee compatible Spanning Tree protocol
  Bridge Identifier has priority 32768, address cc07.00c1.0001
  Configured hello time 2, max age 20, forward delay 15
  Current root has priority 8192, address cc06.00c0.0007
  Root port is 44 (FastEthernet1/3), cost of root path is 19
  Topology change flag set, detected flag not set
  Number of topology changes 4 last change occurred 00:00:16 ago
          from FastEthernet1/15
  Times:  hold 1, topology change 35, notification 2
          hello 2, max age 20, forward delay 15
  Timers: hello 0, topology change 0, notification 0, aging 300

 Port 44 (FastEthernet1/3) of VLAN3 is forwarding
   Port path cost 19, Port priority 128, Port Identifier 128.44.
   Designated root has priority 8192, address cc06.00c0.0007
   Designated bridge has priority 8192, address cc06.00c0.0007
   Designated port id is 128.44, designated path cost 0
   Timers: message age 1, forward delay 0, hold 0
   Number of transitions to forwarding state: 2
   BPDU: sent 3689, received 332

 Port 56 (FastEthernet1/15) of VLAN3 is blocking
   Port path cost 19, Port priority 128, Port Identifier 128.56.
   Designated root has priority 8192, address cc06.00c0.0007
   Designated bridge has priority 8192, address cc06.00c0.0007
   Designated port id is 128.56, designated path cost 0
   Timers: message age 1, forward delay 0, hold 0
   Number of transitions to forwarding state: 2
   BPDU: sent 3, received 3832

R3 (the IEEE bridge) is the root for VLAN 1 (CST) and SW1 is the root for VLANs 2 and 3. Note that R3 has no idea about VLANs 2 and 3 and transparently floods SSTP BPDUs which could be seen on SW2 (non-root switch). As it should be, VLAN 1 BPDUs are received as IEEE STP BPDUs and SSTP copies are ignored.

Rack1SW2#debug spanning-tree bpdu
Spanning Tree BPDU debugging is on
Rack1SW2#
STP: VLAN1: config protocol = ieee, packet from FastEthernet1/3  , linktype IEEE_SPANNING , enctype 2, encsize 17
STP: enc 01 80 C2 00 00 00 CC 02 00 C0 00 01 00 26 42 42 03
STP: Data     00000000008000CC0200C00000000000008000CC0200C0000080050000140002000F00
STP: VLAN1 Fa1/3:0000 00 00 00 8000CC0200C00000 00000000 8000CC0200C00000 8005 0000 1400 0200 0F00

STP: VLAN1: config protocol = ieee, packet from FastEthernet1/15  , linktype IEEE_SPANNING , enctype 2, encsize 17
STP: enc 01 80 C2 00 00 00 CC 06 00 C0 F1 0F 00 26 42 42 03
STP: Data     00000000008000CC0200C00000000000138000CC0600C0000080380100140002000F00
STP: VLAN1 Fa1/15:0000 00 00 00 8000CC0200C00000 00000013 8000CC0600C00000 8038 0100 1400 0200 0F00

STP: VLAN3: config protocol = ieee, packet from FastEthernet1/3  , linktype SSTP , enctype 3, encsize 22
STP: enc 01 00 0C CC CC CD CC 06 00 C0 F1 03 00 32 AA AA 03 00 00 0C 01 0B
STP: Data     00000000002000CC0600C00007000000002000CC0600C00007802C0000140002000F00
STP: VLAN3 Fa1/3:0000 00 00 00 2000CC0600C00007 00000000 2000CC0600C00007 802C 0000 1400 0200 0F00

STP: VLAN3: config protocol = ieee, packet from FastEthernet1/15  , linktype SSTP , enctype 3, encsize 22
STP: enc 01 00 0C CC CC CD CC 06 00 C0 F1 0F 00 32 AA AA 03 00 00 0C 01 0B
STP: Data     00000000002000CC0600C00007000000002000CC0600C0000780380000140002000F00
STP: VLAN3 Fa1/15:0000 00 00 00 2000CC0600C00007 00000000 2000CC0600C00007 8038 0000 1400 0200 0F00

STP: VLAN2: config protocol = ieee, packet from FastEthernet1/3  , linktype SSTP , enctype 3, encsize 22
STP: enc 01 00 0C CC CC CD CC 06 00 C0 F1 03 00 32 AA AA 03 00 00 0C 01 0B
STP: Data     00000000002000CC0600C00006000000002000CC0600C00006802C0000140002000F00
STP: VLAN2 Fa1/3:0000 00 00 00 2000CC0600C00006 00000000 2000CC0600C00006 802C 0000 1400 0200 0F00
STP: VLAN2: config protocol = ieee, packet from FastEthernet1/15  , linktype SSTP , enctype 3, encsize 22
STP: enc 01 00 0C CC CC CD CC 06 00 C0 F1 0F 00 32 AA AA 03 00 00 0C 01 0B
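If you want to read the “Data” lines in the debug output without counting hex digits, the 35-byte configuration BPDU can be unpacked with a few lines of Python. This is a minimal sketch: the function name and the dictionary keys are my own, and the port ID is split into an 8-bit priority and 8-bit port number, which matches the “128.56” style shown in the outputs above.

```python
import struct

def parse_config_bpdu(hex_data):
    """Decode a 35-byte STP configuration BPDU given as a hex string."""
    raw = bytes.fromhex(hex_data)
    (proto, version, bpdu_type, flags, root_id, root_cost,
     bridge_id, port_id, msg_age, max_age, hello, fwd_delay) = struct.unpack(
        ">HBBB8sI8sHHHHH", raw[:35])

    def bridge(b):
        # First two bytes are the priority, the remaining six the MAC address.
        return {"priority": int.from_bytes(b[:2], "big"), "mac": b[2:].hex()}

    return {
        "root": bridge(root_id),
        "root_path_cost": root_cost,
        "bridge": bridge(bridge_id),
        "port_id": (port_id >> 8, port_id & 0xFF),   # (priority, port number)
        # Timer fields are encoded in units of 1/256 of a second.
        "message_age": msg_age / 256,
        "max_age": max_age / 256,
        "hello": hello / 256,
        "forward_delay": fwd_delay / 256,
    }

# The VLAN1 BPDU received on Fa1/15 in the debug output above.
bpdu = parse_config_bpdu(
    "00000000008000CC0200C00000000000138000CC0600C0000080380100140002000F00")
print(bpdu["root"], bpdu["root_path_cost"], bpdu["bridge"], bpdu["port_id"])
print(bpdu["message_age"], bpdu["max_age"], bpdu["hello"], bpdu["forward_delay"])
```

The decoded values line up with the show output: root cc02.00c0.0000 with priority 32768, root path cost 19, sender port 128.56 and the usual 20/2/15 timers.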

Now let’s see how consistency checking is performed by PVST+.

Case 1: Change the native VLAN on SW1 connection to R3:

SW1:
interface FastEthernet 1/3
 switchport trunk native vlan 2

Rack1SW2#
%SPANTREE-2-RECV_PVID_ERR: Received BPDU with inconsistent peer vlan id 2 on FastEthernet1/3 VLAN1.

%SPANTREE-2-BLOCK_PVID_PEER: Blocking FastEthernet1/3 on VLAN2. Inconsistent peer vlan.
PVST+: restarted the forward delay timer for FastEthernet1/3

%SPANTREE-2-BLOCK_PVID_LOCAL: Blocking FastEthernet1/3 on VLAN1. Inconsistent local vlan.
PVST+: restarted the forward delay timer for FastEthernet1/3

Note that SW2 detects an untagged BPDU carrying VLAN ID 2, which does not correspond to the locally configured default native VLAN 1. The corresponding port is put in the “inconsistent” state. The reason SW2 detects this condition (and not SW1) is that SW1 is sending SSTP BPDUs and SW2 is not (it receives superior BPDUs). As soon as the native VLAN is changed back to 1 on SW1, consistency is restored:

%SPANTREE-2-UNBLOCK_CONSIST_PORT: Unblocking FastEthernet1/3 on VLAN2. Port consistency restored.

%SPANTREE-2-UNBLOCK_CONSIST_PORT: Unblocking FastEthernet1/3 on VLAN1. Port consistency restored.
PVST+:Inconsistency timer expired. inconsistency 0 cleared for FastEthernet1/3
PVST+:Inconsistency timer expired. inconsistency 0 cleared for FastEthernet1/3

It is also possible to restore consistency by making VLAN 2 native on SW2 connection as well.

Case 2: Convert SW2 connection to R3 into an access-port in VLAN2:

SW2:
interface FastEthernet 1/3
 switchport access vlan 2
 switchport mode access

Rack1SW2#
%SPANTREE-7-RECV_1Q_NON_TRUNK: Received 802.1Q BPDU on non trunk FastEthernet1/3 VLAN2.
%SPANTREE-7-BLOCK_PORT_TYPE: Blocking FastEthernet1/3 on VLAN2. Inconsistent port type.
PVST+: restarted the forward delay timer for FastEthernet1/3

Now the access port receives untagged SSTP BPDUs from SW1 with the VLAN ID TLV equal to 1. Since this does not match the locally configured VLAN, the port is declared “type-inconsistent”: the other side is a trunk and the local side is an access port.
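Both consistency checks can be expressed as a small decision function. The Python sketch below is a simplified reading of the two cases above, not Cisco’s actual implementation; the function name, parameters and return strings are all invented for illustration.

```python
def check_sstp_bpdu(port_mode, local_vlan, bpdu_vlan_tlv, bpdu_tagged):
    """Simplified PVST+ consistency check for a received SSTP BPDU.

    port_mode:     'trunk' or 'access'
    local_vlan:    native VLAN (trunk) or access VLAN (access port)
    bpdu_vlan_tlv: VLAN number carried in the SSTP TLV
    bpdu_tagged:   whether the BPDU arrived with an 802.1q tag
    """
    if port_mode == "access":
        # Case 2: an access port hears SSTP BPDUs whose TLV does not match its VLAN.
        return "ok" if bpdu_vlan_tlv == local_vlan else "type-inconsistent"
    if not bpdu_tagged and bpdu_vlan_tlv != local_vlan:
        # Case 1: untagged BPDU, so it was sent on the peer's native VLAN, which differs from ours.
        return "pvid-inconsistent"
    return "ok"

print(check_sstp_bpdu("trunk", 1, 2, False))    # native VLAN mismatch (Case 1)
print(check_sstp_bpdu("access", 2, 1, False))   # access port facing a trunk (Case 2)
print(check_sstp_bpdu("access", 1, 1, False))   # access port in VLAN 1, peer native VLAN 1
```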

Note that it is still possible to configure the access port in VLAN1 if the trunks on the other switches all use the default native VLAN. This way, only CST is extended through the MST region and joined with the VLAN1 STP instance in SW2. To summarize, if you have a group of Cisco switches connected to the same MST cloud (other vendors’ switches), observe the following rules:

1) Use the same interface types – i.e. either all switches are connected using access ports or using trunk ports.

2) Ensure Native VLAN (PVID) is the same on all trunks used to connect to an MST region.

Now let’s see the implications that arise from the fact that non-CST VLANs see the MST region as simply a link or a “hub”. Based on that fact, it is possible to adjust priority values on switches that are not physically directly connected and thereby influence root port election. For example, set the port priority of the direct connection between SW1 and SW2 (port FastEthernet 1/15) to 64 (the default is 128):

SW1:
interface FastEthernet 1/15
 spanning-tree port-priority 64

Rack1SW2#show spanning-tree vlan 2

 VLAN2 is executing the ieee compatible Spanning Tree protocol
  Bridge Identifier has priority 8192, address cc07.00c1.0006
  Configured hello time 2, max age 20, forward delay 15
  Current root has priority 8192, address cc06.00c0.0006
  Root port is 56 (FastEthernet1/15), cost of root path is 19
  Topology change flag not set, detected flag not set
  Number of topology changes 15 last change occurred 00:07:22 ago
          from FastEthernet1/3
  Times:  hold 1, topology change 35, notification 2
          hello 2, max age 20, forward delay 15
  Timers: hello 0, topology change 0, notification 0, aging 300

 Port 44 (FastEthernet1/3) of VLAN2 is blocking
   Port path cost 19, Port priority 128, Port Identifier 128.44.
   Designated root has priority 8192, address cc06.00c0.0006
   Designated bridge has priority 8192, address cc06.00c0.0006
   Designated port id is 128.44, designated path cost 0
   Timers: message age 1, forward delay 0, hold 0
   Number of transitions to forwarding state: 0
   BPDU: sent 0, received 8

 Port 56 (FastEthernet1/15) of VLAN2 is forwarding
   Port path cost 19, Port priority 128, Port Identifier 128.56.
   Designated root has priority 8192, address cc06.00c0.0006
   Designated bridge has priority 8192, address cc06.00c0.0006
   Designated port id is 64.56, designated path cost 0
   Timers: message age 1, forward delay 0, hold 0
   Number of transitions to forwarding state: 3
   BPDU: sent 9, received 4683

Note that in the above output, on both ports (root and alternate) we see the same designated bridge (SW1), even though SW2 directly connects to R3 using port FastEthernet 1/3. Now let’s make port Fa 1/3 of SW2 the root port by adjusting the priority of the port connecting SW1 to R3:

SW1:
interface FastEthernet 1/3
 spanning-tree port-priority 32

Rack1SW2#show spanning-tree vlan 3

 VLAN3 is executing the ieee compatible Spanning Tree protocol
  Bridge Identifier has priority 32768, address cc07.00c1.0001
  Configured hello time 2, max age 20, forward delay 15
  Current root has priority 8192, address cc06.00c0.0007
  Root port is 44 (FastEthernet1/3), cost of root path is 19
  Topology change flag set, detected flag not set
  Number of topology changes 6 last change occurred 00:00:31 ago
          from FastEthernet1/15
  Times:  hold 1, topology change 35, notification 2
          hello 2, max age 20, forward delay 15
  Timers: hello 0, topology change 0, notification 0, aging 300

 Port 44 (FastEthernet1/3) of VLAN3 is forwarding
   Port path cost 19, Port priority 128, Port Identifier 128.44.
   Designated root has priority 8192, address cc06.00c0.0007
   Designated bridge has priority 8192, address cc06.00c0.0007
   Designated port id is 32.44, designated path cost 0
   Timers: message age 1, forward delay 0, hold 0
   Number of transitions to forwarding state: 1
   BPDU: sent 1, received 170

 Port 56 (FastEthernet1/15) of VLAN3 is blocking
   Port path cost 19, Port priority 128, Port Identifier 128.56.
   Designated root has priority 8192, address cc06.00c0.0007
   Designated bridge has priority 8192, address cc06.00c0.0007
   Designated port id is 64.56, designated path cost 0
   Timers: message age 1, forward delay 0, hold 0
   Number of transitions to forwarding state: 3
   BPDU: sent 4, received 4850

To recap in a few words: PVST+ is used on 802.1q trunks to tunnel PVST instances across an MST cloud and to build a CST consisting of PVST VLAN 1 and IEEE MST. PVST+ BPDUs contain a special TLV with the source VLAN ID, which allows interconnected switches to detect inconsistencies or misconfigurations.


Jul
05

UDLD (Unidirectional Link Detection) is a Cisco proprietary extension for detecting a misconfigured link. The idea behind it is pretty straightforward: allow two switches to verify that they can both send and receive data on a point-to-point connection. Consider a network with two switches, A and B, connected by two links: “A=B”. Naturally, if “A” is the root of the spanning tree, one of the ports on “B” will be blocking, constantly receiving BPDUs from “A”. If this link turned unidirectional and “B” started missing those BPDUs, the port would eventually unblock, forming a loop between “A” and “B”. Note that the problem with unidirectional links usually occurs on fiber-optic connections and is not common on UTP (copper) connections, where link pulses are used to monitor the connection integrity.

The confusion about UDLD stems from Cisco’s rather unclear description of how the feature operates, be it on the CatOS or IOS platform. So here is a short overview of how UDLD works.

1) Both UDLD peers (switches) discover each other by exchanging special frames sent to the well-known MAC address 01:00:0C:CC:CC:CC. (Naturally, those frames are only understood by Cisco switches.) Each switch sends its own device ID along with the originator port ID and timeout value to its peer. Additionally, a switch echoes back the ID of its neighbor (if the switch does see the neighbor). Starting with certain versions of CatOS and IOS, you can change UDLD timers globally.

2) If no echo frame with our ID has been seen from the peer for a certain amount of time, the port is suspected to be unidirectional. What happens next depends on the UDLD mode of operation.

3) In “Normal” mode, if the physical state of the port (as reported by Layer 1) is still up, UDLD marks this port as “Undetermined” but does NOT shut down or disable the port, which continues to operate under its current STP status. This mode of operation is informational and potentially less disruptive (though it does not prevent STP loops). You can still review the “undetermined” ports using CLI show commands when troubleshooting STP issues.

4) If UDLD is set to “Aggressive” mode, once the switch loses its neighbor it actively tries to re-establish the relationship by sending eight UDLD frames, one per second (surprisingly, this coincides with the TCP keepalive retry values used by FCIP on Cisco MDS storage switches :). If the neighbor does not respond after that, the port is considered unidirectional and is brought to the “Errdisable” state. (Note that you can configure “errdisable recovery” to make the switch automatically recover from such issues.)

5) UDLD “Aggressive” only brings a link to the errdisable state when it detects a “Bidirectional” to “Unidirectional” state transition. For a link to become “Bidirectional”, the UDLD process must first hear an echo packet with its own ID from the peer on the other side. This prevents the link from becoming errdisabled when you configure “Aggressive” mode on just one side. The UDLD state of such a link will be “Unknown”.

6) UDLD “Aggressive” interoperates with UDLD “Normal” on the other side of a link. With this type of configuration, only one side of the link will be errdisabled once a “Unidirectional” condition has been detected.
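
The modes described above map onto a handful of IOS commands. The fragment below is a minimal sketch, assuming a Catalyst switch running IOS; the interface number and the 300-second recovery interval are illustrative values rather than a tested setup, and exact syntax may vary by platform and software release.

SW1:
! Enable UDLD in aggressive mode on one specific port
interface GigabitEthernet 0/1
 udld port aggressive
!
! Optionally let the switch re-enable ports that UDLD has errdisabled
errdisable recovery cause udld
errdisable recovery interval 300

You can then review the detected link state with show udld GigabitEthernet 0/1, and bring UDLD-disabled ports back manually with udld reset.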

To complete this overview, remember that UDLD is designed to be a helper for STP. Therefore, UDLD should be able to detect a unidirectional link before STP unblocks the port due to missed BPDUs. Thus, when you configure UDLD timers, make sure your values are set so that a unidirectional link is detected before “STP MaxAge + 2xForwardDelay” expires. Additionally, notice that the UDLD function is similar to the STP Loopguard and Bridge Assurance features found in newer switches. The benefit of UDLD is that it operates at the physical port level, whereas STP may not be able to detect a malfunctioning link bundled in an Etherchannel. This is why you normally use all of these features together: they don’t replace but truly complement each other.
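
As a rough worked example of the timer rule above, assume default STP timers (MaxAge 20 seconds, ForwardDelay 15 seconds). That gives 20 + 2 x 15 = 50 seconds before a blocked port that stops receiving BPDUs starts forwarding. Assuming UDLD needs roughly three message intervals to age out a neighbor, plus the eight one-second retries in aggressive mode, a message interval of 7 seconds yields about 3 x 7 + 8 = 29 seconds, comfortably below 50. Both the three-interval hold time and the value 7 are assumptions for illustration; check the supported range and defaults on your platform.

SW1:
! Global UDLD message interval in seconds (illustrative value)
udld message time 7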


Feb
05

Within the scope of Metro Ethernet services, it is often beneficial to provide customers with a “point-to-point” VLAN service, where a VLAN (a multipoint service in essence) is set up to emulate an Ethernet “pseudowire” by disabling MAC address learning. The benefit comes from saving CAM table space on the metro switches, thus improving overall scalability (which is far from perfect with Ethernet). A special command, mac address-table learning, is available on Cisco Metro switches (e.g. ME 3400) and allows you to disable MAC address learning for a specific VLAN. However, many commonly used switches do not implement this feature. Still, there is a way to disable MAC address learning on a group of ports by using the RSPAN VLAN feature. By its functional design, an RSPAN VLAN does not learn MAC addresses. We are not allowed to assign this type of VLAN directly to switch access ports, but we can overcome this by configuring the switchports as trunks with a single allowed VLAN (the RSPAN VLAN), which is also configured as native:

vtp mode transparent
!
vlan 555
 remote-span
!
interface range Fa 0/1 - 3
 switchport trunk encapsulation dot1q
 switchport mode trunk
 switchport trunk allowed vlan 555
 switchport trunk native vlan 555

This configuration is applicable to any switch that supports RSPAN functionality. Specifically, it was verified on the Catalyst 3550 series.
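
For comparison, on platforms that do support the mac address-table learning command mentioned above, the same effect could presumably be achieved directly. The line below is a minimal sketch, assuming VLAN 555 is used as an ordinary (non-RSPAN) VLAN and that your software release supports the command:

no mac address-table learning vlan 555

To check the RSPAN workaround, send traffic through the ports and confirm that show mac address-table vlan 555 (show mac-address-table on older releases) lists no dynamically learned entries.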
