Mar
14

We are going to discuss Cisco’s proprietary extensions to the STP algorithm, namely UplinkFast and BackboneFast. These two features aim to reduce the time it takes STP to restore an active topology after a link failure. While UplinkFast is fairly intuitive, BackboneFast is more complicated.

See more detailed overview at: http://blog.ine.com/wp-content/uploads/2010/04/understanding-stp-rstp-convergence.pdf

UplinkFast

[Figure 1: stp-convergence212]

Look at Figure 1 above. It shows a sample network topology (pretty poorly designed, by the way) with backbone (core) and access layers. Everything is Layer 2 in this topology, and the STP subset of the topology is highlighted in green. The UplinkFast extension was designed for use on access-layer switches, such as AC-SW1 and AC-SW2. Commonly, those switches have redundant uplinks to the core/distribution layer. If the primary uplink fails, it takes 2xForward_Time for the backup uplink to come up, and that is only if the failure is detected immediately, without waiting for the BPDU stored for the primary uplink to age out. However, it makes sense to bring the secondary uplink up as soon as the primary’s failure has been detected, as there are no other uplinks left. Even when the access switch has more than two uplinks, we can still transition the next available uplink to the forwarding state, since the access switch is not supposed to become a transit path for any traffic. If the access switch could become a transit point, we cannot use the same trick, as this might introduce loops into the active topology. Therefore, we can quickly transition the backup link into the forwarding state if:

1) The switch has only two uplinks.
2) The switch has more than two uplinks, but the STP parameters are set in such a way that the switch would never become a transit node.

The second condition can be fulfilled by setting the STP priority of the switch to a value that makes it almost impossible for it to become the root bridge (a numerically large value) and by increasing the STP cost of all its ports, making all transit paths via this switch less preferred. This is what Cisco calls the “UplinkFast” feature. When you enable it on a switch, the failure of the root port will transition the secondary uplink (alternate port) into the forwarding state almost immediately.

The last component of the UplinkFast feature is quick MAC address table re-learning. Since the primary uplink has failed, the core switches might lose the MAC addresses associated with the access switch. This results in connectivity disruption until the addresses are re-learned via the secondary path. Thus, in addition to bringing the secondary uplink up quickly, the switch also floods it with dummy multicast frames, sourced from all the MAC addresses known to the switch. Because the destination is multicast, the frames reach the upstream switches, which can then quickly re-learn the MAC addresses via the new path. Of course, the penalty is excessive flooding of the network with multicast frames.
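
The two UplinkFast actions described above (promote the best alternate port straight to forwarding, then announce the known MAC addresses over it) can be sketched in a few lines of Python. This is only an illustration of the logic, not Cisco's implementation: the `Uplink` class, the `uplinkfast_failover` function and the `DUMMY_MCAST_DST` placeholder are hypothetical names, and the real destination multicast address is not given in this post.

```python
from dataclasses import dataclass

# Placeholder only: the actual multicast destination used by the feature
# is not stated in this post, so no real address is assumed here.
DUMMY_MCAST_DST = "dummy-multicast"


@dataclass
class Uplink:
    name: str
    root_path_cost: int  # cost to the root, taken from the BPDU stored on the port
    state: str           # "forwarding" (root port) or "blocking" (alternate port)


def uplinkfast_failover(uplinks, failed_port, known_macs):
    """Return (new_root_port, dummy_frames) after the root port has failed."""
    candidates = [u for u in uplinks
                  if u.name != failed_port and u.state == "blocking"]
    if not candidates:
        return None, []
    # Pick the alternate port with the best (lowest) stored path cost.
    best = min(candidates, key=lambda u: u.root_path_cost)
    # UplinkFast skips the Listening and Learning states.
    best.state = "forwarding"
    # One dummy frame per known MAC address: (source, destination, egress port).
    frames = [(mac, DUMMY_MCAST_DST, best.name) for mac in sorted(known_macs)]
    return best, frames
```

For a switch with one forwarding and one blocking uplink, calling `uplinkfast_failover(uplinks, "Fa0/1", macs)` returns the former alternate port, already in the forwarding state, together with one dummy frame per known MAC address.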

BackboneFast

In the previous post, we described how STP handles inferior BPDUs. In short, the classic algorithm simply ignores the potentially useful information conveyed by inferior BPDUs. Look at Figure 2 below.

[Figure 2: stp-convergence22]

What if the link between BB-SW2 and BB-SW3 fails? First, since BB-SW2 is the root, the failure of its designated port will only cause a topology change event. However, things are more complicated for BB-SW3, since it loses its root port. BB-SW3 will invalidate the currently known root bridge information and look for an alternative. Since the port on BB-SW4 connected to BB-SW3 is blocking, there is no new information. Thus BB-SW3 declares itself the root of the STP and starts sending inferior BPDUs to BB-SW4. If the latter were a classic STP switch, it would ignore the inferior information until the BPDU stored with the blocked port expires (around Max_Age seconds). However, with BackboneFast enabled, the switch that receives inferior information will attempt to verify whether this failure affected its own connection to the root (e.g. whether the current root bridge is actually dead, or whether we lost the connection to it just like our neighbor).

1) The switch selects the root port and all alternate ports (all upstream ports) as the candidate paths to the current root. In the case of BB-SW4, there is just one root link to BB-SW2 and an alternate path via BB-SW3. The switch then sends special RLQ (Root Link Query) BPDUs out of the selected ports, in our case to BB-SW2 and BB-SW3. Among other fields, these BPDUs contain the following:

a) The Bridge ID of the querying switch (local switch BID)
b) The Bridge ID of our current Root Bridge (or what the querying switch thinks is the current root bridge).

2) Every switch that receives the RLQ, checks the Root Bridge ID in the query.
2.1) If this is the same BID that the receiving switch considers to be the root, it relays the RLQ upstream, across its root port.
2.2) If the switch receiving the RLQ is the root bridge itself, it floods a positive RLQ response out of ALL its designated (downstream) ports. In our situation, this is the case, and BB-SW2 immediately responds to BB-SW4.
2.3) If the switch receiving the RLQ considers a different bridge to be the root of the topology, it immediately responds with a negative RLQ, flooded out of all designated (downstream) ports.

3) RLQ responses are flooded by every switch downstream out of all designated ports. Only the switch that sees itself as the originator of the RLQ will not flood the responses further. This is how the RLQ responses are eventually delivered to the querying bridge.

4) When the originating switch receives a negative response on any upstream port, it immediately invalidates the information stored with this port, and moves it to the Listening state, starting BPDU exchange. This happens with the port connected to BB-SW3 in our case. If the answer is positive, the information stored with the port is considered to be valid. The switch waits for responses on all upstream ports, retaining or invalidating the respective stored BPDUs.

5) If ALL responses were negative, the querying switch deduces that it has lost connectivity to the old root. The querying switch declares itself the new root and starts propagating this information out of all ports, listening for better BPDUs at the same time.

6) If at least ONE response was positive, the querying switch assumes that it still has a healthy connection to the current root and unblocks the ports that have received negative responses. This allows the switch to start sending information about the current root to the switch that thinks it lost its connection to the root bridge. In our case, BB-SW4 receives a positive response from BB-SW2 and immediately unblocks the port to BB-SW3, starting to relay BB-SW2’s configuration BPDUs.
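
The decision made by the querying switch in steps 4 to 6 can be summarized as a small Python function. This is a simplified sketch of the rules as described in this post, not the actual RLQ state machine; the function name `evaluate_rlq_responses` and the port labels are made up for illustration.

```python
def evaluate_rlq_responses(responses):
    """Decide what the querying switch does once all RLQ responses are in.

    `responses` maps an upstream port name to True (positive response)
    or False (negative response).
    """
    if not responses:
        raise ValueError("no RLQ responses collected")
    # Step 4: stored BPDUs are invalidated on every port that answered negatively.
    negative = sorted(port for port, ok in responses.items() if not ok)
    if len(negative) == len(responses):
        # Step 5: all answers negative, the path to the old root is gone.
        return {"claim_root": True, "invalidate": negative, "unblock": []}
    # Step 6: at least one positive answer, the root is still reachable,
    # so start relaying its BPDUs out of the ports that answered negatively.
    return {"claim_root": False, "invalidate": negative, "unblock": negative}
```

For BB-SW4 in Figure 2, the input would be `{"to-BB-SW2": True, "to-BB-SW3": False}`: the switch keeps BB-SW2 as the root and unblocks the port towards BB-SW3.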

Therefore, the overall effect of BackboneFast is proactive testing of the current topology and quick invalidation of outdated information. BackboneFast saves up to Max_Age seconds of waiting for the old root bridge information to expire and thus reduces the convergence time to 2xForward_Time. The feature is only useful on switches that are capable of becoming a transit node, i.e. the switches that form the core (backbone) of the bridged network. Of course, since BackboneFast uses special RLQ BPDUs, it must be explicitly enabled on all participating Cisco switches.

Finally, a quick question to think about after reading this post: Is it OK to enable the UplinkFast feature on all switches forming a “ring” topology?

Mar
07

In this post we are going to look into the STP convergence process. Many people have a perfect understanding of STP, yet face difficulties when they see questions like “How many seconds will it take for STP to recover connectivity if a given link fails?”. The post will follow the outline below:

1) General overview of STP convergence process
2) How STP converges if a directly connected link fails
3) How STP converges when it detects indirect link failure
4) Topology changes and their effect

See more detailed overview at: http://blog.ine.com/wp-content/uploads/2010/04/understanding-stp-rstp-convergence.pdf

STP Convergence in General

As we know, STP follows a simple procedure to calculate the loop-free subset of the network topology. In some sense, STP can be compared to RIP. Both execute a version of the iterative Bellman-Ford algorithm, which could be described as “gradient” (meaning it iteratively looks for the optimal solution, selecting the “closest” candidate every time). Every switch accepts and retains only the best current root bridge information. The switch then blocks alternate paths to the root bridge, leaving only the single optimal (in terms of path cost) uplink, and continues relaying the optimal information. If a switch learns about a better (“superior”) root bridge than the one it currently knows (e.g. a better bridge ID, or a shorter path to the root), the old information is erased and the new information is immediately accepted and relayed. Note that the switch stores the most recent STP BPDU with every port that receives them. Therefore, on a given switch, there is a BPDU stored with every root or alternate (blocked) port.

There are certain features in STP designed to improve the algorithm stability and ensure the aging out of the old information. Every BPDU contains two fields: Max_Age and Message_Age. The Message_Age field is incremented every time a BPDU traverses a switch (so it might be compared to the hop count). When a switch stores the BPDU with the respective port, it will count the time in seconds, starting from Message_Age and up to the Max_Age. If during this interval, no further BPDUs are received, the current BPDU is wiped out and the port is declared designated. This procedure ensures that the old information is eventually aged out of the topology.

There is one more thing, similar to the “hold-down” feature found in RIP. It is the way in which STP deals with “inferior” BPDUs. The BPDU is considered inferior, if it carries information about the root bridge that is worse than the one currently stored for the port, or the BPDU has longer distance to reach the current root bridge (compare this to RIP's increase in metric). Inferior BPDUs may appear when a neighboring switch suddenly loses its uplink and claims itself the new root of the topology. By default, every switch should ignore inferior BPDUs, until the currently stored BPDU expires (time=Max_Age - Message_Age). This feature intends to stabilize STP topology in situations where an uplink on some switch flaps, causing the switch to start sending inferior information.
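
To make the notions of “superior” and “inferior” concrete, here is a minimal Python sketch. It assumes the usual STP ordering, in which BPDUs are compared field by field (root bridge ID, root path cost, sender bridge ID, sender port ID) and the numerically lower value wins; the `Bpdu` class and the helper names are illustrative, not taken from any real implementation.

```python
from typing import NamedTuple


class Bpdu(NamedTuple):
    root_id: int         # root bridge ID (priority + MAC) as one integer
    root_path_cost: int  # advertised cost to reach that root
    sender_id: int       # bridge ID of the switch that sent the BPDU
    sender_port: int     # port ID on the sending switch
    message_age: int = 0
    max_age: int = 20


def priority_vector(bpdu):
    # Only the first four fields take part in the comparison.
    return bpdu[:4]


def is_superior(new, stored):
    """True if `new` is better (numerically lower) than `stored`."""
    return priority_vector(new) < priority_vector(stored)


def is_inferior(new, stored):
    """True if `new` is worse than `stored`; classic STP ignores it."""
    return priority_vector(new) > priority_vector(stored)


def seconds_until_expiry(stored):
    """How long an inferior BPDU is ignored: Max_Age - Message_Age."""
    return max(stored.max_age - stored.message_age, 0)
```

A neighbor that loses its uplink and claims to be the root advertises its own, numerically higher bridge ID as the root ID, so `is_inferior` returns True and the receiving port keeps the stored BPDU for `seconds_until_expiry` seconds.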

STP convergence in case of directly connected link failure

Consider a switch on Fig 1., with two uplinks – one forwarding (root port, port A) and another blocking (alternate port, port B). Imagine now that the root port fails.

[Fig 1: stp_convergence_1]

There are two different situations:

1) The switch detects loss of carrier and immediately declares the port dead. Since this was the port with the best information, the switch immediately invalidates it, and selects the next “best candidate” which is the alternate port (Port B) as the new root port. The switch will transition Port B through Listening and Learning states, which takes 2xForward_Time. Therefore, the connectivity is restored in 2xForward_Time.

2) The switch does not detect the loss of carrier (for example, the uplink is fiber connected to a converter or connects through a hub), and thus the port remains up. The root port, however, loses the continuous stream of BPDUs, so the stored BPDU information is no longer refreshed. Based on the default procedure, it takes time=Max_Age-Message_Age to expire the stored information. After this, the switch considers the BPDU stored with the alternate port and unblocks Port B. It takes another 2xForward_Time to bring the port to the forwarding state. Therefore, the connectivity is restored in 2xForward_Time + (Max_Age-Message_Age).

If the switch detects loss of carrier on the designated port (Port C), nothing much will happen. Since no BPDUs are received on this port, the switch will only generate a topology change event (more on that later), but will not block or unblock any other local ports. This event might, however, affect the downstream switches.

STP Convergence in case of indirect link failure

Consider the topology on Fig 2.

[Fig 2: stp_convergence_2]

In this case, SW2 has a better Bridge ID than SW3, and thus Port D is designated on the segment between SW2 and SW3. SW3 blocks its redundant uplink via SW2 (Port B) and elects Port A as the root port. Now imagine that SW2 detects loss of carrier on the link connected to SW1 (Port C). The switch will immediately invalidate the best BPDU stored for Port C and will assume itself to be the root of the spanning tree, as there are no other ports receiving BPDUs. SW2 will start advertising BPDUs to SW3, setting both the designated bridge and the root bridge to itself in the configuration BPDUs. Those are, by definition, inferior BPDUs, and SW3 will ignore them, as it still hears better information from SW1. SW3 will also keep the previous BPDU associated with Port B for the duration of Max_Age-Message_Age. When this timer expires, SW3 will start considering the inferior BPDUs. Port B will move to the Listening state, and SW3 will start relaying SW1’s BPDUs to SW2, as those are superior to SW2’s BPDUs. Now SW2 detects the better information on its formerly designated port (Port D) and cycles the port through the Listening and Learning states. Both switches (SW2 and SW3) will eventually move their ports into the forwarding state, recovering the connectivity. Therefore, it will take Max_Age-Message_Age + 2xForward_Time to recover from an indirect link failure.
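
The recovery times derived in this and the previous section can be collected in one small helper. This is just the arithmetic from the text; the function and scenario names are made up for illustration, and the defaults are the timer values that appear in the switch output elsewhere on this blog (forward delay 15, max age 20).

```python
def recovery_time(scenario, forward_time=15, max_age=20, message_age=0):
    """Seconds needed by classic STP to restore connectivity."""
    # Time needed to age out the BPDU stored on a port.
    aging = max_age - message_age
    if scenario == "direct_carrier_loss":
        # Root port goes down: only Listening + Learning on the alternate port.
        return 2 * forward_time
    if scenario in ("direct_no_carrier_loss", "indirect"):
        # The stored BPDU must expire first, then Listening + Learning.
        return aging + 2 * forward_time
    raise ValueError(f"unknown scenario: {scenario}")


print(recovery_time("direct_carrier_loss"))     # 30
print(recovery_time("direct_no_carrier_loss"))  # 50
print(recovery_time("indirect", message_age=1)) # 49
```

With the default timers, a detected root port failure costs 30 seconds, while an undetected or indirect failure costs up to 50 seconds.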

The effect of topology changes

Switches forward Ethernet frames based on their MAC address tables (filtering tables) that bind MAC addresses to egress ports. When a change in topology occurs (e.g. a link failure), the MAC address tables may become invalid, as the paths between switches have changed. The switches will eventually re-learn the new information, but it may take considerable time, especially if the traffic is scarce and the MAC address aging time is large (5 minutes by default). Because of that, if a switch detects a change in the topology (e.g. a link going up or down), it should notify all other switches that something has changed. In response to this notification, all switches reduce their MAC address aging time to Forward_Time (15 seconds by default), effectively speeding up the aging process.

As we know, topology changes are signaled via a special TCN BPDU, which is sent upstream from the originating switch (the one that detected the change) to the root switch via the root ports. Each upstream switch acknowledges the TCN by setting the Topology Change Acknowledgment (TCA) flag in its next configuration BPDU, and the sender stops signaling the topology change once it hears this acknowledgment. When the root switch hears the TCN BPDU, it sets the Topology Change (TC) flag in all its outgoing configuration BPDUs for the duration of Max_Age+Forward_Time. All switches that see this flag set their MAC address table aging time to Forward_Time.
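
The effect of the shortened aging time can be illustrated with a toy MAC address table in Python. This is a simplified model written for this post, not switch code: the `MacTable` class, its method names and the explicit `now` timestamps are all assumptions made to keep the example deterministic.

```python
class MacTable:
    """Toy filtering table with a single, adjustable aging time."""

    def __init__(self, aging_time=300):
        self.aging_time = aging_time  # seconds, 5 minutes by default
        self.entries = {}             # mac -> (port, time last seen)

    def learn(self, mac, port, now):
        self.entries[mac] = (port, now)

    def on_topology_change(self, forward_time=15):
        # A signaled topology change shortens aging to Forward_Time.
        self.aging_time = forward_time

    def expire(self, now):
        """Remove and return the entries older than the current aging time."""
        stale = [mac for mac, (_, seen) in self.entries.items()
                 if now - seen >= self.aging_time]
        for mac in stale:
            del self.entries[mac]
        return sorted(stale)

    def lookup(self, mac):
        """Return the egress port, or None, meaning the frame is flooded."""
        entry = self.entries.get(mac)
        return entry[0] if entry else None


def tc_flag_duration(max_age=20, forward_time=15):
    """How long the root keeps signaling the change in its BPDUs."""
    return max_age + forward_time
```

An entry last seen 20 seconds ago survives with the default 300-second aging time, but is removed as soon as `on_topology_change()` has been called, after which `lookup` returns None and frames to that address are flooded. With the default timers `tc_flag_duration()` gives 35 seconds.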

Now what is the effect of a topology change event? Two major things are impacted:

1) Connectivity. In some cases, it may take an additional Forward_Time seconds to expire the old MAC address information and recover connectivity. This only happens if the old information persists in some switches and the frames are black-holed.

2) Network performance. Shortening the MAC address table aging time results in less stable topology. When a switch loses a MAC address, it starts flooding frames for this destination, effectively acting like a hub. If the flow of packets in your network is not intense enough, the switches may start losing MAC address table information, resulting in excessive traffic flooding.

The second issue can become pretty dangerous with a high number of topology changes. Excessive flooding might severely impact your network performance. Note that this issue also pertains to L2 topologies that run RSTP, as topology changes are handled in a similar way. In order to reduce the number of topology changes, configure all edge ports in the topology (those connected to hosts, IP phones, servers) as spanning-tree portfast. Portfast ports do not generate TC events when they go up or down.

For more detailed description of topology change notification read the following great article at Cisco’s site:

Understanding Spanning-Tree Topology Changes

Part II of this post will consider UplinkFast and BackboneFast features, and their effect on STP convergence.

PS
We often use the formula Max_Age-Message_Age in this text, to be precise. However, most STP topologies are small enough to ignore Message_Age and assume the value of Max_Age for most calculations, unless Max_Age is artificially set to a very low value.

Jul
17

Cisco switches run different types of STP, depending on whether the connected port is an access port, an ISL trunk or an 802.1q trunk. Natively, a Cisco switch runs a separate STP instance for each configured and active VLAN (this is called Per-VLAN Spanning Tree or PVST), while standard IEEE compliant switches run just one instance of STP shared by all VLANs. For that reason, a group of switches running the IEEE compatible STP protocol is called an MST (Mono Spanning Tree) region, not to be confused with MSTP (Multiple Spanning Tree Protocol).

Access Ports. Cisco switches run the IEEE version of STP protocol on the access ports. The IEEE STP BPDUs are sent to IEEE reserved multicast MAC address “0180.C200.0000” using IEEE 802.2 LLC SAP encapsulation with both SSAP and DSAP fields equal to “0x42”. (By the way, for the purpose of Layer2 filtering, IEEE BPDUs could be matched using a MAC ACL with LSAP value of ”0x4242”). Note that you can plug any standard IEEE compliant switch into a Cisco switch access port and they will inter-operate perfectly, joining the respective access VLAN's STP instance with the IEEE STP instance (MST).

ISL Trunks. Cisco switches run PVST (Per-VLAN Spanning Tree) on ISL trunk links. The same IEEE STP BPDUs are sent for each VLAN, encapsulated with an additional ISL header, which also carries the VLAN number. The magic part is that the ISL header has a special flag to distinguish frames carrying STP BPDUs, and this is how PVST can re-use the regular IEEE BPDUs to simulate multiple spanning trees. Since PVST BPDUs have the same format as IEEE BPDUs (that is, IEEE 802.2 LLC SAP), they can be matched using the same SSAP/DSAP values of “0x42” for the purpose of Layer 2 filtering. A group of Cisco switches connected using ISL trunks only is called a PVST region.

802.1q Trunks. Cisco switches run PVST+ (Per VLAN Spanning Tree Plus) on 802.1q trunks. This is where things get a bit complicated. The goal of PVST+ is to interoperate with standard IEEE STP (MST) and to allow transparent tunneling of PVST BPDUs across an MST region to reach other Cisco switches on the far side of the IEEE region. From here on, we'll call a group of Cisco switches connected using 802.1q trunks a PVST+ region. Note that a PVST+ region may connect to a PVST region using an ISL trunk and to an MST region using an 802.1q trunk. The STP instances in PVST and PVST+ regions map directly to each other, so no special interoperability solution is required. However, on the IEEE side only one STP instance exists, in contrast to the multiple STP instances of the PVST+ region. The first question is: if we want to interoperate with the IEEE single tree, which PVST VLAN’s STP instance should be joined with MST? Cisco has chosen VLAN 1 for this purpose. Joined together, the Cisco VLAN 1 STP instance and MST are called the “Common Spanning Tree” or CST (naturally, the CST spans PVST, PVST+ and MST regions). As for the detailed PVST+ operations on an 802.1q port, consider the two following cases:

Case 1: Cisco switch connects to an IEEE switch using an 802.1q trunk with the default native VLAN (VLAN 1)

IEEE side sends IEEE STP BPDUs to the IEEE multicast MAC address. These BPDUs are consumed and processed by VLAN 1 STP instance on Cisco switch (PVST+ region). The Cisco switch simply recognizes untagged IEEE BPDUs as belonging to VLAN 1 STP instance.

The PVST+ side (Cisco switch) sends IEEE STP BPDUs corresponding to local VLAN 1 STP to IEEE MAC address as untagged frames across the link. At the same time, special new SSTP (shared spanning tree, synonym to PVST+) BPDUs are being sent to SSTP multicast MAC address “0100.0ccc.cccd” also untagged. These SSTP BPDUs are encapsulated using IEEE 802.2 LLC SNAP header (SSAP=DSAP=”0xAA” and SNAP PID=”0x010B”). These BPDUs carry the same information as the parallel IEEE STP BPDUs for VLAN 1, but have additional fields, notably a special TLV with the source VLAN number. Note that IEEE switches do not interpret the SSTP BPDUs, but simply flood them through the respective VLAN topology. This facilitates BPDU tunneling in case there are other Cisco switches connected to IEEE cloud.

As for non-native VLANs (VLANs 2-4094), the Cisco switch sends only SSTP BPDUs, tagged with the respective VLAN number and destined to the SSTP MAC address. (Please remember that all SSTP BPDUs carry the number of the VLAN they belong to.) The respective VLAN STP instances are “transparently expanded” across the MST region, treating it as a “virtual hub”. (Note that this may have some traffic engineering implications, since for non-CST VLANs the cost of traversing the MST region equals the cost of the link used to connect to the first MST switch.)

Now the question is: why would a Cisco switch send the same VLAN 1 BPDU twice, destined to the IEEE and SSTP multicast MAC addresses? Isn’t the Cisco switch supposed to simply join its VLAN 1 STP instance with the MST? The reason for sending the additional SSTP BPDUs on VLAN 1 is purely informational: they are used for consistency checking. The idea is to inform all other Cisco switches potentially attached to the MST cloud about our native VLAN. The receiving switch will only use IEEE BPDUs for VLAN 1 (CST) computations and will ignore SSTP BPDUs sent on VLAN 1.

For the purpose of Layer 2 filtering, remember that you can match SSTP BPDUs using an ethertype value of “0x010B”. This works with multilayer switches even though SSTP BPDUs are SNAP encapsulated, and the actual field is not “ethertype” but rather the SNAP Protocol ID.
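
The two encapsulations described above can be told apart from the first bytes of the frame. The Python sketch below is only an illustration of the matching rules from the text (destination MAC, DSAP/SSAP and, for SSTP, the SNAP Protocol ID). It assumes an untagged 802.3 header, as printed in the “enc” lines of the debug output further down in this post, and the function name is hypothetical.

```python
IEEE_STP_DST = bytes.fromhex("0180C2000000")  # IEEE reserved multicast address
SSTP_DST = bytes.fromhex("01000CCCCCCD")      # SSTP (PVST+) multicast address


def classify_bpdu_frame(header: bytes) -> str:
    """Classify an untagged frame header as 'ieee', 'sstp' or 'other'.

    Assumed layout: destination MAC (6 bytes), source MAC (6), length (2),
    DSAP, SSAP, control and, for SNAP frames, OUI (3) and Protocol ID (2).
    """
    dst = header[0:6]
    dsap_ssap = header[14:16]
    if dst == IEEE_STP_DST and dsap_ssap == b"\x42\x42":
        return "ieee"
    snap_pid = header[20:22]
    if dst == SSTP_DST and dsap_ssap == b"\xaa\xaa" and snap_pid == b"\x01\x0b":
        return "sstp"
    return "other"


# Headers copied from the debug output later in this post.
ieee_hdr = bytes.fromhex("01 80 C2 00 00 00 CC 02 00 C0 00 01 00 26 42 42 03")
sstp_hdr = bytes.fromhex(
    "01 00 0C CC CC CD CC 06 00 C0 F1 03 00 32 AA AA 03 00 00 0C 01 0B")
print(classify_bpdu_frame(ieee_hdr))  # ieee
print(classify_bpdu_frame(sstp_hdr))  # sstp
```

A tagged frame would carry four extra bytes after the source MAC, so the offsets above would have to be shifted accordingly.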

Case 2: Cisco switch connects to IEEE switch using 802.1q trunk with non-default native VLAN (e.g VLAN 100).

IEEE switch sends IEEE STP BPDUs to IEEE multicast MAC address and those BPDUs are processed by VLAN 1 (CST) STP instance in the Cisco switch.

PVST+ side (Cisco switch) sends untagged IEEE STP BPDUs corresponding to VLAN 1 (CST) STP to IEEE MAC address across the link. This is done for the purpose of joining the local VLAN 1 instance and the IEEE instance into CST. At the same time, VLAN 1 BPDUs are replicated to SSTP multicast address, tagged with VLAN 1 number (to inform other Cisco switches that VLAN 1 is non-native on our switch). Finally, BPDUs of the native VLAN instance (VLAN 100 in our case) are sent untagged using SSTP encapsulation and destination address. Of course, native VLAN100 BPDUs, (even though they are untagged) carry VLAN number inside a special TLV SSTP header.

As in Case 1, for the remaining non-native VLANs (VLANs 2-4094) the Cisco switch sends SSTP BPDUs only, tagged with the respective VLAN tag and destined to the SSTP MAC address. The other Cisco switches connected to the MST cloud receive the SSTP BPDUs and process them using the respective VLAN STP instances.

Consider the diagram below. SW1 and SW2 are Cisco switches, running STP instances for VLANs 1,2 and 3. Router R3 is converted into bridge (CRB mode and the same bridge-group on E0/0 and E0/1 interfaces, using IEEE STP protocol) to simulate an IEEE compatible switch. VLAN 1 STP instances of SW1 and SW2 are joined with STP instance running in R3. VLANs 2 and 3 consider path across R3 as another segment linking SW1 and SW2 and their SSTP information is passed transparently across R3.

[Diagram: pvst-plus-topology]

The current situation in the topology is as follows:

Rack1SW2#show spanning-tree vlan 1

VLAN1 is executing the ieee compatible Spanning Tree protocol
Bridge Identifier has priority 32768, address cc07.00c1.0000
Configured hello time 2, max age 20, forward delay 15
Current root has priority 32768, address cc02.00c0.0000
Root port is 44 (FastEthernet1/3), cost of root path is 19
Topology change flag set, detected flag not set
Number of topology changes 12 last change occurred 00:00:39 ago
Times: hold 1, topology change 35, notification 2
hello 2, max age 20, forward delay 15
Timers: hello 0, topology change 0, notification 0, aging 300

Port 44 (FastEthernet1/3) of VLAN1 is forwarding
Port path cost 19, Port priority 128, Port Identifier 128.44.
Designated root has priority 32768, address cc02.00c0.0000
Designated bridge has priority 32768, address cc02.00c0.0000
Designated port id is 128.5, designated path cost 0
Timers: message age 2, forward delay 0, hold 0
Number of transitions to forwarding state: 2
BPDU: sent 7650, received 2445

Port 56 (FastEthernet1/15) of VLAN1 is blocking
Port path cost 19, Port priority 128, Port Identifier 128.56.
Designated root has priority 32768, address cc02.00c0.0000
Designated bridge has priority 32768, address cc06.00c0.0000
Designated port id is 128.56, designated path cost 19
Timers: message age 3, forward delay 0, hold 0
Number of transitions to forwarding state: 2
BPDU: sent 7, received 3827

Rack1SW2#show spanning-tree vlan 2

VLAN2 is executing the ieee compatible Spanning Tree protocol
Bridge Identifier has priority 8192, address cc07.00c1.0006
Configured hello time 2, max age 20, forward delay 15
Current root has priority 8192, address cc06.00c0.0006
Root port is 44 (FastEthernet1/3), cost of root path is 19
Topology change flag set, detected flag not set
Number of topology changes 7 last change occurred 00:00:14 ago
from FastEthernet1/15
Times: hold 1, topology change 35, notification 2
hello 2, max age 20, forward delay 15
Timers: hello 0, topology change 0, notification 0, aging 300

Port 44 (FastEthernet1/3) of VLAN2 is forwarding
Port path cost 19, Port priority 128, Port Identifier 128.44.
Designated root has priority 8192, address cc06.00c0.0006
Designated bridge has priority 8192, address cc06.00c0.0006
Designated port id is 128.44, designated path cost 0
Timers: message age 2, forward delay 0, hold 0
Number of transitions to forwarding state: 2
BPDU: sent 5889, received 372

Port 56 (FastEthernet1/15) of VLAN2 is blocking
Port path cost 19, Port priority 128, Port Identifier 128.56.
Designated root has priority 8192, address cc06.00c0.0006
Designated bridge has priority 8192, address cc06.00c0.0006
Designated port id is 128.56, designated path cost 0
Timers: message age 2, forward delay 0, hold 0
Number of transitions to forwarding state: 1
BPDU: sent 2, received 3821

Rack1SW2#show spanning-tree vlan 3

VLAN3 is executing the ieee compatible Spanning Tree protocol
Bridge Identifier has priority 32768, address cc07.00c1.0001
Configured hello time 2, max age 20, forward delay 15
Current root has priority 8192, address cc06.00c0.0007
Root port is 44 (FastEthernet1/3), cost of root path is 19
Topology change flag set, detected flag not set
Number of topology changes 4 last change occurred 00:00:16 ago
from FastEthernet1/15
Times: hold 1, topology change 35, notification 2
hello 2, max age 20, forward delay 15
Timers: hello 0, topology change 0, notification 0, aging 300

Port 44 (FastEthernet1/3) of VLAN3 is forwarding
Port path cost 19, Port priority 128, Port Identifier 128.44.
Designated root has priority 8192, address cc06.00c0.0007
Designated bridge has priority 8192, address cc06.00c0.0007
Designated port id is 128.44, designated path cost 0
Timers: message age 1, forward delay 0, hold 0
Number of transitions to forwarding state: 2
BPDU: sent 3689, received 332

Port 56 (FastEthernet1/15) of VLAN3 is blocking
Port path cost 19, Port priority 128, Port Identifier 128.56.
Designated root has priority 8192, address cc06.00c0.0007
Designated bridge has priority 8192, address cc06.00c0.0007
Designated port id is 128.56, designated path cost 0
Timers: message age 1, forward delay 0, hold 0
Number of transitions to forwarding state: 2
BPDU: sent 3, received 3832

R3 (the IEEE bridge) is the root for VLAN 1 (CST) and SW1 is the root for VLANs 2 and 3. Note that R3 has no idea about VLANs 2 and 3 and transparently floods SSTP BPDUs which could be seen on SW2 (non-root switch). As it should be, VLAN 1 BPDUs are received as IEEE STP BPDUs and SSTP copies are ignored.

Rack1SW2#debug spanning-tree bpdu
Spanning Tree BPDU debugging is on
Rack1SW2#
STP: VLAN1: config protocol = ieee, packet from FastEthernet1/3 , linktype IEEE_SPANNING , enctype 2, encsize 17
STP: enc 01 80 C2 00 00 00 CC 02 00 C0 00 01 00 26 42 42 03
STP: Data 00000000008000CC0200C00000000000008000CC0200C0000080050000140002000F00
STP: VLAN1 Fa1/3:0000 00 00 00 8000CC0200C00000 00000000 8000CC0200C00000 8005 0000 1400 0200 0F00

STP: VLAN1: config protocol = ieee, packet from FastEthernet1/15 , linktype IEEE_SPANNING , enctype 2, encsize 17
STP: enc 01 80 C2 00 00 00 CC 06 00 C0 F1 0F 00 26 42 42 03
STP: Data 00000000008000CC0200C00000000000138000CC0600C0000080380100140002000F00
STP: VLAN1 Fa1/15:0000 00 00 00 8000CC0200C00000 00000013 8000CC0600C00000 8038 0100 1400 0200 0F00

STP: VLAN3: config protocol = ieee, packet from FastEthernet1/3 , linktype SSTP , enctype 3, encsize 22
STP: enc 01 00 0C CC CC CD CC 06 00 C0 F1 03 00 32 AA AA 03 00 00 0C 01 0B
STP: Data 00000000002000CC0600C00007000000002000CC0600C00007802C0000140002000F00
STP: VLAN3 Fa1/3:0000 00 00 00 2000CC0600C00007 00000000 2000CC0600C00007 802C 0000 1400 0200 0F00

STP: VLAN3: config protocol = ieee, packet from FastEthernet1/15 , linktype SSTP , enctype 3, encsize 22
STP: enc 01 00 0C CC CC CD CC 06 00 C0 F1 0F 00 32 AA AA 03 00 00 0C 01 0B
STP: Data 00000000002000CC0600C00007000000002000CC0600C0000780380000140002000F00
STP: VLAN3 Fa1/15:0000 00 00 00 2000CC0600C00007 00000000 2000CC0600C00007 8038 0000 1400 0200 0F00

STP: VLAN2: config protocol = ieee, packet from FastEthernet1/3 , linktype SSTP , enctype 3, encsize 22
STP: enc 01 00 0C CC CC CD CC 06 00 C0 F1 03 00 32 AA AA 03 00 00 0C 01 0B
STP: Data 00000000002000CC0600C00006000000002000CC0600C00006802C0000140002000F00
STP: VLAN2 Fa1/3:0000 00 00 00 2000CC0600C00006 00000000 2000CC0600C00006 802C 0000 1400 0200 0F00
STP: VLAN2: config protocol = ieee, packet from FastEthernet1/15 , linktype SSTP , enctype 3, encsize 22
STP: enc 01 00 0C CC CC CD CC 06 00 C0 F1 0F 00 32 AA AA 03 00 00 0C 01 0B
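
The “Data” lines in the debug output above are the raw configuration BPDUs. As a reading aid, here is a small Python sketch that decodes one of them. It assumes the standard 35-byte 802.1D configuration BPDU layout, with timers encoded in units of 1/256 of a second; the function and key names are made up for this example.

```python
import struct


def _cisco_mac(raw: bytes) -> str:
    """Format 6 bytes as a dotted Cisco-style MAC address."""
    h = raw.hex()
    return f"{h[0:4]}.{h[4:8]}.{h[8:12]}"


def parse_config_bpdu(data_hex: str) -> dict:
    """Decode the hex string printed on a 'Data' line of the debug output."""
    raw = bytes.fromhex(data_hex)
    (proto, version, bpdu_type, flags, root_id, cost, bridge_id,
     port_id, msg_age, max_age, hello, fwd_delay) = struct.unpack(
        "!HBBB8sI8sHHHHH", raw[:35])
    return {
        "flags": flags,
        "root_priority": int.from_bytes(root_id[:2], "big"),
        "root_mac": _cisco_mac(root_id[2:]),
        "root_path_cost": cost,
        "bridge_priority": int.from_bytes(bridge_id[:2], "big"),
        "bridge_mac": _cisco_mac(bridge_id[2:]),
        "port_id": (port_id >> 8, port_id & 0xFF),  # (priority, number)
        "message_age": msg_age / 256,
        "max_age": max_age / 256,
        "hello_time": hello / 256,
        "forward_delay": fwd_delay / 256,
    }


# VLAN1 BPDU received on FastEthernet1/15, copied from the output above.
bpdu = parse_config_bpdu(
    "00000000008000CC0200C00000000000138000CC0600C0000080380100140002000F00")
print(bpdu["root_mac"], bpdu["root_path_cost"], bpdu["port_id"])
```

The decoded values line up with the earlier show spanning-tree vlan 1 output for the blocking port: root cc02.00c0.0000, designated bridge cc06.00c0.0000, designated port id 128.56, designated path cost 19, max age 20 and forward delay 15.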

Now let’s see how consistency checking is performed by PVST+.

Case 1: Change the native VLAN on SW1 connection to R3:

SW1:
interface FastEthernet 1/3
switchport trunk native vlan 2

Rack1SW2#
%SPANTREE-2-RECV_PVID_ERR: Received BPDU with inconsistent peer vlan id 2 on FastEthernet1/3 VLAN1.

%SPANTREE-2-BLOCK_PVID_PEER: Blocking FastEthernet1/3 on VLAN2. Inconsistent peer vlan.
PVST+: restarted the forward delay timer for FastEthernet1/3

%SPANTREE-2-BLOCK_PVID_LOCAL: Blocking FastEthernet1/3 on VLAN1. Inconsistent local vlan.
PVST+: restarted the forward delay timer for FastEthernet1/3

Note that SW2 detects an untagged BPDU carrying VLAN ID 2, which does not correspond to the locally configured default native VLAN 1. The corresponding port is put into the «inconsistent» state. The reason SW2 detects this condition (and not SW1) is that SW1 is sending SSTP BPDUs and SW2 is not (it receives superior BPDUs). As soon as the native VLAN is changed back to «1» on SW1, consistency is restored:

%SPANTREE-2-UNBLOCK_CONSIST_PORT: Unblocking FastEthernet1/3 on VLAN2. Port consistency restored.

%SPANTREE-2-UNBLOCK_CONSIST_PORT: Unblocking FastEthernet1/3 on VLAN1. Port consistency restored.
PVST+:Inconsistency timer expired. inconsistency 0 cleared for FastEthernet1/3
PVST+:Inconsistency timer expired. inconsistency 0 cleared for FastEthernet1/3

It is also possible to restore consistency by making VLAN 2 the native VLAN on SW2's connection as well.

Case 2: Convert SW2's connection to R3 into an access port in VLAN 2:

SW2:
interface FastEthernet 1/3
switchport access vlan 2
switchport mode access

Rack1SW2#
%SPANTREE-7-RECV_1Q_NON_TRUNK: Received 802.1Q BPDU on non trunk FastEthernet1/3 VLAN2.
%SPANTREE-7-BLOCK_PORT_TYPE: Blocking FastEthernet1/3 on VLAN2. Inconsistent port type.
PVST+: restarted the forward delay timer for FastEthernet1/3

Now the access port receives untagged SSTP BPDUs from SW1 with the VLAN ID TLV equal to 1. Since this does not match the locally configured VLAN, the port is declared "type-inconsistent": the other side is a trunk and the local side is an access port.

Note that it is still possible to configure the access port in VLAN 1 if all the other switches' trunks use the default native VLAN. This way, only the CST is extended through the MST region and joined with the VLAN 1 STP instance on SW2. To summarize, if you have a group of Cisco switches connected to the same MSTP cloud (another vendor's switches), observe the following rules:

1) Use the same interface types - i.e. either all switches are connected using access ports or using trunk ports.

2) Ensure Native VLAN (PVID) is the same on all trunks used to connect to an MST region.
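
The two cases above can be summed up as a small decision function. This is only a simplified model of the behavior described in this post, not Cisco's actual implementation. The names `PortState` and `check_untagged_sstp_bpdu` are invented for the illustration, and the model only looks at untagged SSTP BPDUs.

```python
from enum import Enum

class PortState(Enum):
    CONSISTENT = "consistent"
    PVID_INCONSISTENT = "pvid-inconsistent"
    TYPE_INCONSISTENT = "type-inconsistent"

def check_untagged_sstp_bpdu(port_mode: str, local_vlan: int, tlv_vlan: int) -> PortState:
    """Classify an untagged SSTP BPDU received on a port.

    port_mode  -- "trunk" or "access" (hypothetical labels for this sketch)
    local_vlan -- native VLAN of a trunk, or access VLAN of an access port
    tlv_vlan   -- source VLAN ID carried in the BPDU's TLV
    """
    if tlv_vlan == local_vlan:
        # TLV agrees with the local untagged VLAN: nothing to complain about
        return PortState.CONSISTENT
    if port_mode == "access":
        # Case 2: the peer is trunking while the local side is an access port
        return PortState.TYPE_INCONSISTENT
    # Case 1: both sides trunk, but the native VLANs differ
    return PortState.PVID_INCONSISTENT

# Case 1: SW2 trunk with native VLAN 1 receives a BPDU with TLV VLAN 2
print(check_untagged_sstp_bpdu("trunk", 1, 2).value)
# Case 2: SW2 access port in VLAN 2 receives a BPDU with TLV VLAN 1
print(check_untagged_sstp_bpdu("access", 2, 1).value)
# Access port in VLAN 1 while the peer trunk uses native VLAN 1
print(check_untagged_sstp_bpdu("access", 1, 1).value)
```

The third call reflects the note above: an access port in VLAN 1 stays consistent as long as the peer trunks keep the default native VLAN.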

Now let’s look at the implications of the fact that non-CST VLANs see the MST region as simply a link or a "hub". Because of this, it is possible to adjust priority values on switches that are not directly physically connected and thereby influence root port election. For example, set the port priority of the direct connection between SW1 and SW2 (port FastEthernet 1/15) to 64 (the default is 128):

SW1:
interface FastEthernet 1/15
spanning-tree port-priority 64

Rack1SW2#show spanning-tree vlan 2

VLAN2 is executing the ieee compatible Spanning Tree protocol
Bridge Identifier has priority 8192, address cc07.00c1.0006
Configured hello time 2, max age 20, forward delay 15
Current root has priority 8192, address cc06.00c0.0006
Root port is 56 (FastEthernet1/15), cost of root path is 19
Topology change flag not set, detected flag not set
Number of topology changes 15 last change occurred 00:07:22 ago
from FastEthernet1/3
Times: hold 1, topology change 35, notification 2
hello 2, max age 20, forward delay 15
Timers: hello 0, topology change 0, notification 0, aging 300

Port 44 (FastEthernet1/3) of VLAN2 is blocking
Port path cost 19, Port priority 128, Port Identifier 128.44.
Designated root has priority 8192, address cc06.00c0.0006
Designated bridge has priority 8192, address cc06.00c0.0006
Designated port id is 128.44, designated path cost 0
Timers: message age 1, forward delay 0, hold 0
Number of transitions to forwarding state: 0
BPDU: sent 0, received 8

Port 56 (FastEthernet1/15) of VLAN2 is forwarding
Port path cost 19, Port priority 128, Port Identifier 128.56.
Designated root has priority 8192, address cc06.00c0.0006
Designated bridge has priority 8192, address cc06.00c0.0006
Designated port id is 64.56, designated path cost 0
Timers: message age 1, forward delay 0, hold 0
Number of transitions to forwarding state: 3
BPDU: sent 9, received 4683

Note that in the above output we see the same designated bridge (SW1) on both ports (root and alternate), even though SW2 connects directly to R3 on port FastEthernet 1/3. Now let's make port Fa 1/3 of SW2 the root port by adjusting the priority of the port connecting SW1 to R3:

SW1:
interface FastEthernet 1/3
spanning-tree port-priority 32

Rack1SW2#show spanning-tree vlan 3

VLAN3 is executing the ieee compatible Spanning Tree protocol
Bridge Identifier has priority 32768, address cc07.00c1.0001
Configured hello time 2, max age 20, forward delay 15
Current root has priority 8192, address cc06.00c0.0007
Root port is 44 (FastEthernet1/3), cost of root path is 19
Topology change flag set, detected flag not set
Number of topology changes 6 last change occurred 00:00:31 ago
from FastEthernet1/15
Times: hold 1, topology change 35, notification 2
hello 2, max age 20, forward delay 15
Timers: hello 0, topology change 0, notification 0, aging 300

Port 44 (FastEthernet1/3) of VLAN3 is forwarding
Port path cost 19, Port priority 128, Port Identifier 128.44.
Designated root has priority 8192, address cc06.00c0.0007
Designated bridge has priority 8192, address cc06.00c0.0007
Designated port id is 32.44, designated path cost 0
Timers: message age 1, forward delay 0, hold 0
Number of transitions to forwarding state: 1
BPDU: sent 1, received 170

Port 56 (FastEthernet1/15) of VLAN3 is blocking
Port path cost 19, Port priority 128, Port Identifier 128.56.
Designated root has priority 8192, address cc06.00c0.0007
Designated bridge has priority 8192, address cc06.00c0.0007
Designated port id is 64.56, designated path cost 0
Timers: message age 1, forward delay 0, hold 0
Number of transitions to forwarding state: 3
BPDU: sent 4, received 4850
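
Both outputs come down to the usual STP tie-break: the root, the root path cost and the designated bridge are identical on the two ports, so the lowest designated port ID (the one advertised by SW1) decides. Here is a minimal Python sketch of that comparison using the values from the two show outputs above. The `Candidate` type and `elect_root_port` function are invented for the illustration, and MAC addresses are compared as same-format strings for simplicity.

```python
from typing import NamedTuple

class Candidate(NamedTuple):
    name: str
    root_id: tuple            # (priority, MAC)
    root_path_cost: int       # designated cost + local port path cost
    designated_bridge: tuple  # (priority, MAC)
    designated_port: tuple    # (priority, number) advertised by the sender
    local_port: tuple         # (priority, number) of the receiving port

def elect_root_port(candidates):
    """Pick the port with the lowest comparison vector; lower wins at every step."""
    best = min(candidates, key=lambda c: (c.root_id, c.root_path_cost,
                                          c.designated_bridge, c.designated_port,
                                          c.local_port))
    return best.name

def sw2_ports(root, fa1_3_designated, fa1_15_designated):
    # SW2 hears the same root and designated bridge on both ports, cost 0 + 19
    return [
        Candidate("Fa1/3", root, 19, root, fa1_3_designated, (128, 44)),
        Candidate("Fa1/15", root, 19, root, fa1_15_designated, (128, 56)),
    ]

# First output: SW1 Fa1/15 priority lowered to 64, Fa1/3 still at 128
print(elect_root_port(sw2_ports((8192, "cc06.00c0.0006"), (128, 44), (64, 56))))
# Second output: SW1 Fa1/3 priority lowered to 32
print(elect_root_port(sw2_ports((8192, "cc06.00c0.0007"), (32, 44), (64, 56))))
```

The first call selects Fa1/15 and the second selects Fa1/3, matching the root ports reported by SW2 in the two outputs.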

To recap in a few words: PVST+ is used on 802.1q trunks to tunnel PVST instances across an MST cloud and to build a CST consisting of PVST VLAN 1 and IEEE MST. PVST+ BPDUs contain a special TLV with the source VLAN ID, which allows interconnected switches to detect inconsistencies or misconfigurations.
