Jul
27

Note: You may want to read a newer blog post on MSTP: Understanding MSTP.

Before we begin with MSTP (Multiple Spanning Tree Protocol), I would like to note that this tutorial is divided into two parts. The first part describes how MSTP works inside a single region (the term is defined later). The second part is dedicated to the interaction of an MSTP region with other regions and with different STP protocols (IEEE STP, RSTP and Cisco PVST+).

Historical Review

First, a short history tour. In the beginning, there was the IEEE STP protocol (originally, there were also the DEC variant [the original], invented by Radia Perlman, and the IBM STP protocol, but those are fossils now), which was adapted for use with multiple VLANs and 802.1q trunks. A single tree, sometimes called Mono Spanning Tree by Cisco or, more often, Common Spanning Tree, is shared by all VLANs. The obvious drawback of this design is that VLAN traffic engineering across redundant links is impossible: if a link is blocked, it is blocked for all VLANs. To overcome this, Cisco introduced its proprietary PVST/PVST+ solution, which runs a separate STP instance for each VLAN. This solution permits a different logical topology for each VLAN, effectively allowing for L2 traffic engineering. However, as the number of VLANs grows, PVST wastes switch resources and becomes a management burden, because the number of logical topologies is usually much smaller than the number of active VLANs.

As time passed, STP evolved into RSTP and Cisco answered with Rapid-PVST+: the fast STP, but with the same per-VLAN instance concept. The single spanning-tree instance used by IEEE and the per-VLAN STP implemented by Cisco represent two poles in the space of possible solutions. Seeing the limitations of the PVST approach, Cisco came up with the idea of decoupling the STP instance from a VLAN (they were bound together in PVST). The initial implementation was called MISTP (Multiple Instances Spanning Tree) and later evolved into the new IEEE 802.1s standard called MSTP (Multiple Spanning Tree Protocol). As we will see later, this evolution led to some terminology confusion and small feature mismatches between IEEE MSTP and the Cisco MSTP implementation.

Logical and Physical Topologies

Look at the diagrams below:

Physical and Logical Topologies

Cisco’s original proposal was as follows. Instead of running an STP instance for each VLAN, run a number of VLAN-independent STP instances (representing logical topologies) and then map each VLAN to the most appropriate logical topology (instance). Thus, the number of STP instances is kept to a minimum (saving switch resources), while the network capacity is still used efficiently, because VLAN traffic can be spread across all possible paths. The switch forwarding logic for VLAN traffic changes slightly. For a frame to be forwarded out of a port, two conditions must be met: first, the VLAN must be active on this port (i.e. not filtered), and second, the STP instance the VLAN maps to must be in a non-discarding state on this port. Obviously, with multiple logical topologies a single port can be blocking for one instance and forwarding for another (note that in (R)PVST+ a port is either forwarding or discarding for a VLAN).
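The two-condition forwarding rule can be sketched in a few lines of Python. This is only an illustrative model, not switch code: the names `Port`, `vlan_to_instance` and `can_forward` are made up for this example, and the VLAN numbers are borrowed from the configuration example later in the post.

```python
from dataclasses import dataclass, field

# Hypothetical VLAN-to-instance table; VLANs not listed fall back to instance 0.
vlan_to_instance = {10: 1, 20: 1, 30: 1, 40: 2, 50: 2, 60: 2}


@dataclass
class Port:
    name: str
    allowed_vlans: set = field(default_factory=set)          # VLANs not filtered on the port
    discarding_instances: set = field(default_factory=set)   # instances blocked on the port


def can_forward(port: Port, vlan: int) -> bool:
    """A frame leaves the port only if both conditions from the text hold."""
    instance = vlan_to_instance.get(vlan, 0)
    vlan_active = vlan in port.allowed_vlans
    instance_forwarding = instance not in port.discarding_instances
    return vlan_active and instance_forwarding


# One port that forwards for instance 1 but is discarding for instance 2.
uplink = Port("Fa0/16", allowed_vlans={1, 10, 20, 30, 40, 50, 60}, discarding_instances={2})
print(can_forward(uplink, 10))  # VLAN 10 maps to instance 1
print(can_forward(uplink, 40))  # VLAN 40 maps to instance 2
```

The same port gives different answers for VLAN 10 and VLAN 40, which is exactly the per-instance behaviour described above.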

Mapping VLANs to Instances

Implementing MSTP

Now that the basic idea is understood, let’s think how it could be implemented. The following questions need to be answered:

  • Topology Calculation. How do we build multiple STP instances (logical topologies) in a single physical topology? Should we run multiple STP instances, each with its own BPDUs? If yes, how would we distinguish each instance’s BPDUs? PVST+ uses VLAN tags for that, but now STP instances are independent of VLANs.
  • Information Distribution. How do we make all switches aware of the VLAN-to-instance mappings? Should we distribute the instance ID along with the VLAN number? If yes, how could we ensure all switches use consistent numbering?
  • Consistency Check. How do we ensure the above mapping is consistent across all switches? That is, how would switch1 know that switch2 maps VLAN2 to the same instance 1 and not to instance 2?

The original pre-standard Cisco MISTP implementation sends separate BPDUs for each instance, which allows for separate STP calculations. Each BPDU contains the instance number and the list of VLANs mapped to this particular instance on the sending switch, which allows for a consistency check. However, the table that maps VLANs to instance numbers has to be configured on each switch separately; that is, there is no automated mechanism to distribute VLAN-to-instance mappings. This is what Cisco did originally, but the IEEE 802.1s standard made the mechanics more elegant. Before we continue discussing the IEEE implementation, let’s define an MSTP region as a collection of switches sharing the same view of how the physical topology is partitioned into a set of logical topologies. For two switches to become members of the same region, the following attributes must match:

  • Configuration name
  • Configuration revision number (16 bits)
  • The table of 4096 elements that maps each VLAN to an STP instance number

The IEEE 802.1s implementation does not send a BPDU for each active STP instance, nor does it encapsulate the VLAN list in each configuration message. Instead, a special STP instance number 0, called the Internal Spanning Tree (IST or MSTI0), is designated to carry all “signaling” information. The BPDUs for IST contain all the standard RSTP information for IST itself and also carry additional informational fields. Among these fields are the configuration name, the revision number and a hash value computed over the contents of the VLAN-to-instance mapping table. With just this compact information, it is easy to detect a misconfiguration between two neighboring switches.
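To make the region check concrete, here is a rough Python sketch of comparing the three attributes (name, revision, mapping table) through a compact digest. Treat it as a model under stated assumptions: the standard defines its own keyed digest, and plain MD5 is used here only as a stand-in. `RegionConfig`, `mapping_digest`, `make_config` and `same_region` are hypothetical names.

```python
import hashlib
from dataclasses import dataclass


def mapping_digest(vlan_to_instance: dict) -> bytes:
    """Hash a 4096-entry VLAN-to-instance table (unlisted VLANs map to instance 0)."""
    table = bytearray()
    for vlan in range(4096):
        table += vlan_to_instance.get(vlan, 0).to_bytes(2, "big")
    # Plain MD5 is an illustrative stand-in for the digest carried in the BPDU.
    return hashlib.md5(bytes(table)).digest()


@dataclass(frozen=True)
class RegionConfig:
    name: str
    revision: int
    digest: bytes


def make_config(name: str, revision: int, vlan_to_instance: dict) -> RegionConfig:
    """Build the compact triple a switch would advertise to its neighbors."""
    return RegionConfig(name, revision, mapping_digest(vlan_to_instance))


def same_region(a: RegionConfig, b: RegionConfig) -> bool:
    """Two switches share a region only if name, revision and digest all match."""
    return a == b


good_map = {10: 1, 20: 1, 30: 1, 40: 2, 50: 2, 60: 2}
bad_map = {10: 1, 20: 1, 30: 2, 40: 2, 50: 2, 60: 2}   # VLAN 30 mapped differently

sw1 = make_config("REGION1", 0, good_map)
sw2 = make_config("REGION1", 0, good_map)
sw3 = make_config("REGION1", 0, bad_map)
print(same_region(sw1, sw2), same_region(sw1, sw3))
```

A single mismatched VLAN changes the digest, so the neighbors detect the inconsistency without exchanging the full table.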

What about the other instances, besides IST? Well, obviously, all VLANs could be mapped to IST, and this is the default configuration. Effectively, this represents the case of classic IEEE RSTP with all VLANs sharing the same spanning tree. Of course, other instances also exist, and they are called MSTIs (Multiple Spanning Tree Instances). Each MSTI may assign different priorities to switches and may have different link costs and port priorities, and thus end up with its own logical topology. Now if the 802.1s standard implementation does not send separate BPDUs for each MSTI, how does it accomplish separate topologies? The MSTI information is piggybacked onto IST BPDUs in special MRecord fields (one for every active MSTI), each of which carries the root priority, designated bridge priority, port priority and root path cost, among others. Let’s see how this whole thing works.
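The piggybacking idea can be pictured as one message with a list of per-instance records attached. The Python sketch below is a simplified data model, not the actual BPDU layout: the class and field names are assumptions chosen to mirror the fields named in the text, and the numeric values are sample data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MRecord:
    instance: int
    root_priority: int
    designated_bridge_priority: int
    port_priority: int
    root_path_cost: int


@dataclass
class IstBpdu:
    config_name: str
    revision: int
    mapping_digest: bytes
    ist_root_priority: int
    ist_root_path_cost: int
    mrecords: List[MRecord] = field(default_factory=list)   # one per active MSTI


def split_by_instance(bpdu: IstBpdu) -> Dict[int, Tuple[int, int]]:
    """Return (root priority, root path cost) per instance; key 0 is the IST."""
    per_instance = {0: (bpdu.ist_root_priority, bpdu.ist_root_path_cost)}
    for rec in bpdu.mrecords:
        per_instance[rec.instance] = (rec.root_priority, rec.root_path_cost)
    return per_instance


# Sample BPDU carrying IST information plus two MRecords.
bpdu = IstBpdu(
    config_name="REGION1",
    revision=0,
    mapping_digest=b"\x00" * 16,          # placeholder digest
    ist_root_priority=8192,
    ist_root_path_cost=0,
    mrecords=[
        MRecord(1, 8193, 8193, 128, 0),
        MRecord(2, 8194, 32770, 128, 200000),
    ],
)
print(split_by_instance(bpdu))
```

A receiving switch can run a separate spanning-tree calculation for each key in the resulting dictionary, even though only one BPDU arrived.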

First of all, since the MSTP convergence mechanism stems from RSTP, there is no BPDU relaying process downstream from the root bridge. Every switch emits configuration BPDUs on its own, once every Hello interval. Every BPDU has full information about IST, plus an MRecord for every MSTI. Using the RSTP convergence mechanics, separate STP instances are built for IST and every MSTI, using the information from the IST BPDU and the MRecords (root/designated bridge priorities, port priority, root path cost etc). Note that STP timers such as Hello, ForwardTime and MaxAge can only be tuned for IST, the instance 0. All other instances (MSTIs) inherit the timers from IST, which is the natural result of all MSTI information being piggybacked in IST BPDUs. Just as a side note, MSTP does not use the MaxAge timer to age out old information, like RSTP/STP do. Instead, IST BPDUs have a special field called MaxHops. The IST root sends BPDUs with the hop count equal to MaxHops, and every downstream switch decrements the hop count field on reception of an IST BPDU. As soon as the hop count reaches zero, the information in the BPDU is ignored, and the switch may start declaring itself the new IST root. The old MaxAge/ForwardDelay timers are still used when MSTP interacts with RSTP, STP or (R)PVST+ bridges.
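The hop-count rule is easy to simulate. The sketch below follows the description in the paragraph above (decrement on reception, ignore at zero) and assumes a simple chain of switches below the IST root. `on_receive` and `propagate` are hypothetical helper names, and 20 is taken from the "max hops 20" value in the show output further down.

```python
from typing import List, Optional

MAX_HOPS = 20  # default "max hops" value, as seen in the show output in this post


def on_receive(received_hops: int) -> Optional[int]:
    """Decrement the hop count on reception; None means the information is ignored."""
    remaining = received_hops - 1
    if remaining <= 0:
        return None
    return remaining


def propagate(max_hops: int = MAX_HOPS) -> List[int]:
    """Remaining-hops values stored by each switch along a chain below the root."""
    accepted = []
    hops: Optional[int] = max_hops
    while True:
        hops = on_receive(hops)
        if hops is None:
            break
        accepted.append(hops)
    return accepted


chain = propagate()
print(chain[0])    # value stored by the switch directly attached to the root
print(len(chain))  # how many switches in the chain accept the root information
```

The first value matches the "rem hops 19" line that SW1 reports for its directly attached IST root in the example output.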

Caveats arising from VLAN/STP decoupling

Before we jump to configuration examples, let’s consider some issues that may arise from the fact that spanning-tree instances are no longer directly tied to VLANs. The general rule is as follows: “If a VLAN is active on a particular primary link (i.e. this link is non-backup in your logical topology), ensure the STP instance it maps to is forwarding on this link”. Consider the following example:

MSTP Broken Config 1

In this topology, VLANs are manually pruned on trunks. Since the filtering is not consistent with the respective MSTI blocking decisions, VLAN2 traffic is blocked between SW1 and SW2. To avoid this situation, do not use the static “VLAN pruning” method of distributing VLANs across trunks when you have MSTP enabled.
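A quick way to reason about this caveat is to list, for each VLAN, the trunks where it is both allowed and forwarded by its instance. The Python sketch below does that for a made-up switch with two trunks. The trunk names, the VLAN2-to-MSTI1 mapping and the blocking state are assumptions for illustration and are not read from the diagram.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Trunk:
    name: str
    allowed_vlans: frozenset       # VLANs left on the trunk after manual pruning
    blocked_instances: frozenset   # instances that are discarding on this trunk


def usable_trunks(vlan: int, instance: int, trunks: List[Trunk]) -> List[str]:
    """Trunks where the VLAN is allowed and its instance is forwarding."""
    return [
        t.name
        for t in trunks
        if vlan in t.allowed_vlans and instance not in t.blocked_instances
    ]


def stranded_vlans(vlan_to_instance: Dict[int, int], trunks: List[Trunk]) -> List[int]:
    """VLANs that have no usable trunk at all on this switch."""
    return sorted(
        vlan
        for vlan, instance in vlan_to_instance.items()
        if not usable_trunks(vlan, instance, trunks)
    )


# Hypothetical situation: VLAN 2 is only allowed on the trunk where MSTI1 blocks.
vlan_to_instance = {2: 1, 3: 1}
trunks = [
    Trunk("to-SW2", frozenset({2}), frozenset({1})),   # VLAN 2 allowed, MSTI1 blocked
    Trunk("to-SW3", frozenset({3}), frozenset()),      # MSTI1 forwarding, VLAN 2 pruned
]
print(stranded_vlans(vlan_to_instance, trunks))
```

Any VLAN reported by `stranded_vlans` is one where the pruning plan and the MSTI blocking decisions disagree.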

Another classic example, which may arise when you map VLANs to default IST:

MSTP Broken Config 2

Since there is just one STP instance, it blocks one of the ports. Unfortunately, this is the only port that VLAN3 can use. To avoid such situations, use a separate STP instance for each logical topology (e.g. MSTI1 and MSTI2 in this case for VLAN2/VLAN3) and avoid mapping VLANs to IST. Keep IST only for information distribution, and load-balance traffic using MSTIs.

Configuration Example

Now that we have a basic understanding of how MSTP works inside a region, let’s jump to the configuration stage. Consider the following physical topology, already mentioned above:

MSTP Config Sample

The topology has VLANs 1, 10, 20, 30, 40, 50 and 60. We want to achieve the following:

1) VLANs 10,20,30 should follow uplink from SW3 to SW1
2) VLANs 40,50,60 should follow uplink from SW3 to SW2
3) If any of the uplinks fail, the respective VLANs should use the other uplink

To accomplish this, we need to create two MSTIs – let’s give them numbers 1 and 2. SW1 will be the root for instance 1 and SW2 will be the root for instance 2. As for IST (MSTI0), let’s make SW3 the root switch for it (though it’s not recommended to assign root roles to access switches). VLAN 1 will remain mapped to IST, VLANs 10,20,30 to MSTI1, and VLANs 40,50,60 to MSTI2. Here is the configuration (pretty simple, inside a region):

SW1:
spanning-tree mode mst
!
spanning-tree mst configuration
 name REGION1
 instance 1 vlan 10, 20, 30
 instance 2 vlan 40, 50, 60
!
spanning-tree mst 1 priority 8192
!
interface FastEthernet0/13
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface FastEthernet0/16
 switchport trunk encapsulation dot1q
 switchport mode trunk

SW2:
spanning-tree mode mst
!
spanning-tree mst configuration
 name REGION1
 instance 1 vlan 10, 20, 30
 instance 2 vlan 40, 50, 60
!
spanning-tree mst 2 priority 8192
!
interface FastEthernet0/13
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface FastEthernet0/16
 switchport trunk encapsulation dot1q
 switchport mode trunk

SW3:
spanning-tree mode mst
!
spanning-tree mst configuration
 name REGION1
 instance 1 vlan 10, 20, 30
 instance 2 vlan 40, 50, 60
!
spanning-tree mst 0 priority 8192
!
interface FastEthernet0/13
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface FastEthernet0/16
 switchport trunk encapsulation dot1q
 switchport mode trunk

Let’s review the effect of our configuration.

SW1#show spanning-tree mst configuration
Name      [REGION1]
Revision  0     Instances configured 3

Instance  Vlans mapped
--------  ---------------------------------------------------------------------
0         1-9,11-19,21-29,31-39,41-49,51-59,61-4094
1         10,20,30
2         40,50,60
-------------------------------------------------------------------------------
SW1#show spanning-tree mst               

##### MST0    vlans mapped:   1-9,11-19,21-29,31-39,41-49,51-59,61-4094
Bridge        address 0019.5684.3700  priority      32768 (32768 sysid 0)
Root          address 0012.d939.3700  priority      8192  (8192 sysid 0)
              port    Fa0/16          path cost     0
Regional Root address 0012.d939.3700  priority      8192  (8192 sysid 0)
                                      internal cost 200000    rem hops 19
Operational   hello time 2 , forward delay 15, max age 20, txholdcount 6
Configured    hello time 2 , forward delay 15, max age 20, max hops    20

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Desg FWD 200000    128.15   P2p
Fa0/16           Root FWD 200000    128.18   P2p 

##### MST1    vlans mapped:   10,20,30
Bridge        address 0019.5684.3700  priority      8193  (8192 sysid 1)
Root          this switch for MST1

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Desg FWD 200000    128.15   P2p
Fa0/16           Desg FWD 200000    128.18   P2p 

##### MST2    vlans mapped:   40,50,60
Bridge        address 0019.5684.3700  priority      32770 (32768 sysid 2)
Root          address 001e.bdaa.ba80  priority      8194  (8192 sysid 2)
              port    Fa0/13          cost          200000    rem hops 19

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Root FWD 200000    128.15   P2p
Fa0/16           Altn BLK 200000    128.18   P2p 

SW1#show spanning-tree mst interface fastEthernet 0/13

FastEthernet0/13 of MST0 is designated forwarding
Edge port: no             (default)        port guard : none        (default)
Link type: point-to-point (auto)           bpdu filter: disable     (default)
Boundary : internal                        bpdu guard : disable     (default)
Bpdus sent 561, received 544

Instance Role Sts Cost      Prio.Nbr Vlans mapped
-------- ---- --- --------- -------- -------------------------------
0        Desg FWD 200000    128.15   1-9,11-19,21-29,31-39,41-49,51-59
                                     61-4094
1        Desg FWD 200000    128.15   10,20,30
2        Root FWD 200000    128.15   40,50,60

SW1#show spanning-tree mst interface fastEthernet 0/16

FastEthernet0/16 of MST0 is root forwarding
Edge port: no             (default)        port guard : none        (default)
Link type: point-to-point (auto)           bpdu filter: disable     (default)
Boundary : internal                        bpdu guard : disable     (default)
Bpdus sent 550, received 1099

Instance Role Sts Cost      Prio.Nbr Vlans mapped
-------- ---- --- --------- -------- -------------------------------
0        Root FWD 200000    128.18   1-9,11-19,21-29,31-39,41-49,51-59
                                     61-4094
1        Desg FWD 200000    128.18   10,20,30
2        Altn BLK 200000    128.18   40,50,60

Note a few things here. The cost values are much higher than the default STP costs, and MSTIx is called MSTx (e.g. IST is MST0). Aside from that, note the term “Regional Root”, which will be explained in detail in Part 2. As for this part, you can see that configuring MSTP inside a region is pretty simple. You just need to exercise some caution when filtering and mapping VLANs, but if you plan logical topologies in advance this should not cause any problems.
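Two numbers in the output above can be reproduced with simple arithmetic. The sketch below assumes the long (32-bit) path cost method, where the cost is 20,000,000 divided by the link speed in Mbit/s, and the extended system ID scheme, where the instance number is added to the configured priority. The function names are made up for this example.

```python
def long_path_cost(speed_mbps: int) -> int:
    """Long path cost: 20,000,000 divided by the link speed in Mbit/s."""
    return 20_000_000 // speed_mbps


def bridge_priority(configured_priority: int, instance: int) -> int:
    """Priority shown per instance: configured value plus the instance number (sysid)."""
    return configured_priority + instance


print(long_path_cost(100))        # FastEthernet link cost
print(bridge_priority(8192, 1))   # SW1 in MST1
print(bridge_priority(32768, 2))  # SW1 in MST2
```

These match the 200000 port cost on the FastEthernet trunks and the 8193 and 32770 bridge priorities that SW1 reports for MST1 and MST2.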

About Petr Lapukhov, 4xCCIE/CCDE:

Petr Lapukhov's career in IT began in 1988 with a focus on computer programming, and progressed into networking with his first exposure to Novell NetWare in 1991. Initially involved with Kazan State University's campus network support and UNIX system administration, he went on to become a networking consultant, taking part in many network deployment projects. Petr currently has over 12 years of experience working in the Cisco networking field, and is the only person in the world to have obtained four CCIEs in under two years, passing each on his first attempt. Petr is an exceptional case in that he has been working with all of the technologies covered in his four CCIE tracks (R&S, Security, SP, and Voice) on a daily basis for many years. When not actively teaching classes, he is developing self-paced products, studying for the CCDE Practical & the CCIE Storage Lab Exam, and completing his PhD in Applied Mathematics.


56 Responses to “MSTP Tutorial Part I: Inside a Region”

 
  1. Rack009 says:

    Are you Einstein in disguise?

    An amazing article full of purpose and hard hitting facts. Arise Sir Petr.

  2. Issa says:

    I don’t have enough thanks for your great article. I beleive it is the time now to start publishing your own books

    Regards

  3. Steve says:

    Thank you very much =)

  4. apep says:

    Hello Petr,

    I think it could be a typo:

    “As for IST, let’s make SW1 the root switch for it (though it’s not recommended to assign root roles to access switches).”

    However SW3 is configured as a root for IST and this status is indicated properly in the outputs then, but your articles are always outstanding :)

  5. To: apep
    Thanks for noting the typo! Fixed that :) Per the diagram and the show command output, SW3 is the root for IST (MSTI0).

  6. Ferret999 says:

    Petr what a brilliant article, I cannot wait for part 2.

  7. PD says:

    This explanation was awesome… Thanks a lot.. can we have the part two’s explanation.

    Pd..

  8. asap says:

    Might be another small typo:
    “Of course, other instances also exit, and they are called MSTIs – multiple spanning tree instances.”

    Should I read “exit” as “exist” ..

  9. asap says:

    Hello Petr,

    Does each MSTI have to include all the switches within the region? Can a port be in a MSTI but not belong to any VLANs? thanks a lot.

  10. valereo says:

    Why in the output of “SW1#show spanning-tree mst” there is a ‘path cost 0′ ??

  11. shams shah says:

    Hello,Finally i reached on ur page while searching a for a small question but intresting that when STP (spanning tree protocol) and UTP techniques started ???
    please write me as earlly as possible if any body knows that..i shal be veryy thankfull to you all

  12. Reinier Hamstra says:

    Hi Petr,

    First off, nice article! Secondly a question that I have not been able to find any real information but which relates to a remark you made in this article, nl:

    “Just as a side note, MSTP does not use MaxAge timer to age out old information, like RSTP/STP do”

    I know how STP uses Max-age, but am at a bit of a loss when it comes to the Max-age use within RSTP, since it uses an entirely different mechanism for convergence. I am assuming this has something to do with connectivity scenarios involving non-RSTP connected bridges? A pointer in the right direction would be very much appreciated!

    Thanks,
    Reinier

  13. NTllect says:

    Petr, excellent series of articles about MSTP! Thank you.

    Just one question: you wrote that “Just as a side note, MSTP does not use MaxAge timer to age out old information, like RSTP/STP do. Instead of this, IST BDPUs has special field called MaxHops.
    I suspect that default hop-count is equal to 20, like 20 seconds, right? And how can we change this value? Thanks.

  15. To:NTllect

    Yes, the starting hop count is 20. You can change it on the CIST Regional Root bridge, where it plays a role equivalent to the MaxAge used by RSTP/STP.

    However, the hop count is used slightly differently: a switch simply rejects any root bridge information once it sees the hop count drop to zero.

    The use of MaxAge is more complicated. Along with Message Age, it dictates how long the receiving port should wait for updated STP information. MaxAge is still used by MSTP on the boundary with STP/RSTP domains.

    However, RSTP uses MaxAge and Message Age simply as hop counts, without relying on the information timeout procedure (except in the cases where RSTP emulates STP).

  16. Ricardo Varela says:

    Petr,

    This is an excellent explanation, thanks a lot for sharing this with us. :)

    ..:: rick ::..

  17. Vasanth says:

    Nice Article, Thanks a lot for writing in simple understandable terms with scenarios.
    Keep going …

  18. Leon says:

    Great thanks from Leon, Guangzhou, China!

  19. Debashis says:

    Nice Article. Thanks.

    Debashis, Bangladesh

  20. Jaxon says:

    Very nice explanation – thanks for sharing.

  21. shweta m d says:

    Thanks a lot!!!
    A very valuable info, i mus say:)

  22. Arun Saha says:

    Very good tutorial. The best MST tutorial I have been able to find on the net.

    2nd part: http://blog.internetworkexpert.com/2008/09/24/mstp-tutorial-part-ii-outside-a-region/

  23. Kamran says:

    HI Petr,

    awesome stuff. Few questions for you:
    1) is the region name and revision mandatory to be defined? what if the task doesn’t specify?
    2) can you explain in a bit more detail how hop-count replaces maxage timer?

    thanks
    Kamram

  24. vinaya surya says:

    Very good article on MSTP . Thank you .

  25. John Harr says:

    A fantastic article, well done Petr!!

  26. reddy says:

    i am not able to understand the region based mstp

  27. reddy says:

    right now i am working on mstp if any material on region based mstp is their please send me

  28. Sunny says:

    Peter awesome work , after watching the open lecture series and read your article like 3 times lol now MSTP is making sense too me , very very nice explantion please tell us part 2 is released or will be release if yes then please tell us

  29. shivlu jain says:

    Really a good and self explanatory document. Thanks

  30. Mandy Pagsanjan says:

    Absolutely Great…very informative…
    Just wondering if you can explain how it will works with a five switches in a ring ropology would be great..Thanks.

  31. Naidu says:

    Excellent work & thanks for hardworking :)

  32. Sorin says:

    What happens in the configuration example you also manually prune the vlan on the trunk from SW3 ?
    If the vlans of an MSTI are not mapped to a trunk will that instance be active on that port ???

  33. Gheb says:

    Excellent work !!

    Could you please be so kind to make a tutorial including basic steps on how to make tutorials like yours :) ) ( really , lot’s of “dudes” like me want to share information but they think they do not have teaching experience)

  35. ambuj says:

    fairly decent and concise explanation .

  36. Matti Pentti says:

    Great article, easy to follow and understand.
    Thank you!

    One thing.. “s/BDPU/BPDU” ?
    http://www.googlefight.com/index.php?lang=en_GB&word1=BDPU&word2=BPDU
    :)

  37. Brilliant. So far, the absolute best explaination of MSTP I’ve seen. Great work!

  38. sumeet singh says:

    excellent its very helpful.

    thanx

  39. Nguyen Minh HIeu says:

    Hi Petr,

    I have a question about MSTP port cost and port priority. Which has a higher priority for setting a port in a forwarding state. If we have 2 ports, one has lower port cost and lower priority number than the other port, so which port will be set in a forwarding state?

    Thank for your help.

  40. Amr says:

    Excellent Work Very Helpful thanks so much

  41. Thabet says:

    WOW That is awesome !! . Wish you Success

  43. Ale says:

    I have a question about mstp/rstp. To enable the rstp protocol I need to configure mstp (thais has embedded the code for rstp). Now I create different instances inside one region each one with a set of vlans. I know that if I wish, in future, add a new vlan inside one of the instance there will be a topological recalculation and my traffic will be affected.
    So I decide to use mstp and the rstp logic abdicating to use different instances (i.e. no different stp path)
    Now I have 2 choices:
    1 I do not consider to configure different instances and I will use only instance 0 where there are all the vlans by default (In this case I don’ t use the distribution of the paths)

    2 (similar to 1) create only one instance (for sample instance nr 1) and put all vlan inside it.

    I think these are all possible to do but I don’ t know the scalability of these choices.
    For sample: I have only instance 0 with all vlan (1-4094) inside it and really i have configured 300 vlans.

    Regards

    Ale

  44. Nicolas Gachet says:

    Hello,
    thanks for this article, very usefull. could you confirm that when a STP recalculation is done on an instance only Mac adress from the vlan instance are flushed ?
    thanks in advance

  45. Mohit Bhalla says:

    very nice post sir, totally awesome … :D
    so happy after understanding it :D

  46. Agustin says:

    If I have:
    a region named region1 (all switches same region name)
    all vlans mapped to instance 0 (al switches same mapping)
    different vlans created in every switch.

    Will they be in the same region?

    Thanks

  47. M.Arsalan says:

    Great Article: i visit many web: But i don’t understand MST: But I really thankful to u: u Absolutely Rock!!!! buddy

  48. Pablo says:

    Great article, specially the example, help me so much. Thank you¡¡¡

  49. Vishesh says:

    Excellent artical

  50. Dan says:

    Thank you very much Peter. Your article helped me a lot in a finding of solution how to redesign my network. Its great to see that somebody publish such articles for free…

  51. Ashu says:

    If 3 switches are already in loop and then enable MSTP is that a valid scenario?
    What should MSTP would do in such scenario?

 
