Sep 20

In this final part of our blog series on QoS with the PIX/ASA, we examine the remaining two tools found on some devices: traffic shaping and traffic policing.

Traffic Shaping

Traffic shaping on the security appliance allows the device to limit the flow of traffic. This mechanism buffers traffic that exceeds the "speed limit" and attempts to send it later. On a 7.x security device, traffic shaping must be applied to all outgoing traffic on a physical interface; shaping cannot be configured for only certain types of traffic. The shaped traffic includes traffic passing through the device, as well as traffic sourced from the device.

In order to configure traffic shaping, use the class-default class and apply the shape command in Policy Map Class Configuration mode. The class-default class is created automatically for you by the system. It is a simple match-any class map that allows you to quickly match all traffic. Here is a sample configuration:

pixfirewall(config)# policy-map PM-SHAPER
pixfirewall(config-pmap)# class class-default
pixfirewall(config-pmap-c)# shape average 2000000 16000
pixfirewall(config)# service-policy PM-SHAPER interface outside

Verification is simple. You can run the following to confirm your configuration:

pixfirewall(config)# show run policy-map
!
policy-map PM-SHAPER
 class class-default
  shape average 2000000 16000
!

Another excellent command that confirms the effectiveness of the policy is:

pixfirewall(config)# show service-policy shape
Interface outside:
  Service-policy: PM-SHAPER
    Class-map: class-default
      shape (average) cir 2000000, bc 16000, be 16000
      Queueing
      queue limit 64 packets
      (queue depth/total drops/no-buffer drops) 0/0/0
      (pkts output/bytes output) 0/0

Traffic Policing

With a policing configuration, traffic that exceeds the "speed limit" on the interface is dropped. Unlike traffic shaping configurations on the appliance, with policing you can specify the class of traffic that you want the policing to affect. Let's examine a traffic policing configuration. In this configuration, we will limit the amount of Web traffic that is permitted inbound on an interface.

pixfirewall(config)# access-list AL-WEB-TRAFFIC permit tcp host 192.168.1.110 eq www any
pixfirewall(config)# class-map CM-POLICE-WEB
pixfirewall(config-cmap)# match access-list AL-WEB-TRAFFIC
pixfirewall(config-cmap)# policy-map PM-POLICE-WEB
pixfirewall(config-pmap)# class CM-POLICE-WEB
pixfirewall(config-pmap-c)# police input 1000000 conform-action transmit exceed-action drop
pixfirewall(config)# service-policy PM-POLICE-WEB interface outside

Notice that we can verify with commands similar to those we used for shaping!

pixfirewall(config)# show run policy-map
!
policy-map PM-POLICE-WEB
 class CM-POLICE-WEB
  police input 1000000
!
pixfirewall(config)# show service-policy police
Interface outside:
  Service-policy: PM-POLICE-WEB
    Class-map: CM-POLICE-WEB
      Input police Interface outside:
        cir 1000000 bps, bc 31250 bytes
        conformed 0 packets, 0 bytes; actions:  transmit
        exceeded 0 packets, 0 bytes; actions:  drop
        conformed 0 bps, exceed 0 bps

I hope that you enjoyed this four-part series on QoS on the PIX/ASA! Please look for other posts about complex configurations on the security appliances very soon. I have already been flooded with recommendations!

Happy Studies!

Sep 12

This blog post focuses on QoS on the PIX/ASA and is based on 7.2 code, to be consistent with the CCIE Security Lab Exam as of the date of this post. I will write a later post about the new features in 8.x code for all of you non-exam-biased readers :-)

NOTE: We have already seen, thanks to our readers, that some of these features are very model/license dependent! For example, we have yet to find an ASA that allows traffic shaping.

One of the first things that you discover about QoS for the PIX/ASA when you check the documentation is that none of the QoS tools these devices support are available in multiple context mode. This jumped out at me as a bit strange and I just had to see for myself. So I went to a PIX device, switched to multiple mode, and then searched for the priority-queue global configuration command. Notice that, sure enough, the command was not available in the CUSTA context or the system context.

pixfirewall# configure terminal
pixfirewall(config)# mode multiple
WARNING: This command will change the behavior of the device
WARNING: This command will initiate a Reboot
Proceed with change mode? [confirm]
Convert the system configuration? [confirm]
pixfirewall> enable
pixfirewall# show mode
Security context mode: multiple
pixfirewall# configure terminal        
pixfirewall(config)# context CUSTA
Creating context 'CUSTA'... Done. (2)
pixfirewall(config-ctx)# context CUSTA
pixfirewall(config-ctx)# config-url flash:/custa.cfg
pixfirewall(config-ctx)# allocate-interface e2 
pixfirewall(config-ctx)# changeto context CUSTA
pixfirewall/CUSTA(config)# pri?     
configure mode commands/options:
privilege
pixfirewall/CUSTA# changeto context system
pixfirewall# conf t
pixfirewall(config)# pr?
configure mode commands/options:
privilege 

OK, so we have no QoS capabilities in multiple context mode. :-| What QoS capabilities do we possess on the PIX/ASA when operating in single context mode? Here they are:

  • Policing – you will be able to set a “speed limit” for traffic on the PIX/ASA. The policer will discard any packets trying to exceed this rate. I always like to think of the Soup Guy on Seinfeld with this one - "NO BANDWIDTH FOR YOU!" 
  • Shaping – again, this tool allows you to set a speed limit, but it is “kinder and gentler”. This tool will attempt to buffer traffic and send it later should the traffic exceed the shaped rate.
  • Priority Queuing – for traffic like VoIP that really hates delay and variable delay (jitter), the PIX/ASA does support priority queuing of that traffic. The documentation refers to this as Low Latency Queuing (LLQ).

Now before we get too excited about these tools, we must understand that we are going to face some pretty big limitations in their usage compared to shaping, policing, and LLQ on a Cisco router. We will detail these limitations in future blogs on the specific tools, but here is an example. We might get very excited when we see LLQ in relation to the PIX/ASA, but it is certainly not the LLQ that we are accustomed to on a router. On a router, LLQ is really Class-Based Weighted Fair Queuing (CBWFQ) with the addition of a strict Priority Queue (PQ). On the PIX/ASA, we are just not going to have that type of granular control over many traffic forms. In fact, with the standard priority queuing approach on the PIX/ASA, there is a single LLQ for your priority traffic and all other traffic falls into a best effort queue.
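
To make this concrete, here is a minimal sketch of the standard priority queuing configuration on the appliance (the interface, class name, and DSCP match are assumptions for illustration, not taken from any particular deployment):

pixfirewall(config)# priority-queue outside
pixfirewall(config)# class-map CM-VOICE
pixfirewall(config-cmap)# match dscp ef
pixfirewall(config)# policy-map PM-VOICE
pixfirewall(config-pmap)# class CM-VOICE
pixfirewall(config-pmap-c)# priority
pixfirewall(config)# service-policy PM-VOICE interface outside

Traffic matching CM-VOICE is serviced from the single low-latency queue; everything else falls into the best effort queue.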

If you have been around QoS for a while, you are going to be very excited about how we set these mechanisms up on the security appliance. We are going to use the Modular Quality of Service Command Line Interface (MQC) approach! The MQC was invented for CBWFQ on the routers, but now we are seeing it everywhere. In fact, on the security appliance it is termed the Modular Policy Framework. This is because it not only handles QoS configurations, but also traffic inspections (including deep packet inspections), and can be used to configure the Intrusion Prevention and Content Management Security Service Modules. Boy, the ole’ MQC sure has come a long way.

While you might be frustrated with some of the limitations of the individual tools, at least there are a couple of combinations that feature the tools working together. Specifically, you can:

  • Use standard priority queuing (for example, for voice) and then police all of the other traffic.
  • Use traffic shaping for all traffic in conjunction with hierarchical priority queuing for a subset of traffic (see the sketch after this list). Again, in later blogs we will educate you more fully on each tool.
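
As a sketch of the second combination (class names and the shape rate are assumed for illustration), the priority policy is nested inside the shaping policy:

pixfirewall(config)# policy-map PM-VOICE-PRIORITY
pixfirewall(config-pmap)# class CM-VOICE
pixfirewall(config-pmap-c)# priority
pixfirewall(config)# policy-map PM-SHAPE-ALL
pixfirewall(config-pmap)# class class-default
pixfirewall(config-pmap-c)# shape average 2000000
pixfirewall(config-pmap-c)# service-policy PM-VOICE-PRIORITY
pixfirewall(config)# service-policy PM-SHAPE-ALL interface outside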

Thanks for reading and I hope you are looking forward to future blog entries on QoS with the ASA/PIX.

Sep 11

People are often confused by the per-VLAN classification, policing and marking features in the Catalyst 3550 and 3560 models. The biggest problem is the lack of comprehensive examples on the Documentation CD. Let's quickly review and compare the traffic policing features available on both platforms. The material below is a condensed excerpt of several Catalyst QoS topics covered in the "QoS" section of our IEWB VOL1 V5. You will find more in-depth explanations and a large number of simulation-based verifications in the upcoming section of the workbook.

Per-Port QoS ACL based classification in the 3550 and 3560

This feature works the same in both switches. Even though it is common to see this method called "QoS ACL" classification, it actually uses MQC syntax with class-maps and policy-maps. The term "QoS ACL" comes from the Catalyst OS PFC-based QoS model, where you apply QoS ACLs directly to ports. In IOS-based switches, you use the familiar MQC syntax.

Look at the following example. Here we mark IPX packets (classified based on SNAP Protocol ID) with DSCP EF, which the switch translates to CoS 5 via the DSCP-to-CoS table; ICMP packets with DSCP CS3 (24); and Telnet packets with IP Precedence 2. All remaining traffic, be it IP or non-IP, matches class-default, and the system trusts the CoS value in the frame header. If a packet matches class-default but has no CoS value in the header, the system uses the default value from the interface, specified with mls qos cos 1.

Example 1:

!
! Access-Lists. IPX SNAP Protocol ID is 0x8137
!
mac access-list extended IPX
permit any any 0x8137 0x0
!
ip access-list extended ICMP
permit icmp any any
!
ip access-list extended TELNET
permit tcp any any eq telnet
permit tcp any eq telnet any

!
! Class Maps
!
class-map match-all TELNET
match access-group name TELNET
!
class-map match-all ICMP
match access-group name ICMP
!
class-map match-all IPX
match access-group name IPX

!
! Note that we set DSCP under non-IP traffic class. The switch computes
! the respective CoS value using DSCP-to-CoS table.
!
policy-map CLASSIFY
class IPX
set dscp ef
class ICMP
set dscp cs3
class TELNET
set precedence 2
class class-default
trust cos

!
! Apply the policy
!
interface FastEthernet 0/6
mls qos cos 1
service-policy input CLASSIFY

Note the use of the set dscp command on non-IP packets. You cannot set a CoS value directly, with one special exception on the 3550 switches. Therefore, you should use the set dscp command for both IP and non-IP packets if you want to change the CoS field.

You may apply the same policy in the 3550 models, but with one special limitation: class-default should be split into two user-defined classes matching IP and non-IP traffic. I have never seen class-default working properly in the 3550 models, so if you have a working example with verifications, please let me know :). Here is a workaround that shows how to police both IP and non-IP traffic simultaneously in the 3550 model:

Example 2:

!
! All IP traffic
!
ip access-list standard IP_ANY
permit any
!
! All non-IP traffic
!
mac access-list extended MAC_ANY
permit any any 0x0 0xFFFF
!
class-map IP_ANY
match access-group name IP_ANY
!
class-map MAC_ANY
match access-group name MAC_ANY
!
mls qos aggregate-policer AGG256 256000 32000 exceed-action drop
!
policy-map AGGREGATE_LIMIT
class IP_ANY
police aggregate AGG256
class MAC_ANY
police aggregate AGG256

Per-Port Per-VLAN Classification in the 3550 switches

This feature is unique to the 3550 switches. You may configure traffic classification that takes into account the VLAN of the input traffic. Unfortunately, you can only apply it at the port level. There is no option to classify the traffic entering a VLAN on all ports using a single policy, like there is in the 3560 model. Note that you can apply a per-port per-VLAN policy on a trunk or an access port; of course, in the latter case the access port must be in a VLAN matched by the policy-map. Here is an example of per-VLAN classification and marking in the 3550 models. It sets different DSCP and CoS values for IPX and ICMP packets coming from VLAN 146 only. Other VLANs on the trunk port remain unaffected.

Example 3:

mls qos
!
! Access-Lists
!
ip access-list extended ICMP
permit icmp any any
!
mac access-list extended IPX
permit any any 0x8137 0x0

!
! Second-level class-maps
!
class-map ICMP
match access-group name ICMP
!
class-map IPX
match access-group name IPX

!
! First-level class-maps
!
class-map VLAN_146_ICMP
match vlan 146
match class-map ICMP
!
class-map VLAN_146_IPX
match vlan 146
match class-map IPX

!
! Policy-maps. Note that we set DSCP for IPX packets.
! In fact, this is the internal DSCP value, not the IP DSCP.
! You cannot set CoS directly – only when trusting DSCP.
! DSCP 16 translates to CoS 2 by virtue of the
! default DSCP-to-CoS table.
!
policy-map PER_PORT_PER_VLAN
class VLAN_146_ICMP
set dscp CS3
class VLAN_146_IPX
set dscp 16

!
! Apply the policy-map
!
interface FastEthernet 0/4
service-policy input PER_PORT_PER_VLAN

Note the special syntax used by this feature. All classes in the policy-map must use a match vlan statement on their first line and match another class on their second line. The second-level class-map should use only one statement, e.g. match access-group name. The access-group itself may have a variable number of entries.

Per-Port Per-VLAN Policing in the 3550

The following example is just a demonstration of applying policing inside a VLAN-based policy-map in the 3550. This feature is a regular policer, just the same as you would use in a per-port policy-map. You can either drop packets or re-mark the internal DSCP. With the 3550 it is even possible to use an aggregate policer shared between several VLANs on a single port.

The example re-marks IP traffic exceeding the rate of 128Kbps with a DSCP value of CS1. Prior to metering, the policy map marks the traffic with AF11. The policy map marks non-IP traffic with a CoS value of 3 (by setting the internal DSCP to CS3) and drops packets that exceed 256Kbps. All classification and policing procedures apply only to packets in VLAN 146.

Example 4:

mls qos
mls qos map policed-dscp 10 to 8

!
! Access-Lists
!
ip access-list extended IP_ANY
permit IP any any
!
mac access-list extended NON_IP_ANY
permit any any 0x0 0xFFFF

!
! Second-level class-maps
!
class-map IP_ANY
match access-group name IP_ANY
!
class-map NON_IP_ANY
match access-group name NON_IP_ANY

!
! First-level class-maps
!
class-map VLAN_146_IP
match vlan 146
match class-map IP_ANY
!
class-map VLAN_146_NON_IP
match vlan 146
match class-map NON_IP_ANY

!
! Burst values are not critical here, so we use arbitrary ones
! The policy map matches the first-level class-maps
!
policy-map POLICE_PER_PORT_PER_VLAN
class VLAN_146_IP
set dscp af11
police 128000 16000 exceed-action policed-dscp-transmit
class VLAN_146_NON_IP
set dscp cs3
police 256000 32000

!
! Apply the policy-map
!
interface FastEthernet 0/4
service-policy input POLICE_PER_PORT_PER_VLAN

Per-VLAN Classification in the 3560

Switch-wide per-VLAN classification is a feature specific to the 3560. You can apply a service-policy to an SVI, listing class-maps and the respective marking actions in the policy-map. Note that you cannot use the police command in a service-policy applied to an SVI. All packets in the respective VLAN entering the switch on ports enabled for VLAN-based QoS are subject to inspection by the service-policy on the SVI. The effect is that the QoS marking policy is now switch-wide on all ports (trunk or access) that have the respective VLAN active (allowed and in STP forwarding state).

The following example sets DSCP EF in TCP packets entering VLAN 146 and CoS 4 in IPX packets transiting the switch across VLAN 146 (DSCP 32 is CS4, which translates to CoS 4 via the default DSCP-to-CoS table). Note that all ports carrying VLAN 146 that are configured for VLAN-based QoS will be affected by this configuration.

Example 5:

mls qos
!
! Enable VLAN-based QoS on interfaces
!
interface FastEthernet 0/1
mls qos vlan-based

!
! Enable VLAN-based QoS on the trunks
!
interface range FastEthernet 0/13 - 21
mls qos vlan-based

!
! Access-Lists
!
ip access-list extended TCP
permit tcp any any
!
mac access-list extended IPX
permit any any 0x8137 0x0

!
! First-level class-maps
!
class-map TCP
match access-group name TCP
!
class-map IPX
match access-group name IPX

!
! VLAN-based policy-map
!
policy-map PER_VLAN
class TCP
set dscp ef
class IPX
set dscp 32

!
! Apply the policy-map to SVI
!
interface VLAN 146
service-policy input PER_VLAN

As mentioned above, you cannot use policing in SVI-level policy map. This eliminates the possibility of having policer aggregating traffic rates for all ports in a single VLAN. However, you can still police per-port per-VLAN in the 3560 using the following feature.

Per-Port Per-VLAN Policing in the 3560

This feature uses second-level policy-maps. The second-level policy-map must list class-maps satisfying specific restrictions. The only "match" criterion supported in those class-maps is match input-interface. You can list several interfaces separated by spaces, or use interface ranges, separating the interfaces in a range with a hyphen. The only action supported in the second-level policy-map is the police command. As usual, you can drop exceeding traffic or configure traffic re-marking using the policed-DSCP map. The police action applies individually to every port in the range.

You cannot use aggregate policers in second-level policy-maps. Another restriction: if you apply a second-level (interface-level) policy-map inside class-default, it will have no effect. You need to apply the second-level policy-map inside user-defined classes.

The following example restricts IP traffic rate in VLAN146 on every trunk port to 128Kbps and limits the IP traffic in VLAN146 on the port connected to R6 to 256Kbps.

Example 6:

mls qos map policed-dscp 18 to 8

!
! For 2nd level policy you can only match input interfaces
!
class-map TRUNKS
match input-interface FastEthernet 0/13 - FastEthernet 0/21

!
! The second class-map matches a group of ports or a single port
!
class-map PORT_TO_R6
match input-interface FastEthernet 0/6

!
! IP traffic: ACL and class-map
!
ip access-list extended IP_ANY
permit ip any any
!
class-map IP_ANY
match access-group name IP_ANY

!
! Second-level policy-map may only police, but not mark
!
policy-map INTERFACE_POLICY
class TRUNKS
police 128000 16000 exceed-action policed-dscp-transmit
class PORT_TO_R6
police 256000 32000

!
! 1st level policy-map may only mark, not police
! VLAN aggregate policing is not possible in the 3560
!
policy-map VLAN_POLICY
class IP_ANY
set dscp af21
service-policy INTERFACE_POLICY
!
interface Vlan 146
service-policy input VLAN_POLICY

!
! Enable VLAN-based QoS on the ports
!
interface FastEthernet 0/6
mls qos vlan-based

!
! Enable VLAN-based QoS
!
interface range FastEthernet 0/13 - 21
mls qos vlan-based

This is it for the comparison of the most common policing features on the 3550 and 3560 platforms. Digging deeper into the areas of Catalyst QoS would probably take us far beyond the size of a post that a normal human can read :)

Jul 03

This may seem to be a basic topic, but it looks like many people are still confused by the difference between those two concepts. Let us clear this confusion at once!

Shaping vs Policing

Look at the diagram above. Both router links are clocked at 128Kbps, and the test flow consists of 1000-byte packets sent at a sustained rate of 16 packets per second, effectively saturating the 128Kbps link. Consider what happens when we shape the flow down to 64Kbps. Egress packets are still serialized at 128Kbps - therefore the shaper needs to buffer and delay packets to obtain the target average rate of 64Kbps. The shaper does this by delaying every burst for each Tc interval. In this example, the Bc value (shaper burst) equals the packet size, so effectively the egress link is busy for one 1/16s interval and idle for the next 1/16s interval. The average rate is the total volume (4*1000 bytes) divided by 1/2s (the time to send) and multiplied by 8 (to get bps), yielding 64000bps.

The summary points about shaping are as follows:

I) The shaper sends Bc amount of data every Tc interval at the physical port speed (see the sketch after this list)
II) Since the shaper delays packets, it uses a queue to store them
III) The shaper queue may use different scheduling algorithms, e.g. WFQ, CBWFQ, FIFO
IV) The shaper unifies the traffic flow and introduces delay, which may affect end-to-end QoS characteristics
V) Shaping is generally a WAN technology, used to share a multipoint interface bandwidth and/or compensate for speed differences between sites
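
As a quick sketch of point I in router MQC syntax (interface name assumed; the values match the example above, i.e. a CIR of 64000bps and Bc of 8000 bits - one 1000-byte packet - giving Tc = 8000/64000 = 1/8s):

policy-map SHAPE-64K
 class class-default
  shape average 64000 8000
!
interface Serial0/0
 service-policy output SHAPE-64K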

On the other hand, a policer behaves in a much simpler manner. It achieves the same average traffic rate by dropping packets that would exceed the policed rate. The policer algorithm is simple: remember the last packet arrival time ("T0"), the current credit ("Cr0") and the "PolicerRate" constant (64Kbps in our example). (There is also "Bc" – the burst size – but we will ignore it for a moment). When a new packet of size "S" arrives at a moment of time "T", the policer performs the following:

a) Calculate the accumulated credit: Cr = Cr0 + (T-T0)*PolicerRate (note: Bc ignored here).
b) If (S <= Cr) then Cr0 = Cr – S and the packet is admitted, since we have enough credit
c) Else the packet is denied and the credit remains the same: Cr0 = Cr.
d) Store the last packet arrival time: T0=T

This simple admission procedure allows for very efficient hardware implementations. Look at the above diagram again. For a sustained packet flow, the policer drops every other packet, for it can't accumulate 1000 bytes of credit during 1/16s (the packet arrival interval), since the "PolicerRate" is just 64000bps. Every 1/16s the policer is only able to accumulate 500 bytes of credit, so it takes 1/8 of a second to get enough credit to admit a packet.

Now, for the policer burst size. As we remember, with shaping Bc effectively defines the amount of data sent every Tc. With policing its purpose is different, however. Look at the diagram below:

Policing Burst

The flow is no longer sustained. At one moment, the source pauses and then resumes. Policers were designed to take advantage of such irregular behavior. The long pause allows the policer to accumulate more credit and then use it to accept a "large" packet train at once. However, the credit can't be allowed to grow in an unbounded manner (e.g. 1 hour of pause between packets yielding a very large credit). Therefore, a committed burst size is used by policers as follows:

a) Calculate the new credit: Cr = Cr0 + (T-T0)*PolicerRate
b) If (Cr > Bc) then Cr = Bc
c) If (S <= Cr) then Cr0 = Cr – S and the packet is admitted.
d) Else the packet is denied and Cr0 = Cr.
e) Store the last packet arrival time: T0=T

The Bc constant limits the amount of credit a policer is allowed to accumulate during idle periods. Obviously, you never want to set Bc lower than the network MTU, as this would prohibit full-sized packets from ever passing admission. Note the following interesting relation: Tc=Bc/PolicerRate. This is sometimes called the "averaging interval". By the policer design, if you observe the policed traffic flow for "Tc" amount of time, you will never see the average bitrate go above "PolicerRate". Note that the policer "Tc" has nothing to do with the "shaping" Tc, as they have very different purposes and meanings.

In summary, the key points about policing:

I) The policer uses a simple, credit-based admission model
II) The policer never delays packets and never "absorbs" or smoothes packet bursts
III) Policers are usually used at the edge of a network to control packet admission
IV) Policers can be used in either the ingress or egress direction

The last question – how would one calculate a Bc value for a policer? As you've seen, for a sustained traffic flow it does not matter which Bc you pick – it does not affect the average packet rate. However, in real life, traffic flows are very irregular. If you pick a Bc value that is too small, you may end up dropping too many packets. Obviously, this is bad for protocols like TCP, which consider a dropped packet to be a signal of congestion. Therefore, there exist some general rules of thumb for picking Bc values based on the policer rate. Generally, Bc should be no less than 1.5 seconds' worth of traffic at the policer rate, but you should calculate the optimal value empirically, by running application tests.
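
For instance, applying this rule of thumb to our 64Kbps policer gives Bc = 64000/8 * 1.5 = 12000 bytes, i.e. an averaging interval Tc = 96000/64000 = 1.5s. In router MQC syntax this might look like the following sketch (class and interface names assumed):

policy-map POLICE-64K
 class class-default
  police 64000 12000 conform-action transmit exceed-action drop
!
interface Serial0/0
 service-policy input POLICE-64K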

Further Reading:
Comparing Traffic Policing and Traffic Shaping for Bandwidth Limiting

Jun 26

The goal of this article is to discuss how the following configuration would work in the 3560 series switches:

interface FastEthernet0/13
switchport mode access
load-interval 30
speed 10
srr-queue bandwidth shape 50 0 0 0
srr-queue bandwidth share 33 33 33 1
srr-queue bandwidth limit 20

Before we begin, let’s recap what we know so far about the 3560 egress queuing:

1) When the SRR scheduler is configured in shared mode, the bandwidth allocated to each queue is based on its relative weight. E.g. when configuring "srr-queue bandwidth share 30 20 25 25" we obtain the weight sum 30+20+25+25 = 100 (it could be different, but it's nice to reference 100, as a representation of 100%). The relative weights are therefore "30/100", "20/100", "25/100", "25/100", and you can calculate the effective bandwidth *guaranteed* to a queue by multiplying its weight by the interface bandwidth: e.g. 30/100*100Mbps = 30Mbps for a 100Mbps interface and 30/100*10Mbps = 3Mbps for a 10Mbps interface. Of course, the weights are only taken into consideration when the interface is oversubscribed, i.e. experiences congestion.

2) When configured in shaped mode, the bandwidth restriction (policing) for each queue is based on the inverse absolute weight. E.g. for "srr-queue bandwidth shape 30 0 0 0" we effectively restrict the first queue to "1/30" of the interface bandwidth (which is approximately 3.3Mbps for a 100Mbps interface and approximately 330Kbps for a 10Mbps interface). Setting an SRR shape weight to zero effectively means no shaping is applied. When shaping is enabled for a queue, the SRR scheduler does not use the shared weight corresponding to this queue when calculating relative bandwidth for the shared queues.

3) You can mix shaped and shared settings on the same interface. For example two queues may be configured for shaping and others for sharing:

interface FastEthernet0/13
srr-queue bandwidth share 100 100 40 20
srr-queue bandwidth shape 50 50 0 0

Suppose the interface rate is 100Mbps; then queues 1 and 2 will each get 2Mbps (1/50 of the interface rate), and queues 3 and 4 will share the remaining bandwidth (100-2-2=96Mbps) in the proportion 40:20 = 2:1. Note that queues 1 and 2 are guaranteed and limited to 2Mbps at the same time.

4) The default "shape" and "share" weight settings are "25 0 0 0" and "25 25 25 25" respectively. This means queue 1 is policed down to 4Mbps on a 100Mbps interface by default (400Kbps on a 10Mbps interface) and the remaining bandwidth is equally shared among the other queues (2-4). So take care when you enable "mls qos" on a switch; one common first step is shown below.
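
For example, if you do not want the default queue 1 shaping after enabling mls qos, a sketch of that first step (interface assumed) is to zero out the shape weights:

interface FastEthernet0/13
 srr-queue bandwidth shape 0 0 0 0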

5) When you enable "priority-queue out" on an interface, it turns queue 1 into a priority queue, and the scheduler effectively does not account for this queue's weight in its calculations. Note that the PQ ignores shaped-mode settings as well, and this may starve the other queues.

6) You can apply "aggregate" egress rate-limiting to a port by using the command "srr-queue bandwidth limit xx" at the interface level. Effectively, this command limits the interface sending rate down to xx% of the interface capacity. Note that the range starts at 10%, so if you need rates lower than 10Mbps on a 100Mbps port, consider changing the port speed down to 10Mbps.

How will this setting affect SRR scheduling? Remember that SRR shared weights are relative, and therefore the shared queues will divide the new bandwidth among themselves. However, shaped queue rates are based on absolute weights calculated from the physical interface bandwidth (e.g. 10Mbps or 100Mbps) and are subtracted from the interface's "available" bandwidth. Consider the example below:

interface FastEthernet0/13
switchport mode access
speed 10
srr-queue bandwidth shape 50 0 0 0
srr-queue bandwidth share 20 20 20 20
srr-queue bandwidth limit 20

The interface sending rate is limited to 2Mbps. Queue 1 is shaped to 1/50 of 10Mbps, which is 200Kbps of bandwidth. The remaining bandwidth, 2000-200=1800Kbps, is divided equally among the other queues in the proportion 20:20:20 = 1:1:1. That is, in case of congestion with all queues filled up, queue 1 will get 200Kbps, and queues 2-4 will get 600Kbps each.

Quick Questions and Answers

Q: How do I determine which queue a packet will go to? What if my packet has CoS and DSCP values set at the same time?
A: That depends on what you are trusting at the classification stage. If you trust the CoS value, then the CoS to Output Queue map is used. Likewise, if you trust the DSCP value, then the DSCP to Output Queue map determines the outgoing queue. Use the "show mls qos maps" commands to find out the current mappings.
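
For example, on the 3560:

SW1# show mls qos maps cos-output-q
SW1# show mls qos maps dscp-output-q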

Q: What if I've configured "shared" and "shaped" srr-queue settings for the same queue?
A: The shaped queue settings will override the shared weight. Effectively, the shared weight will also be exempted from the SRR calculations. Note that in shaped mode a queue is still guaranteed its bandwidth, but at the same time is not allowed to send above the limit.

Q: What if priority-queue is enabled on the interface? Can I restrict the PQ sending rate using a "shaped" weight?
A: No, you can't. The priority queue will take all the bandwidth if needed, so take care with traffic admission.

Q: How will a shaped queue compete with shared queues on the same interface?
A: Shared queues share the bandwidth remaining after the shaped queues. At the same time, shaped queues are guaranteed the amount of bandwidth allowed by their absolute weight.

Q: How is SRR shared mode different from the WRR found in the Catalyst 3550?
A: SRR in shared mode essentially behaves similarly to WRR, but is designed to be more efficient. Where WRR would empty a queue up to its credit in a single run, SRR takes a series of quick runs among all the queues, providing a more "fair" share and smoother traffic behavior.

Verification scenario diagram and configs

3560 Egress Queuing

For the lab scenario, we configure R1, R3 and R5 to send traffic down to R2 across two 3560 switches, saturating the link between them. All routers share one subnet, 172.16.0.X/24, where X is the router number. SW1 will assign CoS/IP Precedence values of 1, 3 and 5 respectively to traffic originated by R1, R3 and R5. At the same time, SW1 will apply egress scheduling on its connection to SW2. R2's function is to meter the incoming traffic by matching the IP precedence values in packets. Note that SW2 has mls qos disabled by default.

We will use the default CoS to Output Queue mappings, with CoS 1 mapped to queue 2, CoS 3 mapped to queue 3 and CoS 5 mapped to queue 1. Note that by virtue of the default mapping tables, CoS values 0-7 map to IP Precedence 0-7 (the packets' IP Precedence gets overwritten), so we can match IP Precedence on R2.

SW1#show mls qos maps cos-output-q
Cos-outputq-threshold map:
cos: 0 1 2 3 4 5 6 7
------------------------------------
queue-threshold: 2-1 2-1 3-1 3-1 4-1 1-1 4-1 4-1

SW1's connection to SW2 is set to a 10Mbps port rate, and further limited down to 2Mbps by the use of the "srr-queue bandwidth limit" command. We will apply different scenarios and see how SRR behaves. Here are the configurations for SW1 and R2:

SW1:
interface FastEthernet0/1
switchport mode access
load-interval 30
mls qos cos 1
mls qos trust cos
spanning-tree portfast
!
interface FastEthernet0/3
switchport mode access
load-interval 30
mls qos cos 3
mls qos trust cos
spanning-tree portfast
!
interface FastEthernet0/5
load-interval 30
mls qos cos 5
mls qos trust cos
spanning-tree portfast

R2:
class-map match-all PREC5
match ip precedence 5
class-map match-all PREC1
match ip precedence 1
class-map match-all PREC3
match ip precedence 3
!
!
policy-map TEST
class PREC5
class PREC3
class PREC1
!
access-list 100 deny icmp any any
access-list 100 permit ip any any
!
interface FastEthernet0/0
ip address 172.16.0.2 255.255.255.0
ip access-group 100 in
load-interval 30
duplex auto
speed auto
service-policy input TEST

To simulate traffic flow we execute the following command on R1, R3 and R5:

R1#ping 172.16.0.2 repeat 100000000 size 1500 timeout 0

In the following scenarios the port speed is locked at 10Mbps and the port is additionally limited to 20% of that bandwidth, for an effective sending rate of 2Mbps.

First scenario: Queue 1 (prec 5) is limited to 200Kbps while Queue 2 (prec 1) and Queue 3 (prec 3) share the remaining bandwidth in equal proportions:

SW1:
interface FastEthernet0/13
switchport mode access
load-interval 30
speed 10
srr-queue bandwidth share 33 33 33 1
srr-queue bandwidth shape 50 0 0 0
srr-queue bandwidth limit 20

R2#sh policy-map interface fastEthernet 0/0 | inc bps|Class
Class-map: PREC5 (match-all)
30 second offered rate 199000 bps
Class-map: PREC3 (match-all)
30 second offered rate 886000 bps
Class-map: PREC1 (match-all)
30 second offered rate 887000 bps
Class-map: class-default (match-any)
30 second offered rate 0 bps, drop rate 0 bps

Second Scenario: Queue 1 (prec 5) is configured as priority, and we see that it leaves the other queues starving for bandwidth:

SW1:
interface FastEthernet0/13
switchport mode access
load-interval 30
speed 10
srr-queue bandwidth share 33 33 33 1
srr-queue bandwidth shape 50 0 0 0
srr-queue bandwidth limit 20
priority-queue out

R2#sh policy-map interface fastEthernet 0/0 | inc bps|Class
Class-map: PREC5 (match-all)
30 second offered rate 1943000 bps
Class-map: PREC3 (match-all)
30 second offered rate 11000 bps
Class-map: PREC1 (match-all)
30 second offered rate 15000 bps
Class-map: class-default (match-any)
30 second offered rate 0 bps, drop rate 0 bps

Third Scenario: Queues 1 (prec 5) and 2 (prec 1) are shaped to 200Kbps, while Queue 3 (prec 3) takes all the remaining bandwidth:

SW1:
interface FastEthernet0/13
switchport mode access
load-interval 30
speed 10
srr-queue bandwidth share 33 33 33 1
srr-queue bandwidth shape 50 50 0 0
srr-queue bandwidth limit 20

R2#sh policy-map interface fastEthernet 0/0 | inc bps|Class
Class-map: PREC5 (match-all)
30 second offered rate 203000 bps
Class-map: PREC3 (match-all)
30 second offered rate 1569000 bps
Class-map: PREC1 (match-all)
30 second offered rate 199000 bps
Class-map: class-default (match-any)
30 second offered rate 0 bps, drop rate 0 bps

Mar 03

The 3560 QoS processing model is tightly coupled with its hardware architecture, borrowed from the 3750 series switches. The most notable feature is the internal switch ring, which is used for switch stacking. Packets entering a 3560/3750 switch are queued and serviced twice: first on ingress, before they are put on the internal ring, and second on the egress port, to which the internal ring delivers them. In short, the process looks as follows:

[Classify/Police/Mark] -> [Ingress Queues] -> [Internal Ring] -> [Egress Queues]

For more insight and a detailed overview of the StackWise technology used by the 3750 models, visit the following link:

Cisco StackWise Technology White Paper

Next, it should be noted that the 3560 model is capable of recognizing and processing IPv6 packets natively – this feature affects some classification options. Another big difference is the absence of the internal DSCP value and the use of a QoS label for internal packet marking. This allows the 3560 switches to provide different classes of service to CoS-marked and DSCP-marked packets, by allowing them to be mapped to different queues/thresholds etc. Many other concepts and commands are different as well, as is some nomenclature (e.g. the number of the priority queue). We will try to summarize and analyze the differences in the following paragraphs. The biggest obstacle is the absence of any good information source on 3750/3560 QoS besides the Cisco Documentation site, which has really poor documents regarding both models.

1. Queue Scheduler: SRR

One significant change is the replacement of the WRR scheduler with SRR, where the latter stands for either Shared Round Robin or Shaped Round Robin – two different modes of operation for the scheduler. As we remember, what WRR does is simply de-queue a number of packets proportional to the weight assigned to a queue, walking through the queues in round-robin fashion. SRR performs similarly in shared mode: each output queue has a weight value assigned, and is serviced in proportion to the assigned weight when the interface is congested.

The implementation details are not publicly documented by Cisco; however, so far we know that Shared Round Robin tries to behave more "fairly" than WRR does. While on every scheduler round WRR empties each queue up to the maximum number of allowed packets before switching to the next queue, SRR performs a series of quick runs each round, deciding whether to de-queue a packet from a particular queue (based on the queue weights). In effect, SRR achieves more "fairness" on a per-round basis, because it does not take the whole allowed amount each time it visits a queue. In the long run, SRR and WRR behave similarly.

The shaped mode of SRR is something not available with WRR at all. In this mode, each queue has a weight that determines the maximum allowed transmission rate for the queue. That is, the interface bandwidth is divided in proportion to the queue weights, and each queue is not allowed to send packets above its "slice" of the common bandwidth. The details of the implementation are not provided, so we can only assume it's some kind of efficient policing strategy. Pay special attention to the fact that SRR shape weight values are *absolute*, not relative. That is, the proportion of interface bandwidth given to a particular queue is 1/weight * {interface speed}. So for a weight of "20" the limit is "1/20*Speed", which equals 5Mbps on a 100Mbps interface. Also, by default, queue 1 has an SRR shaped weight of 1/25, so take care when you turn MLS QoS on.

An interface can be configured for SRR shared and shaped mode at the same time. However, SRR shaped mode always takes precedence over SRR shared weights. Finally, SRR provides support for a priority queue, but this queue is not subject to the SRR shaping limits either. The weight assigned to the priority queue is simply ignored in the SRR calculations. Note that unlike the 3550, on the 3560 the egress priority queue has queue-id 1, not 4. Here is an example:

!
! By setting the shape weight to zero we effectively disable shaping for the particular queue
!
interface gigabitethernet0/1
srr-queue bandwidth shape 10 0 0 0
srr-queue bandwidth share 10 10 60 20
!
! As the expedite queue (queue-id 1) is enabled, its weight is no longer honored by the SRR scheduler
!
priority-queue out

Another interesting egress feature is the port rate-limit. SRR can be configured to limit the total port bandwidth to a percentage of the physical interface speed – from 10% to 90%. Don't configure this feature if there is no need to limit the overall port bandwidth.

!
! Limit the available egress bandwidth to 80% of interface speed
!
interface gigabitethernet0/1
srr-queue bandwidth limit 80

Note that the resulting port rate depends on the physical speed of the port - 10/100/1000Mbps.

2. Egress Queues

The 3550 has four egress queues, identified by numbers 1-4, available on every interface. Queue number 4 could be configured as expedite. On the 3560 side, we have the same four egress queues, but this time queue-id 1 is the one that can be configured for expedite service.

With the 3550, the only "class-to-queue" mapping available is the CoS to queue-id table, configured on a per-port basis. The globally configurable DSCP to CoS mapping table is used to map an internal DSCP value to the equivalent CoS. In the 3560 model, DSCP and CoS values are mapped to queue-ids separately. That means IP and non-IP traffic can be mapped/serviced separately. What if an IP packet comes with both DSCP and CoS values set? Then the switch uses the marking used for classification (e.g. CoS if trust cos or set cos was used) to assign the packet to a queue/threshold.

The 3550 supports WTD and WRED as queue drop strategies (the latter option available on Gigabit ports only). The 3560 model supports WTD as the only drop strategy, allowing for three per-queue drop thresholds. Only two of the thresholds are configurable – called explicit drop thresholds – and the third one is fixed to mark the full-queue state (the implicit threshold).

Finally, the mentioned mappings are configured for queue-id and drop threshold simultaneously, in global configuration mode – unlike the 3550, where you configured CoS to Queue-ID and DSCP to Drop-Threshold mappings separately (and on a per-interface basis).

!
! CoS values are mapped to 4 queues. Remember queue-id 1 could be set as expedite
!

!
! The next entry maps CoS value 5 to queue 1 and threshold 3 (100%)
!
mls qos srr-queue output cos-map queue 1 threshold 3 5
!
! VoIP signaling and network management traffic go to queue 2
!
mls qos srr-queue output cos-map queue 2 threshold 3 3 6 7
mls qos srr-queue output cos-map queue 3 threshold 3 2 4
mls qos srr-queue output cos-map queue 4 threshold 2 1
mls qos srr-queue output cos-map queue 4 threshold 3 0

!
! DSCP to queue/threshold mappings
!
mls qos srr-queue output dscp-map queue 1 threshold 3 40 41 42 43 44 45 46 47

mls qos srr-queue output dscp-map queue 2 threshold 3 24 25 26 27 28 29 30 31
mls qos srr-queue output dscp-map queue 2 threshold 3 48 49 50 51 52 53 54 55
mls qos srr-queue output dscp-map queue 2 threshold 3 56 57 58 59 60 61 62 63

mls qos srr-queue output dscp-map queue 3 threshold 3 16 17 18 19 20 21 22 23
mls qos srr-queue output dscp-map queue 3 threshold 3 32 33 34 35 36 37 38 39

!
! DSCP 8 is CS1 – Scavenger class, mapped to the first threshold of the last queue
!
mls qos srr-queue output dscp-map queue 4 threshold 1 8
mls qos srr-queue output dscp-map queue 4 threshold 2 9 10 11 12 13 14 15
mls qos srr-queue output dscp-map queue 4 threshold 3 0 1 2 3 4 5 6 7

Next, we need to know how to configure the egress queue buffer space along with the threshold settings. With the 3550 we had two options: first, globally configurable buffer levels for FastEthernet ports, assigned per interface; second, one shared buffer pool for each of the gigabit ports, with proportions (weights) configured at the per-interface level too. With the 3560, things have changed. Firstly, all ports are now symmetrical in their configuration. Secondly, the concept of a queue-set has been introduced. A queue-set defines a buffer space partitioning scheme as well as threshold levels for each of the four queues. Only two queue-sets are available in the system, and they are configured globally. After a queue-set has been defined/redefined, it can be applied to an interface. Of course, default queue-set configurations exist as well.

As usual, queue thresholds are defined as a percentage of the allocated queue memory (buffer space). However, the 3560 switch introduces another buffer pooling model. There are two buffer pools – reserved and common. Each interface has some buffer space allocated from the reserved pool. This reserved buffer space, allocated to an interface, can be partitioned between the egress interface queues by assigning weight values (this resembles the gigabit interface buffer partitioning on the 3550):

!
! Queue-sets have different reserved pool partitioning schemes
! Each of four queues is given a weight value to allocate some reserved buffer space
!
mls qos queue-set output 1 buffers 10 10 26 54
mls qos queue-set output 2 buffers 16 6 17 61
!
interface gigabitEthernet 0/1
queue-set 1

What about the common buffer pool? Let’s take a look at the threshold setting command first:

mls qos queue-set output qset-id threshold queue-id drop-threshold1 drop-threshold2 reserved-threshold maximum-threshold

We see two explicitly configured thresholds, 1 and 2 – just as with the 3550. However, there is one special threshold called the reserved-threshold. It specifies how much of the reserved buffer space is actually allocated to a queue – i.e. how much of the reserved buffers are truly "reserved". As we know, with the 3560 model every queue on every port has some memory allocated from the reserved buffer pool. The reserved-threshold tells how much of this memory to allocate to the queue – from 1 to 100%. The unused amount of the reserved buffer space becomes available to other queues via the common buffer pool mentioned above. The common buffer pool can be used by any queue to borrow buffers above the queue's reserved space. That allows drop-thresholds to be set to values greater than 100%, meaning the queue is allowed to take credit from the common pool to satisfy its needs. The maximum-threshold specifies the "borrowing limit" – how much a queue is allowed to grow into the common pool.

Look at the command "mls qos queue-set output 1 threshold 1 138 138 92 138". It says we shrink the reserved buffer space for queue 1 to 92%, sending the excess space to the common pool. All three drop-thresholds are set to 138% of the queue buffer space (allocated by the buffers command), meaning we allow the queue to borrow from the common pool up to the levels specified. Drop thresholds may be set as large as 400% of the configured queue size.

Now we see that this model is a bit more complicated than the 3550's. We don't know the actual size of the reserved buffer pool, but we are allowed to specify the relative importance of each queue. Additionally, we may give some reserved buffer space back to the common buffer pool to share with the other queues. Here is a detailed example based on AutoQoS settings:

!
! Set thresholds for all four queues in queue-set 1
!
mls qos queue-set output 1 threshold 1 138 138 92 138
mls qos queue-set output 1 threshold 2 138 138 92 400
mls qos queue-set output 1 threshold 3 36 77 100 318
mls qos queue-set output 1 threshold 4 20 50 67 400

!
! Set thresholds for all four queues in queue-set 2
!
mls qos queue-set output 2 threshold 1 149 149 100 149
mls qos queue-set output 2 threshold 2 118 118 100 235
mls qos queue-set output 2 threshold 3 41 68 100 272
mls qos queue-set output 2 threshold 4 42 72 100 242

interface gigabitEthernet 0/1
queue-set 1
!
interface gigabitEthernet 0/2
queue-set 2

And finally, if you are not sure what you're doing, don't play with the thresholds and buffer space partitioning – leave them at their default values, or use the AutoQoS recommendations. So says the DocCD!

3. Ingress Queues

The unique feature of the 3560 switches is the ability to configure two ingress queues on every port. You may configure either of these queues as the expedite queue, with queue 2 being the default. The priority queue is guaranteed access to the internal ring when the ring is congested. SRR serves the ingress queues in shared mode only. All the weights are configured globally, not per-interface as with the egress queues:

!
! Disable the default priority queue and share the ring bandwidth in 25:75 proportion
!
mls qos srr-queue input priority-queue 2 bandwidth 0
mls qos srr-queue input bandwidth 25 75

The UniverCD has a very shallow description of how the SRR algorithm actually serves the two queues when one of them is configured as expedite. So far, it looks like you should configure one queue as expedite, assign the priority bandwidth value to it (from 1 to 40% of the internal ring bandwidth), and also assign the bandwidth values to both of the ingress queues as usual:

mls qos srr-queue input priority-queue 1 bandwidth 10
mls qos srr-queue input bandwidth 50 50

What's that supposed to mean? It seems queue 1 is partially serviced as an expedite queue, up to the limit set by the priority-queue bandwidth. Once this counter is exhausted, the queue is served on par with the non-expedite queue, using the bandwidth weights assigned. With the example above, that means we have 10% of the bandwidth dedicated to priority queue service. As soon as this counter is exhausted (say, on a per-round basis), SRR continues to service both ingress queues using the remaining bandwidth counter (90%), shared in the proportion of the weights assigned (50 & 50) – that means 45% of the ring bandwidth for each of the queues. Overall, it looks like SRR simulates a policer for the priority queue, but rather than dropping the traffic it simply changes the scheduling mode until enough credit is accumulated to start expedite service again. Now go figure how to use that in real life! Too bad Cisco does not give out to the public any internal details on their SRR implementation.

Now, two things remain to consider for the ingress queues: class mappings to the queues, and buffer/threshold settings. The class mapping uses the same syntax as for the egress queues, allowing global mappings of CoS and DSCP to queue-ids and thresholds.

!
! CoS to queue-id/threshold
!
mls qos srr-queue input cos-map queue 1 threshold 3 0
mls qos srr-queue input cos-map queue 1 threshold 2 1
mls qos srr-queue input cos-map queue 2 threshold 1 2
mls qos srr-queue input cos-map queue 2 threshold 2 4 6 7
mls qos srr-queue input cos-map queue 2 threshold 3 3 5

!
! DSCP to queue-id/threshold
!
mls qos srr-queue input dscp-map queue 1 threshold 2 9 10 11 12 13 14 15
mls qos srr-queue input dscp-map queue 1 threshold 3 0 1 2 3 4 5 6 7
mls qos srr-queue input dscp-map queue 1 threshold 3 32
mls qos srr-queue input dscp-map queue 2 threshold 1 16 17 18 19 20 21 22 23
mls qos srr-queue input dscp-map queue 2 threshold 2 33 34 35 36 37 38 39 48
mls qos srr-queue input dscp-map queue 2 threshold 2 49 50 51 52 53 54 55 56
mls qos srr-queue input dscp-map queue 2 threshold 2 57 58 59 60 61 62 63
mls qos srr-queue input dscp-map queue 2 threshold 3 24 25 26 27 28 29 30 31
mls qos srr-queue input dscp-map queue 2 threshold 3 40 41 42 43 44 45 46 47

As seems reasonable, the classification option used (CoS or DSCP) determines the mapping table for the ingress queue.

The buffer partitioning is pretty simple – there is no common pool, and you only specify the relative weight for each queue. Thresholds are also simplified – you configure just two of them, as a percentage of the queue size. The third threshold is implicit, as usual.

!
! Thresholds are set to 8% and 16% for queue 1; 34% and 66% for queue-2
! Buffers are partitioned in 67/33 proportion
!
mls qos srr-queue input threshold 1 8 16
mls qos srr-queue input threshold 2 34 66
mls qos srr-queue input buffers 67 33
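
To verify the resulting ingress queue weights and thresholds, the switch provides a dedicated show command (output omitted here, since its format varies by IOS version):

SW1# show mls qos input-queue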

So far we covered enough for a single post. In the next post we will discuss how classification, policing and marking techniques differ between the 3550 and 3560 models.

Feb 23

QoS features available on Catalyst switch platforms have specific limitations, dictated by the hardware design of modern L3 switches, which is heavily optimized to handle packets at very high rates. Catalyst switch QoS is implemented using TCAM (Ternary Content Addressable Memory) - fast hardware lookup tables - to store all QoS configurations and settings. We start our Catalyst QoS overview with the old Catalyst 3550 model, long available in the CCIE lab.


To begin with, the Catalyst 3550 sees the world as consisting of IP and non-IP (e.g. ARP, IPv6, IPX) packets – the distinction being made based on the Ethertype value in the incoming Ethernet frame. This peculiarity is due to the fact that the 3550 is a Layer 3 switch, designed to route IP packets in addition to regular L2 switching. The feature has some impact on the QoS classification options of the 3550, as we shall see later.

The Catalyst 3550 QoS model is based on the Diff-Serv architecture, where each packet is classified and marked to encode its class value. Packets are given per-hop treatment at every network node, based on their class (marking). In the Diff-Serv architecture, classification and admission control are performed at the network (Diff-Serv domain) edges, while the network core implements the QoS procedures based on packet classes.

The Catalyst 3550 is capable of performing either Diff-Serv boundary device functions (classification, marking & admission control) or Diff-Serv interior device functions (e.g. queueing & scheduling). The Catalyst 3550 understands the following types of marking: 802.1p priority bits (available on trunk ports only) and the IP ToS byte (IP packets only) – the latter can be interpreted as either DSCP (64 values) or IP Precedence (8 values). While IP Precedence and DSCP values are naturally applicable to IP packets, CoS is the only possible "transit" (inter-switch) marking available to non-IP packets (in the sense the 3550 interprets them).

One may ask, why the difference between IP Precedence and DSCP, when the former seems to be a subset of the DSCP class space? Actually, that is true – IP Precedence value x maps numerically to CSx under the DSCP nomenclature. However, the semantics of IPP and DSCP are different. If we interpret a packet under the "Precedence" model (which is the old, military-based QoS model – the Internet began as a military project, remember?) we should treat it differently than if we were interpreting it under the CSx class of the Diff-Serv model. This is why IP Precedence and DSCP markings are usually considered separately.

To finish with marking, keep in mind that an IP packet may come in with both CoS (taken from the dot1q/ISL header) and DSCP/IP Precedence values set. Therefore, a Catalyst switch needs a way to handle both markings at the same time. (We'll discuss this feature in more detail later). (Side note: QoS features on the 3550 switch are enabled with the global mls qos command. As soon as you enter this command, all classification settings are set to their default values, and in effect all packets traversing the switch will have their CoS and DSCP values set to zero)

Now, the QoS flowchart for the 3550 Catalyst switch looks as follows:

Stage 1 (Ingress): Classify, Police & Mark.

The goal of this stage is to mark a packet - to deduce an internal DSCP value for whatever type of packet (IP or non-IP) has arrived. (Note that we may consider the "drop action" a special kind of "marking" too). The internal DSCP is the "unified" marking scheme used inside the switch to perform any QoS-related processing. To deduce the internal DSCP value for non-IP packets (e.g. IPv6), which can only be marked using the CoS field, a special configurable (switch-global) CoS to DSCP translation table is used. This table is applied whenever we use CoS to classify an incoming packet, be it IP or non-IP - more on that later.

For IP packets, the new internal DSCP value replaces the original DSCP field in the packet. For non-IP packets, a new CoS value is calculated from the internal DSCP using a configurable DSCP to CoS table (which is global to the switch). The latter table is also used at the scheduling stage, to determine the egress queue-id.
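
Both translation tables mentioned here are configured globally; a minimal sketch with assumed values (the cos-dscp table lists eight DSCP values for CoS 0-7, and the dscp-cos table maps a list of DSCP values to a CoS):

mls qos map cos-dscp 0 8 16 24 32 40 48 56
mls qos map dscp-cos 46 to 5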

Specifically, to classify non-IP packets a Catalyst switch may:

a) Trust the CoS value from the incoming frame on a trunk link (keep the CoS value encoded in the frame). This option is only applicable to trunk links. (You may consider the voice VLAN dot1p option a rudimentary trunk also).

b) Use the default CoS value assigned to a port. This action is used when no CoS value is present in the incoming packet and the port is configured to trust CoS. You may use this option on either access or trunk links (e.g. for 802.1q native VLAN frames). The default "CoS default" value is zero. You may set it with the mls qos cos interface-level command.

A few words on "CoS override mode". When you configure mls qos cos X override under an interface, all packets entering the interface are assigned this CoS value, irrespective of any other QoS settings (e.g. trust states). Use this command to quickly assign a "precedence" level to all packets entering an interface, as sketched below.
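
A minimal sketch of the override mode (interface and CoS value assumed; the default CoS is set first, and the override keyword then forces it onto all ingress packets):

interface FastEthernet 0/10
 mls qos cos 3
 mls qos cos override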

c) Classify and assign a CoS value based on QoS ACLs. You may configure and apply a QoS ACL (actually, it's just a regular ACL) using an ingress policy-map on the 3550 switch. Packets matching the ACL are then either trusted, assigned a CoS value manually, and/or policed. For non-IP packets you may only use MAC extended access-lists for classification. Note that due to the 3550's "blindness", MAC ACLs classify only non-IP packets – they won't match IP packets!

After the CoS value has been deduced for a non-IP packet, it is translated into an internal DSCP value using the configurable CoS to DSCP table. Here is an example:

!
! CoS -> DSCP mapping table, define 8 DSCP values here (CoS 0..7)
!
mls qos map cos-dscp 16 8 16 24 32 46 48 56

!
! MAC access-list to match ARP (non IP) packets
!
mac access-list extended ARP
permit any any 0x806 0x0
!
class-map ARP
match access-group name ARP

!
! Assign CoS 3 to ARP packets
!
policy-map CLASSIFY
class ARP
set cos 3

!
! Attach service-policy ingress
!
interface FastEthernet 0/4
mls qos trust cos
service-policy input CLASSIFY
mls qos cos 1
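To confirm the resulting trust state and default CoS on the port, something along these lines should work (output abbreviated and approximate):

SW3# show mls qos interface fastEthernet 0/4
FastEthernet0/4
trust state: trust cos
trust mode: trust cos
COS override: dis
default COS: 1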

More classification options are available for IP packets. First, as mentioned above, an IP packet may arrive encapsulated in an Ethernet frame with a dot1q or ISL header. That means we may obtain the current marking either from the CoS or the IP Precedence/DSCP bits. Here is the full list:

a) Trust the CoS value of an incoming packet. The CoS to DSCP translation table is then used to deduce the internal DSCP value. You may optionally configure a special DSCP pass-through mode with the mls qos trust cos pass-through dscp command. When this mode is active, the internal DSCP value is deduced from the CoS field and all QoS operations are based on it; however, the original DSCP value in the packet is not overwritten with the new DSCP deduced from the CoS field. This feature also works in reverse: you may trust the DSCP value of an IP packet (discussed later) and leave the CoS value untouched, passing it through transparently.

b) Use the default CoS value assigned to a port. This option only works if the port is configured to trust CoS and no CoS value is present in the incoming packet.

c) Trust IP Precedence. The switch takes the IP Precedence field from the IP packet header and converts it to an internal DSCP value using the configurable IP Precedence to DSCP mapping table. This is pretty straightforward.

d) Trust the DSCP value. The DSCP value from the IP packet header is retained and used as the internal DSCP. As mentioned above, you may disable the CoS rewrite performed via the DSCP to CoS table by issuing the mls qos trust dscp pass-through cos command.

e) Use QoS ACLs for classification. You may use standard or extended IP access-lists to match incoming packets by applying them with MQC class-maps and policy-maps. Matched packets can be trusted, marked and/or policed. Note that due to TCAM capacity limits you may not be able to fit an arbitrarily large QoS ACL into hardware.

f) IP packets that do not fall under any of the previous classification options are assigned a DSCP value of zero (best-effort traffic).

!
! IP extended ACL to match ICMP traffic
!
ip access-list extended ICMP
permit icmp any any
!
class-map ICMP
match access-group name ICMP

!
! TCP traffic
!
ip access-list extended TCP
permit tcp any any
!
class-map TCP
match access-group name TCP

policy-map CLASSIFY
!
! Classification combined with policing
!
class ICMP
set ip dscp 26
police 64000 8000
class TCP
trust ip-precedence

!
! Attach service-policy ingress
!
interface FastEthernet 0/4
mls qos trust dscp
service-policy input CLASSIFY
mls qos cos 1

Ingress Policing

Catalyst 3550 switches allow you to apply ingress policing to various traffic classes, using DSCP values or IP/MAC ACLs to classify traffic (the latter being the only option available for non-IP packets). Policing can be combined with marking by using the MQC policy-map syntax (the set command). You may apply up to 8 ingress and up to 8 egress policers per port. Note that egress policers support only DSCP-based classification. On GigabitEthernet interfaces the number of ingress policers is larger, but still limited to 128.

A policer on the Catalyst 3550 is defined by its average rate, burst size and exceed action. The larger the burst size, the more tolerant the policer is of momentary traffic “spikes”. No exact formula exists for the optimal burst size – it is usually derived empirically. However, a common recommendation is to configure the burst size to no less than 1.5 seconds’ worth of traffic, i.e. AverageRate/8 × 1.5 bytes, since this suits heavy TCP traffic flows (it avoids excessive drops and TCP slow-start behavior). Because policers are implemented in hardware, you may find that IOS rounds your burst size up to a value better suited to the ASIC-based implementation.
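As a worked example of that recommendation: for a 64 kbps policer, the suggested burst is 64000/8 × 1.5 = 12000 bytes. A hedged sketch reusing the TCP class defined above (the policy-map name is hypothetical):

policy-map POLICE_TCP
 class TCP
  police 64000 12000 exceed-action drop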

Exceed actions include drop and markdown (policed-dscp-transmit). When you apply a markdown action under a policer, a special globally configurable table is used to map an internal DSCP to its policed equivalent. For example, you may want to mark exceeding user traffic down from CS0 (best-effort) to CS1 (Scavenger). The default markdown settings keep DSCP values unchanged.

A policer can be one of two types – individual or aggregate. An individual policer is configured under a class within a policy-map and applies only to a single traffic class. Aggregate policers are defined globally, using a special command syntax, and are shared among multiple classes configured under a policy-map attached to an interface. That is, traffic in all classes configured to share an aggregate policer is policed by the same policer settings. Note that you cannot share an aggregate policer among different ports.

!
! Global markdown configuration
!
mls qos map policed-dscp 0 46 to 1
!
ip access-list extended ICMP_4
permit icmp any host 150.15.4.4
!
ip access-list extended ICMP_44
permit icmp any host 150.15.44.44
!
class-map ICMP_4
match access-group name ICMP_4
!
class-map ICMP_44
match access-group name ICMP_44
!
!
!
mls qos aggregate-policer AGG1 16000 8000 exceed-action policed-dscp-transmit

!
! Set the default DSCP to 0, remark to DSCP 1 on exceed
!
policy-map INGRESS_R3
class ICMP_4
set ip dscp 0
police aggregate AGG1
class ICMP_44
set ip dscp 0
police aggregate AGG1
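To verify the shared policer, a sketch along these lines should work (the exact output format is approximate):

SW3# show mls qos aggregate-policer AGG1
aggregate-policer AGG1 16000 8000 exceed-action policed-dscp-transmit
 Used by policy map INGRESS_R3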

Per-Port Per-VLAN (PPPV) classification

The classification options we have considered so far are “port-wise” – i.e. they apply to all traffic arriving on a port, whether it is a trunk or an access link. However, 3550 switches have a special feature called per-port per-VLAN classification, which allows applying a QoS ACL classification method on a per-VLAN basis to a specific port.

PPPV requires very strict configuration syntax; failing to obey the order of operations results in a non-working, rejected configuration. First, you define a class-map to match traffic “inside” a VLAN. This class-map can be based on an IP/MAC ACL or simply match a range of DSCP values. For example:

class-map VOICE_BEARER
match ip dscp ef

Next, you create a second, “parent” class-map that matches a specific VLAN or VLAN range. The first entry under this class-map must be a match vlan statement. The second (and last) entry should be a match class-map entry that matches the traffic “inside” the VLAN or VLAN range:

class-map match-all VLAN_34_VOICE
match vlan 34
match class-map VOICE_BEARER

You then assign the “parent” classes to a policy-map. Note that every class under this policy-map must contain match vlan as its first entry and match class-map as its second entry.

class-map SCAVENGER
match ip dscp 1
!
class-map VLAN_43_SCAVENGER
match vlan 43
match class-map SCAVENGER

!
! Per VLAN policy-map
!
policy-map PER_VLAN_POLICY
class VLAN_34_VOICE
police 128000 32000 exceed-action policed-dscp-transmit
class VLAN_43_SCAVENGER
police 64000 16000 exceed-action drop

interface FastEthernet 0/3
service-policy input PER_VLAN_POLICY

Stage 2 (Egress): Police, Queue & Schedule

Packets arrive at egress with an internal DSCP value assigned. Note that security ACLs are matched against the original DSCP value, not the one imposed by the classification process – this is because QoS and security ACLs are looked up in the TCAM in parallel. Therefore, a VLAN access-map may block a packet based on its original DSCP value, not the one assigned by QoS ACLs. Now that the packet has arrived at the output port, the 3550 performs the following:

a) Police the traffic stream, based solely on the DSCP value. No other classification option exists for egress policers. Note that you are allowed to mix ingress and egress policers on the same interface.

class-map SCAVENGER
match ip dscp 1
!
policy-map POLICE_SCAVENGER
class SCAVENGER
police 64000 8000
!
interface FastEthernet 0/3
service-policy output POLICE_SCAVENGER

b) Assign the packet to an output queue. The 3550 defines 4 output queues (IDs 1 to 4) for each interface, and packets are assigned to a queue using a CoS to Queue-ID interface-level mapping table. Since packet handling is based on the internal DSCP value, this value must first be mapped to a CoS value using yet another mapping table (a global one this time), the DSCP to CoS mapping table (a default mapping exists, of course). Therefore, assigning a packet to an output queue takes two lookups: DSCP->CoS and CoS->Queue-ID.

!
! DSCP->CoS global mapping table
!
mls qos map dscp-cos 56 to 4
!
interface FastEthernet 0/3
!
! CoS to Queue-ID mappings
!
wrr-queue cos-map 1 0 1
wrr-queue cos-map 2 2
wrr-queue cos-map 3 3 6 7
wrr-queue cos-map 4 5

Next, a check is performed to see whether the selected queue has enough buffer space to accommodate another packet. If no space is available, the packet is dropped. Buffer space is allocated to queues in the following manner:

For FastEthernet ports, you may configure up to 8 global levels – each level is assigned a numeric value representing the number of packets allowed in a queue. You then assign a level to a queue-id on a per-interface basis. This limits per-port flexibility, but is better optimized for hardware handling. The default assignment of levels to queue-ids is 1-4 to 1-4.

mls qos min-reserve 1 170
!
interface FastEthernet 0/3
wrr-queue min-reserve 1 4
wrr-queue min-reserve 2 3
wrr-queue min-reserve 3 2
wrr-queue min-reserve 4 1
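Assuming the buffers keyword behaves as documented, the per-queue level assignment can be checked like this (a sketch; output trimmed and approximate):

SW3# show mls qos interface fastEthernet 0/3 buffers
FastEthernet0/3
Minimum reserve buffer size:
 170 100 100 100 100 100 100 100
Minimum reserve buffer level select:
 4 3 2 1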

For GigabitEthernet interfaces, you may configure the queue sizes directly in interface configuration mode – no global levels need to be referenced. Simply specify how you want to divide the port’s buffer pool among the queues by assigning each queue a relative weight value (not an absolute count of packets in the queue!).

interface GigabitEthernet 0/1
wrr-queue queue-limit 20 20 20 40

The packet drop procedure is not as simple as it may seem. For FastEthernet interfaces you are limited to simple tail-drop, where a packet that does not fit in the queue is dropped. However, the 3550 makes this simple algorithm more flexible – you are allowed to start dropping packets before the queue is actually full. For every queue, it is possible to assign two thresholds, in percent of the queue size. If the current queue depth exceeds a threshold, new packets mapped to that threshold are dropped. Why two thresholds? To introduce yet another mapping table! Yes, you can map internal DSCP values to queue-size thresholds – within the interface scope – for all queue-ids simultaneously.

This way, you may start dropping “unimportant” packets sooner, say when the queue is 80% full, and drop important packets only when the queue is completely full (remember, you have 64 DSCP classes and just 4 queues – you can’t guarantee a queue to every class!). This differentiated drop behavior is actually required by the Diff-Serv model, allowing different drop precedences to be implemented for different classes.

interface FastEthernet 0/3
!
! wrr-queue threshold queue-id thresh1% thresh2%
!
wrr-queue threshold 1 80 100
wrr-queue threshold 2 80 100
wrr-queue threshold 3 50 100
!
! wrr-queue dscp-map threshold-id dscp1, … , dscp8
!
wrr-queue dscp-map 2 46 56 48
wrr-queue dscp-map 1 0 8 16 24 34

By default, all DSCP values are mapped to threshold 1 and the threshold value is set to 100% of queue size.

For GigabitEthernet interfaces (uplinks), you may use the same tail-drop procedure, or configure the more advanced WRED (Weighted Random Early Detection) as the drop policy. Describing WRED in detail is beyond the scope of this document; in short, it starts dropping packets randomly before the queue size reaches the configured threshold. This random behavior is specifically designed to overcome TCP congestion and global synchronization problems on loaded links. As with tail-drop, you may configure two WRED thresholds for each queue and then map DSCP values to each threshold on a per-interface basis. The packet drop probability increases as the queue size approaches the configured threshold.

interface GigabitEthernet 0/1
!
! wrr-queue random-detect max-threshold queue-id thresh1% thresh2%
!
wrr-queue random-detect max-threshold 1 80 100
wrr-queue random-detect max-threshold 2 70 100
wrr-queue random-detect max-threshold 3 70 100
wrr-queue dscp-map 2 46 56 48
wrr-queue dscp-map 1 0 8 16 24 34

c) Schedule/service packets in the queues. Now that the packet has been queued, all the interface queues are serviced in a round-robin fashion, the simplest approximation of the “ideal” GPS (Generalized Processor Sharing) algorithm. The reason for choosing such a simple scheduler is the need for wire-speed latency in hardware packet switching: you simply can’t implement WFQ or CBWFQ efficiently in a high-density-port device like an L3 switch, due to those algorithms’ higher complexity. The scheduler used is not that primitive, however. It uses weighted round-robin (WRR) to service the interface queues. That is, you can assign up to four weights, one to each queue, and the scheduler will service the queues in accordance with their weights.

Say you assigned weight values “10 20 30 40” to queues 1, 2, 3 and 4; then every round the scheduler will take up to 10 packets from queue 1, no more than 20 packets from queue 2, and so on. This way all queues are serviced in a very efficient manner that still allows implementing Assured Forwarding behavior per the Diff-Serv model requirements. Note that WRR does not limit the upper bound of packet rates – it only ensures that each queue is serviced in proportion to its weight when the interface is heavily congested. Under normal conditions, a queue may claim more bandwidth than its configured weight provides. The default weight values are 25 25 25 25. Side note: the WRR description given here is not the classic WRR algorithm; the latter requires the average packet size of each queue to be known in advance, in order to adjust the weights and provide fairness. Per the DocCD description, we may therefore assume that the scheduling implemented is not classic WRR, but a simplified form of it.
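For example, with weights 10 20 30 40 on a fully congested interface, queue 4 receives up to 40/(10+20+30+40) = 40% of the scheduling rounds, queue 3 up to 30%, and so on down to 10% for queue 1.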

interface FastEthernet 0/3
wrr-queue bandwidth 10 20 30 40

The last component of the interface queue scheduler is the priority queue. On each interface, you may configure queue-id 4 to be treated as priority. This means that whenever there are packets in queue-id 4, this queue is always emptied first, no matter how many packets it holds. Only then do the regular WRR queues get serviced. The expedite (priority) queue is enabled with the priority-queue out command at the interface level:

interface FastEthernet 0/3
!
! With PQ enabled, you may assign any weight to queue-id 4
!
wrr-queue bandwidth 10 20 30 1
priority-queue out

Note that when the priority queue is enabled, you may configure any WRR weight for queue-id 4 – it is not taken into account when servicing the WRR queues.
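Queueing settings, including the expedite queue state, can be reviewed with a command along these lines (a sketch; output abbreviated and approximate):

SW3# show mls qos interface fastEthernet 0/3 queueing
FastEthernet0/3
Egress expedite queue: ena
wrr bandwidth weights:
qid-weights
 1 - 10
 2 - 20
 3 - 30
 4 - 1  (when expedite queue is disabled)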

Basically, these are all the important details of 3550 QoS. We did not cover things like DSCP mutation maps and some other minor features; however, the core QoS processing has been described in full and, hopefully, made clear to the reader.

Further Reading
Understanding QoS Policing and Marking on the Catalyst 3550

Illustration

Consider the following test topology:

[Figure: Cat3550-QoS test topology]

In the next example we will classify three traffic types on a single port: RIP updates, ICMP packets and IPv6 traffic. We will use a policy-map to assign IP Precedence and CoS values to the packets accordingly.

SW3:
ip access-list extended RIP
permit udp any any eq 520
!
ip access-list extended ICMP
permit icmp any any

!
! IPv6 Ethertype is 0x86DD
!
mac access-list extended IPV6
permit any any 0x86DD 0x0

class-map RIP
match access-group name RIP
!
class-map ICMP
match access-group name ICMP
!
class-map VOICE
match ip dscp ef
!
class-map IPV6
match access-group name IPV6

!
! Classify and police ingress traffic on the link to R3
!
policy-map CLASSIFY
class RIP
set ip precedence 3
class ICMP
set ip precedence 4
police 32000 8000 exceed-action drop
class IPV6
set cos 5
police 32000 8000 exceed-action drop
!
mls qos map cos-dscp 0 8 16 24 32 46 48 56
!
!
interface FastEthernet 0/3
service-policy input CLASSIFY

Verification: Create an access-list on R4 to match the IP packets

R4:
no ip access-list extended MONITOR_34
ip access-list extended MONITOR_34
permit udp any any eq 520 precedence 3
permit icmp any any precedence 4
permit ip any any
!
interface FastEthernet 0/1.34
ip access-group MONITOR_34 in

Send ICMP packets at a rate high enough to trigger the policer's exceed action:

Rack15R3#ping 155.15.34.4 repeat 50 size 1000
Type escape sequence to abort.
Sending 50, 1000-byte ICMP Echos to 155.15.34.4, timeout is 2 seconds:
!!!!!!!!!.!!!!!!!!.!!!!!!!!.!!!!!!!!.!!!!!!!!.!!!!
Success rate is 90 percent (45/50), round-trip min/avg/max = 4/4/9 ms

And verify the access-list at R4:

Rack15R4#show ip access-lists MONITOR_34
Extended IP access list MONITOR_34
10 permit udp any any eq rip precedence flash (18 matches)
20 permit icmp any any precedence flash-override (135 matches)
30 permit ip any any (9 matches)

To check that IPv6 packets are matched by the MAC access-list, send a saturated stream of ICMPv6 echo packets toward R4:

Rack15R3#ping 2001:0:0:34::4 repeat 50 size 1000 timeout 1
Type escape sequence to abort.
Sending 50, 1000-byte ICMP Echos to 2001:0:0:34::4, timeout is 1 seconds:
!!!!!!!!!.!!!!.!!!!.!!!!.!!!!.!!!!.!!!!.!!!!.!!!!.
Success rate is 82 percent (41/50), round-trip min/avg/max = 4/7/8 ms
