Sep
12

This post focuses on QoS on the PIX/ASA and is based on 7.2 code to be consistent with the CCIE Security Lab Exam as of the date of this post. I will write a later post on the new features in 8.X code for all of you non-exam biased readers :-)

NOTE: We have already seen thanks to our readers that some of these features are very model/license dependent! For example, we have yet to find an ASA that allows traffic shaping. 

One of the first things that you discover about QoS for PIX/ASA when you check the documentation is that none of the QoS tools that these devices support are available when you are in multiple context mode. This jumped out at me as a bit strange and I just had to see for myself. Here I went to a PIX device, switched to multiple mode, and then searched for the priority-queue global configuration mode command. Notice that, sure enough, the command was not available in the CUSTA context, or the system context.

pixfirewall# configure terminal
pixfirewall(config)# mode multiple
WARNING: This command will change the behavior of the device
WARNING: This command will initiate a Reboot
Proceed with change mode? [confirm]
Convert the system configuration? [confirm]
pixfirewall> enable
pixfirewall# show mode
Security context mode: multiple
pixfirewall# configure terminal        
pixfirewall(config)# context CUSTA
Creating context 'CUSTA'... Done. (2)
pixfirewall(config-ctx)# context CUSTA
pixfirewall(config-ctx)# config-url flash:/custa.cfg
pixfirewall(config-ctx)# allocate-interface e2 
pixfirewall(config-ctx)# changeto context CUSTA
pixfirewall/CUSTA(config)# pri?     
configure mode commands/options:
privilege
pixfirewall/CUSTA# changeto context system
pixfirewall# conf t
pixfirewall(config)# pr?
configure mode commands/options:
privilege 

OK, so we have no QoS capabilities when in multiple context mode. :-| What QoS capabilities do we possess on the PIX/ASA when we are behaving in single context mode? Here they are:

  • Policing – you will be able to set a “speed limit” for traffic on the PIX/ASA. The policer will discard any packets trying to exceed this rate. I always like to think of the Soup Guy on Seinfeld with this one - "NO BANDWIDTH FOR YOU!" 
  • Shaping – again, this tool allows you to set a speed limit, but it is “kinder and gentler”. This tool will attempt to buffer traffic and send it later should the traffic exceed the shaped rate.
  • Priority Queuing – for traffic (like VoIP) that really hates delay and variable delay (jitter), the PIX/ASA does support priority queuing of that traffic. The documentation refers to this as Low Latency Queuing (LLQ).

Now before we get too excited about these options for tools, we must understand that we are going to face some pretty big limitations with their usage compared to shaping, policing, and LLQ on a Cisco router. We will detail these limitations in future blogs on the specific tools, but here is an example. We might get very excited when we see LLQ in relation to the PIX/ASA, but it is certainly not the LLQ that we are accustomed to on a router. On a router, LLQ is really Class-Based Weighted Fair Queuing (CBWFQ) with the addition of strict Priority Queuing (PQ). On the PIX/ASA, we are just not going to have that type of granular control over many traffic forms. In fact, with the standard priority queuing approach on the PIX/ASA, there is a single LLQ for your priority traffic and all other traffic falls into a best effort queue.

If you have been around QoS for a while, you are going to be very excited about how we set these mechanisms up on the security appliance. We are going to use the Modular Quality of Service Command Line Interface (MQC) approach! The MQC was invented for CBWFQ on the routers, but now we are seeing it everywhere. In fact, on the security appliance it is termed the Modular Policy Framework. This is because it not only handles QoS configurations, but also traffic inspections (including deep packet inspections), and can be used to configure the Intrusion Prevention and Content Management Security Service Modules. Boy, the ole’ MQC sure has come a long way.

While you might be frustrated with some of the limitations in the individual tools, at least there are a couple of combinations in which the tools work together. Specifically, you can:

  • Use standard priority queueing (for example for voice) and then police for all of the other traffic.
  • Use traffic shaping for all traffic in conjunction with hierarchical priority queuing for a subset of traffic. Again, in later blogs we will educate you more fully on each tool.

Thanks for reading and I hope you are looking forward to future blog entries on QoS with the ASA/PIX.

Mar
03

The 3560 QoS processing model is tightly coupled with its hardware architecture, borrowed from the 3750 series switches. The most notable feature is the internal switch ring, which is used for switch stacking. Packets entering a 3560/3750 switch are queued and serviced twice: first on ingress, before they are put on the internal ring, and second on the egress port, where they have been delivered by the internal ring switching. In short, the process looks as follows:

[Classify/Police/Mark] -> [Ingress Queues] -> [Internal Ring] -> [Egress Queues]

For more insights and detailed overview of StackWise technology used by the 3750 models, visit the following link:

Cisco StackWise Technology White Paper

Next, it should be noted that the 3560 model is capable of recognizing and processing IPv6 packets natively – this feature affects some classification options. Another big difference is the absence of an internal DSCP value and the use of a QoS label for internal packet marking. This feature allows the 3560 switches to provide different classes of service to CoS or DSCP marked packets, by allowing them to be mapped to different queues/thresholds etc. Many other concepts and commands are different as well, as is some nomenclature (e.g. the number of the priority queue). We will try to summarize and analyze the differences in the following paragraphs. The biggest obstacle is the absence of any good information source on 3750/3560 QoS besides the Cisco Documentation site, which has really poor documents regarding both models.

1. Queue Scheduler: SRR

One significant change is the replacement of the WRR scheduler with SRR, where the latter stands for either Shared Round Robin or Shaped Round Robin – which are two different modes of operation for the scheduler. As we remember, what WRR does is simply de-queue a number of packets proportional to the weight assigned to a queue, walking through the queues in round-robin fashion. SRR performs similarly in shared mode: each output queue has a weight value assigned, and is serviced in proportion to the assigned weight when the interface is under congestion.

The implementation details are not publicly documented by Cisco; however, so far we know that Shared Round Robin tries to behave more “fairly” than WRR does. While on every scheduler round WRR empties each queue up to the maximum number of allowed packets before switching to the next queue, SRR performs a series of quick runs each round, deciding on whether to de-queue a packet from the particular queue (based on queue weights). In effect, SRR achieves more “fairness” on a per-round basis, because it does not take the whole allowed amount each time it visits a queue. In the “long run” SRR and WRR behave similarly.

The shaped mode of SRR is something not available with WRR at all. In this mode, each queue has a weight that determines the maximum allowed transmission rate for that queue. That is, interface bandwidth is divided according to the queue weights, and each queue is not allowed to send packets above its “slice” of the common bandwidth. The details of the implementation are not provided, so we can only assume it’s some kind of effective policing strategy. Note that SRR shaped weight values are *absolute*, not relative. That is, the proportion of interface bandwidth given to a particular queue is "1/weight * {interface speed}". So for a weight of "20" the limit is "1/20 * Speed", which equals 5Mbps on a 100Mbps interface. Also, by default, queue 1 has an SRR shaped weight of 25 (that is, 1/25 of the interface speed), so take care when you turn MLS QoS on.
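To make the "absolute weight" arithmetic concrete, here is a minimal Python sketch of the 1/weight rule described above. The function name and the choice to return None for a weight of zero (shaping disabled for that queue) are our own conventions for illustration; this models the formula only, not the switch's actual scheduler.

```python
from typing import Optional


def srr_shaped_rate_bps(weight: int, interface_speed_bps: int) -> Optional[float]:
    """Approximate rate cap for a queue in SRR shaped mode: speed / weight.

    A weight of 0 is treated as "shaping disabled" and returns None.
    """
    if weight < 0:
        raise ValueError("weight must be non-negative")
    if weight == 0:
        return None
    return interface_speed_bps / weight


fast_ethernet_bps = 100_000_000
# Weight 20 on a 100 Mbps port, as in the text above
print(srr_shaped_rate_bps(20, fast_ethernet_bps))
# Weight 25, the default mentioned for queue 1
print(srr_shaped_rate_bps(25, fast_ethernet_bps))
```

The same call with a gigabit speed shows why the default weight matters: the cap scales with the port speed, not with the other queues' weights.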

An interface can be configured for SRR shared and shaped mode at the same time. However, SRR shaped weights always take precedence over SRR shared weights. Finally, SRR provides support for a priority queue, and this queue is not subject to SRR shaping limits either. The weight assigned to the priority queue is simply ignored in SRR calculations. Note that unlike the 3550, on the 3560 the egress priority queue has queue-id 1, not 4. Here is an example:

!
! By setting the shape weight to zero we effectively disable shaping for the particular queue
!
interface gigabitethernet0/1
srr-queue bandwidth shape 10 0 0 0
srr-queue bandwidth share 10 10 60 20
!
! As the expedite queue (queue-id 1) is enabled, its weight is no longer honored by the SRR scheduler
!
priority-queue out
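
As a rough illustration of what the configuration above implies, the following Python sketch computes how the remaining queues would split bandwidth in shared mode. It assumes that the expedite queue and any shaped queue simply drop out of the shared ratio; that is our reading of the behavior described here, not a documented algorithm, and the function name is hypothetical.

```python
from fractions import Fraction


def srr_shared_fractions(share, shape, priority_queue_out=False):
    """Return {queue_id: fraction} for queues that take part in shared mode.

    Assumption: a queue with a non-zero shape weight, or queue 1 when the
    expedite queue is enabled, is left out of the shared ratio entirely.
    """
    shared = {}
    for queue_id, (share_w, shape_w) in enumerate(zip(share, shape), start=1):
        is_expedite = priority_queue_out and queue_id == 1
        if is_expedite or shape_w != 0:
            continue
        shared[queue_id] = share_w
    total = sum(shared.values())
    if total == 0:
        return {}
    return {q: Fraction(w, total) for q, w in shared.items()}


# Weights from the example: shape 10 0 0 0, share 10 10 60 20, priority-queue out
for queue_id, fraction in srr_shared_fractions(
    share=(10, 10, 60, 20), shape=(10, 0, 0, 0), priority_queue_out=True
).items():
    print(queue_id, fraction)
```

Under this assumption queues 2, 3 and 4 split whatever the expedite queue leaves behind in a 10:60:20 ratio.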

Another interesting egress feature is port rate-limit. SRR could be configured to limit the total port bandwidth to a percentage of physical interface speed – from 10% to 90%. Don’t configure this feature if there is no need to limit overall port bandwidth.

!
! Limit the available egress bandwidth to 80% of interface speed
!
interface gigabitethernet0/1
srr-queue bandwidth limit 80

Note that the resulting port rate depends on the physical speed of the port - 10/100/1000Mbps.

2. Egress Queues

The 3550 has four egress queues, identified by their numbers 1-4, available to every interface. Queue number 4 could be configured as expedite. On the 3560 side, we have the same four egress queues, but this time for expedite services we could configure queue-id 1.

With the 3550, for “class-to-queue” mapping only CoS to queue-id table is available on a per-port basis. Globally configurable DSCP to CoS mapping table is used to map an internal DSCP value to the equivalent CoS. As for the 3560 model, DSCP and CoS values are mapped to queue-ids separately. That means IP and non-IP traffic could be mapped/serviced separately. What if an IP packet comes with DSCP and CoS values both set? Then the switch will use the marking used for classification (e.g. CoS if trust cos or set cos were used) to assign the packet to a queue/threshold.

The 3550 supports WTD and WRED as queue drop strategy (the latter option available on Gigabit ports only). The 3560 model supports WTD as the only drop strategy, allowing for three per-queue drop thresholds. Only two of the thresholds are configurable – called explicit drop thresholds - and the third one is fixed to mark the full queue state (implicit threshold).
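
The three-threshold idea can be sketched in a few lines of Python. This is a simplified model of weighted tail drop under our own assumptions: queue fill is expressed as a percentage of the queue size, the implicit third threshold is taken as 100%, and a frame is dropped once the fill is at or above the threshold its marking maps to. The exact comparison the hardware uses is not documented, and the function name is hypothetical.

```python
def wtd_should_drop(queue_fill_pct, threshold_id, explicit_thresholds):
    """Decide whether a frame mapped to threshold_id is tail-dropped.

    explicit_thresholds is (threshold1_pct, threshold2_pct); threshold 3 is
    the implicit "queue full" level, modeled here as 100%.
    """
    threshold1, threshold2 = explicit_thresholds
    limits = {1: threshold1, 2: threshold2, 3: 100}
    if threshold_id not in limits:
        raise ValueError("threshold_id must be 1, 2 or 3")
    return queue_fill_pct >= limits[threshold_id]


# With explicit thresholds at 50% and 80% and the queue 60% full, only
# traffic mapped to threshold 1 is dropped.
for threshold_id in (1, 2, 3):
    print(threshold_id, wtd_should_drop(60, threshold_id, (50, 80)))
```

This is why less important markings are usually mapped to the lower thresholds: they start losing frames while the queue still has room for the rest.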

Finally, the mentioned mappings are configured for queue-id and drop threshold simultaneously in global configuration mode - unlike the 3550 where you configured CoS to Queue-ID and DSCP to Drop-Threshold mappings separately (and on per-interface basis).

!
! CoS values are mapped to 4 queues. Remember queue-id 1 could be set as expedite
!

!
! The next entry maps CoS value 5 to queue 1 and threshold 3 (100%)
!
mls qos srr-queue output cos-map queue 1 threshold 3 5
!
! VoIP signaling and network management traffic go to queue 2
!
mls qos srr-queue output cos-map queue 2 threshold 3 3 6 7
mls qos srr-queue output cos-map queue 3 threshold 3 2 4
mls qos srr-queue output cos-map queue 4 threshold 2 1
mls qos srr-queue output cos-map queue 4 threshold 3 0

!
! DSCP to queue/threshold mappings
!
mls qos srr-queue output dscp-map queue 1 threshold 3 40 41 42 43 44 45 46 47

mls qos srr-queue output dscp-map queue 2 threshold 3 24 25 26 27 28 29 30 31
mls qos srr-queue output dscp-map queue 2 threshold 3 48 49 50 51 52 53 54 55
mls qos srr-queue output dscp-map queue 2 threshold 3 56 57 58 59 60 61 62 63

mls qos srr-queue output dscp-map queue 3 threshold 3 16 17 18 19 20 21 22 23
mls qos srr-queue output dscp-map queue 3 threshold 3 32 33 34 35 36 37 38 39

!
! DSCP 8 is CS1 – Scavenger class, mapped to the first threshold of the last queue
!
mls qos srr-queue output dscp-map queue 4 threshold 1 8
mls qos srr-queue output dscp-map queue 4 threshold 2 9 10 11 12 13 14 15
mls qos srr-queue output dscp-map queue 4 threshold 3 0 1 2 3 4 5 6 7

Next, what we need to know is how to configure egress queue buffer spaces along with the threshold settings. With the 3550 we had two options: first, globally configurable buffer levels for FastEthernet ports, assigned per interface; second, one shared buffer pool for each of the gigabit ports, with proportions (weights) configured at the per-interface level too. With the 3560 things have changed. Firstly, all ports are now symmetrical in their configurations. Secondly, the concept of a queue-set has been introduced. A queue-set defines the buffer space partition scheme as well as the threshold levels for each of the four queues. Only two queue-sets are available in the system, and they are configured globally. After a queue-set has been defined/redefined it can be applied to an interface. Of course, default queue-set configurations exist as well.

As usual, queue thresholds are defined in percentage of the allocated queue memory (buffer space). However, the 3560 switch introduced another buffer pooling model. There exist two buffer pools – reserved and common. Each interface has some buffer space allocated under the reserved pool. This reserved buffer space, allocated to an interface, could be partitioned between egress interface queues, by assigning a weight value (this resembles the gigabit interface buffer partitioning with the 3550):

!
! Queue-sets have different reserved pool partitioning schemes
! Each of four queues is given a weight value to allocate some reserved buffer space
!
mls qos queue-set output 1 buffers 10 10 26 54
mls qos queue-set output 2 buffers 16 6 17 61
!
interface gigabitEthernet 0/1
queue-set 1

What about the common buffer pool? Let’s take a look at the threshold setting command first:

mls qos queue-set output qset-id threshold queue-id drop-threshold1 drop-threshold2 reserved-threshold maximum-threshold

We see two explicitly configured thresholds 1 and 2 – just as with the 3550. However, there is one special threshold called reserved-threshold. It specifies how much of the reserved buffer space is allocated to a queue – i.e. how much of the reserved buffers are actually “reserved”. As we know, with the 3560 model every queue on every port has some memory allocated under the reserved buffer pool. The reserved-threshold tells how much of this memory to allocate to a queue – from 1 to 100%. The unused amount of the reserved buffer space becomes available to other queues under the common buffer pool mentioned above. The common buffer pool can be used by any queue to borrow buffers above the queue’s reserved space. This allows setting drop-thresholds to values greater than 100%, meaning a queue is allowed to take more credit from the common pool to satisfy its needs. The maximum-threshold specifies the “borrowing limit” – how much a queue is allowed to grow into the common pool.

Look at the command “mls qos queue-set output 1 threshold 1 138 138 92 138”. It says we shrink the reserved buffer space for queue 1 to 92%, sending the excess space to the common pool. Both drop thresholds and the maximum threshold are set to 138% of the queue buffer space (allocated by the buffers command), meaning we allow the queue to borrow from the common pool up to the levels specified. Drop thresholds may be set as large as 400% of the configured queue size.
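
To help keep the four numbers straight, here is a small Python sketch that splits a threshold command into named fields, following the argument order of the syntax line shown earlier. The helper name and the derived "released_to_common_pct" field (100 minus the reserved threshold) are our own labels for illustration; the switch itself reports nothing in this form.

```python
def describe_queue_thresholds(command: str) -> dict:
    """Label the fields of an 'mls qos queue-set output ... threshold ...' line."""
    tokens = command.split()
    valid = (
        len(tokens) == 11
        and tokens[:4] == ["mls", "qos", "queue-set", "output"]
        and tokens[5] == "threshold"
    )
    if not valid:
        raise ValueError("unexpected command format")
    queue_set, queue_id = int(tokens[4]), int(tokens[6])
    drop1, drop2, reserved, maximum = (int(t) for t in tokens[7:11])
    return {
        "queue_set": queue_set,
        "queue": queue_id,
        "drop_threshold1_pct": drop1,
        "drop_threshold2_pct": drop2,
        "reserved_pct": reserved,
        "maximum_pct": maximum,
        # Share of the queue's reserved space handed to the common pool
        "released_to_common_pct": 100 - reserved,
    }


fields = describe_queue_thresholds(
    "mls qos queue-set output 1 threshold 1 138 138 92 138"
)
for name, value in fields.items():
    print(name, value)
```

Reading the output field by field makes it easier to see that only the third number shrinks the reservation, while the other three describe how far the queue may grow.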

Now we see that this model is a bit more complicated than it was with the 3550. We don’t know the actual sizes of reserved buffer pool, but we are allowed to specify the relative importance of each queue. Additionally, we may give up some reserved buffer space out to a common buffer pool to share with the other queues. Here is a detailed example from AutoQoS settings:

!
! Set thresholds for all four queues in queue-set 1
!
mls qos queue-set output 1 threshold 1 138 138 92 138
mls qos queue-set output 1 threshold 2 138 138 92 400
mls qos queue-set output 1 threshold 3 36 77 100 318
mls qos queue-set output 1 threshold 4 20 50 67 400

!
! Set thresholds for all four queues in queue-set 2
!
mls qos queue-set output 2 threshold 1 149 149 100 149
mls qos queue-set output 2 threshold 2 118 118 100 235
mls qos queue-set output 2 threshold 3 41 68 100 272
mls qos queue-set output 2 threshold 4 42 72 100 242

interface gigabitEthernet 0/1
queue-set 1
!
interface gigabitEthernet 0/2
queue-set 2

And finally, if you are not sure what you’re doing, don’t play with thresholds and buffer space partitioning – leave them at the default values, or use AutoQoS recommendations. So says the DocCD!

3. Ingress Queues

The unique feature of the 3560 switches is the possibility to configure two ingress queues on every port. Of these queues, you may configure either one to be an expedite queue, with queue 2 being the default. The priority queue is guaranteed access to the internal ring when the ring is congested. SRR only serves the ingress queues in shared mode. All the weights are configured globally, not per-interface as was the case with the egress queues:

!
! Disable the default priority queue and share the bandwidth on the ring in a 25/75 proportion
!
mls qos srr-queue input priority-queue 2 bandwidth 0
mls qos srr-queue input bandwidth 25 75

The UniverCD has a very shallow description of how the SRR algorithm actually serves the two queues when one of them is configured as expedite. So far, it looks like you should configure one queue as expedite, assign the priority bandwidth value to it (from 1 to 40% of the internal ring bandwidth), and also assign the bandwidth values to both of the ingress queues as usual:

mls qos srr-queue input priority-queue 1 bandwidth 10
mls qos srr-queue input bandwidth 50 50

What’s that supposed to mean? It seems that queue 1 is partially serviced as an expedite queue, up to the limit set by the priority-queue bandwidth. Once this counter is exhausted, the queue is served on par with the non-expedite queue, using the bandwidth weights assigned. With the example above, that means we have 10% of the bandwidth dedicated to priority queue services. As soon as this counter is exhausted (say on a per-round basis), SRR continues to service both ingress queues using the remaining bandwidth (90%) shared in the proportions of the weights assigned (50 & 50) – that means 45% of the ring bandwidth to each of the queues. Overall, it looks like SRR simulates a policer for the priority queue, but rather than dropping the traffic it simply changes the scheduling mode until enough credits are accumulated to start expedite services again. Now go figure out how to use that in real life! Too bad Cisco does not give out to the public any internal details on their SRR implementation.
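
The arithmetic in that interpretation can be written down as a short Python sketch. It follows our reading above (priority bandwidth comes off the top, the remainder is split by the shared weights) and is not a documented description of the SRR implementation; the function name is hypothetical.

```python
from fractions import Fraction


def ingress_ring_shares(priority_queue, priority_bandwidth_pct, weights):
    """Return {queue_id: percent of ring bandwidth} for the two ingress queues.

    Assumption: the priority bandwidth is served first, then the remaining
    percentage is shared between both queues according to their weights.
    """
    remaining = 100 - priority_bandwidth_pct
    total = sum(weights)
    shares = {
        queue_id: Fraction(remaining * weight, total)
        for queue_id, weight in enumerate(weights, start=1)
    }
    shares[priority_queue] += priority_bandwidth_pct
    return shares


# priority-queue 1 bandwidth 10, input bandwidth 50 50
for queue_id, percent in ingress_ring_shares(1, 10, (50, 50)).items():
    print(queue_id, float(percent))
```

With the values from the example, queue 1 ends up with its 10% expedite slice plus 45%, and queue 2 with the other 45%.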

Now, two things remain to consider for the ingress queues: class mappings to the queues, and buffer/threshold settings. The class mapping uses the same syntax as for the egress queues, allowing you to configure a global mapping of CoS and DSCP to queue-ids and thresholds.

!
! CoS to queue-id/threshold
!
mls qos srr-queue input cos-map queue 1 threshold 3 0
mls qos srr-queue input cos-map queue 1 threshold 2 1
mls qos srr-queue input cos-map queue 2 threshold 1 2
mls qos srr-queue input cos-map queue 2 threshold 2 4 6 7
mls qos srr-queue input cos-map queue 2 threshold 3 3 5

!
! DSCP to queue-id/threshold
!
mls qos srr-queue input dscp-map queue 1 threshold 2 9 10 11 12 13 14 15
mls qos srr-queue input dscp-map queue 1 threshold 3 0 1 2 3 4 5 6 7
mls qos srr-queue input dscp-map queue 1 threshold 3 32
mls qos srr-queue input dscp-map queue 2 threshold 1 16 17 18 19 20 21 22 23
mls qos srr-queue input dscp-map queue 2 threshold 2 33 34 35 36 37 38 39 48
mls qos srr-queue input dscp-map queue 2 threshold 2 49 50 51 52 53 54 55 56
mls qos srr-queue input dscp-map queue 2 threshold 2 57 58 59 60 61 62 63
mls qos srr-queue input dscp-map queue 2 threshold 3 24 25 26 27 28 29 30 31
mls qos srr-queue input dscp-map queue 2 threshold 3 40 41 42 43 44 45 46 47

As seems reasonable, the classification option used (CoS or DSCP) determines which mapping table is used for the ingress queue.

The buffer partitioning is pretty simple – there is no common pool, and you only specify the relative weight for every queue. Thresholds are also simplified – you configure just two of them, as a percentage of queue size. The third threshold is implicit, as usual.

!
! Thresholds are set to 8% and 16% for queue 1; 34% and 66% for queue-2
! Buffers are partitioned in 67/33 proportion
!
mls qos srr-queue input threshold 1 8 16
mls qos srr-queue input threshold 2 34 66
mls qos srr-queue input buffers 67 33

So far we have covered enough for a single post. In the next post we will discuss how classification, policing, and marking techniques differ between the 3550 and 3560 models.

Jan
25

Sometimes people need to conditionally advertise routes into the BGP table based on time of day. Say, we may want to advertise IGP prefix 150.1.1.0/24 with community 1:100 during daytime and with community 1:200 at all other times. Back in the day, the procedure was easy - you had to create a time-based ACL, and use it in a route-map to set communities:

time-range DAY
periodic daily 9:00 to 18:00

access-list 101 permit ip any any time-range DAY

route-map SET_COMMUNITY 10
match ip address 101
set community 1:100
!
route-map SET_COMMUNITY 20
set community 1:200

This construct worked fine back in the day with 12.2T and 12.3 IOSes up to 12.3(17)T. However, since 12.3(17)T, BGP scanner behavior has changed significantly. Before that release, redistribution into the BGP table was based on the BGP scanner periodically polling the IGP routes every scan-interval (one minute by default). With the new IOS code, redistribution is purely event driven: a route is added to or deleted from the BGP table based on an event signaled by the IGP (e.g. IGP route withdrawn, next-hop change etc). This change in BGP scanner behavior was not clearly documented, unlike the related BGP support for next-hop address tracking feature. Obviously, a change in time-range is not treated as an IGP event, hence the filter does not work anymore.

Still, there are a number of workarounds. Here is one of them: we use a time-based ACL to filter or permit ICMP packets, and advertise routes based on that virtual "reachability" info.

First, we create time-range and time-based access-list:

time-range DAY
periodic daily 9:00 to 18:00
!
access-list 101 permit ip any any time-range DAY

Next we create a special loopback interface, which is used to send ICMP echo packets to "ourself", and attach the ACL to the interface to filter incoming (looped back) packets:

interface Loopback0
ip address 150.1.1.1 255.255.255.255
ip access-group 101 in

We create a new IP SLA monitor to send ICMP echo packets over the loopback interface. If the time-based ACL permits pings, the monitor state will be "reachable":

ip sla monitor 1
type echo protocol ipIcmpEcho 150.1.1.1
timeout 100
frequency 1

Next we track our "pinger" state. The first tracker is "on" when the loopback is "open" by packet filter, the second one is active when the time-based ACL filters packets:

track 1 rtr 1 reachability
!
! Inverse logic
!
track 2 list boolean and
object 1 not

Then we create two static routes, bound to the mentioned trackers. That is, the static route with tag 100 is only active when the loopback is "open", i.e. the time-based ACL permits packets. The other static route is active only when the time-range is inactive (the second tracker is up exactly when the first one reports the destination as unreachable):

ip route 150.1.1.0 255.255.255.0 Loopback0 150.1.1.254 tag 100 track 1
ip route 150.1.1.0 255.255.255.0 Loopback0 150.1.1.253 tag 200 track 2

Now we redistribute static routes into BGP, based on tag values, and also set communities based on the tags:

router bgp 1
redistribute static route-map STATIC_TO_BGP
!
route-map STATIC_TO_BGP permit 10
match tag 100
set community 1:100
!
route-map STATIC_TO_BGP permit 20
match tag 200
set community 1:200
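
The chain of dependencies above (time-range, ACL, IP SLA probe, trackers, static route tag, community) can be traced with a small Python model. This is only an illustration of the intended logic under simplifying assumptions: probe delay and timeouts are ignored, the time-range is treated as covering 9:00 through 18:00 to the minute, and the function name is hypothetical.

```python
from datetime import time

DAY_START = (9, 0)   # "periodic daily 9:00 to 18:00"
DAY_END = (18, 0)


def community_for(now: time) -> str:
    """Return the community the prefix would carry at a given time of day."""
    minute_of_day = (now.hour, now.minute)
    acl_permits = DAY_START <= minute_of_day <= DAY_END  # time-range DAY active
    track1_up = acl_permits       # IP SLA echo gets through the loopback ACL
    track2_up = not track1_up     # boolean list with "object 1 not"
    if track1_up:
        return "1:100"            # static route with tag 100 is installed
    assert track2_up
    return "1:200"                # static route with tag 200 is installed


for sample in (time(8, 59), time(12, 0), time(20, 30)):
    print(sample, community_for(sample))
```

Exactly one of the two trackers is up at any moment, so exactly one static route, and therefore one community, is in play.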

This is also a fun example of how you can tie together multiple IOS features at the same time.
