Sep
22

You have just been given a shiny, new router to configure.  As part of the configuration, you are asked to configure an outbound access list which will only permit traffic through to specific destinations.  Here are the requirements that you are given for your access-list:

Match (and permit) the following destinations using an access-list.  Your access list should use the fewest number of lines, and should not overlap any other address space.

Anything within the 10.0.0.0/8 address space.
Anything within the 172.16.0.0/12 address space.
Anything within the 192.168.0.0/16 address space.
Anything within the 169.254.0.0/16 address space.

Be warned, it is estimated that a very high percentage of readers will NOT have the correct answer.

access-list 199 permit ip any object-group TEST

What just happened here?  Can you really match those in a single line?  The answer deals with object groups, which allow grouping of other items.  The object group still needs to be configured, but the question just asked for a short access list.

You can enter each network using either /x prefix-length notation or a subnet mask, as shown in the following example:

object-group network TEST
10.0.0.0 /8
172.16.0.0 /12
192.168.0.0 /16
169.254.0.0 /16

The router will convert syntax, and the following will be what remains in your config for the group:

object-group network TEST
10.0.0.0 255.0.0.0
172.16.0.0 255.240.0.0
192.168.0.0 255.255.0.0
169.254.0.0 255.255.0.0
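
The conversion the router performs is plain prefix-length to dotted-mask arithmetic. As a minimal sketch (Python standard library only; the helper name to_mask_notation is made up for illustration and has nothing to do with IOS itself), the same mapping can be reproduced offline:

```python
import ipaddress

def to_mask_notation(prefix: str) -> str:
    """Convert 'a.b.c.d/len' into 'a.b.c.d dotted-mask' form."""
    net = ipaddress.ip_network(prefix)
    return f"{net.network_address} {net.netmask}"

for prefix in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "169.254.0.0/16"):
    print(to_mask_notation(prefix))
```

This is handy for double-checking a less obvious mask such as /12 before pasting a group into the router.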

You can also nest object groups.  You could configure the individual groups as follows:

object-group network A
10.0.0.0 /8
object-group network B
172.16.0.0 /12
object-group network C
192.168.0.0 /16

object-group network RFC1918
group-object A
group-object B
group-object C

object-group network APIPA
169.254.0.0 /16

object-group network TEST
group-object RFC1918
group-object APIPA
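
To see what the nested group TEST ultimately matches, it can help to flatten it by hand. The sketch below is only a model of the nesting above: the GROUPS dictionary and the "group:" prefix are illustrative conventions, not IOS syntax or behavior.

```python
# Model of the nested object groups above; "group:" marks a group-object reference.
GROUPS = {
    "A": ["10.0.0.0/8"],
    "B": ["172.16.0.0/12"],
    "C": ["192.168.0.0/16"],
    "RFC1918": ["group:A", "group:B", "group:C"],
    "APIPA": ["169.254.0.0/16"],
    "TEST": ["group:RFC1918", "group:APIPA"],
}

def expand(name, groups=GROUPS):
    """Recursively resolve group-object references into a flat prefix list."""
    result = []
    for member in groups[name]:
        if member.startswith("group:"):
            result.extend(expand(member[len("group:"):], groups))
        else:
            result.append(member)
    return result

print(expand("TEST"))
```

Flattening TEST yields the same four prefixes as the single flat group configured earlier.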

Here, we took a brief look at network object groups.  Object groups on the router also have a "service" option, which can be used to group protocols and ports. For those of you with a background configuring PIX / ASA, you may already be very familiar with configuring object groups.  For the rest of you, it may be something that you want to practice before your next scheduled lab date.

For more reading:
Cisco - Object Groups for ACLs

Object groups were added in 12.4(20)T.

Aug
17

Intro

There has been a lot of blogging on OSPF topics recently. In this post, I would like to clarify some common misunderstandings about OSPF route filtering. I have seen so many people misunderstand the underlying behavior that it is about time to make things clear.

OSPF Data Structures

To begin with, avoid using the term “LSA filtering” with OSPF. You cannot really filter LSAs - with the exception of one special case - you filter the network reachability information. To understand this in depth, start by recalling that OSPF deals with the following data structures:

1) Topological information. Outlines the connections in the graph describing the routers and the links in the network. This is what OSPF LSAs are about - they contain information about attached links. Think of LSAs as the objects that correspond to the “edges” of the graph. LSAs are stored in the LSDB - link state database. No real “routes” are stored in the LSDB, since this is the database for topological objects. However, routing or network reachability information is attached to the LSAs.

2) Network Reachability information. Contains the actual IP subnets. This information is “associated” with the network graph “edges” and you may think of it as “leaves” connected to the edges. Routing information does NOT describe any connectivity, just the prefix associated with the link. This information is contained in the LSAs, but as an “attribute”, and is used to populate the routing table - i.e. the RIB.

3) Main routing table. This is the router's RIB. This structure is used when OSPF generates new summary or external LSAs as we see later.

4) Routers routing table. This structure is unique to OSPF, and contains the IP addresses to reach the “border” routers - ASBRs and ABRs. It is used when calculating the respective inter-area routes, by adding the router path cost to the respective prefix advertised by this router. You may display the contents of this data structure by using the command show ip ospf border-routers.

OSPF route calculation overview

1) Routers establish adjacencies to flood topological information. The flooding process in OSPF is pretty complicated, and ensures the LSAs are delivered to all routers in a single area. As mentioned, topological information is carried in the form of LSAs and cannot be filtered, since it is essential to the OSPF algorithm. The only thing that limits LSA propagation is the flooding domain associated with the particular LSA type. Using the topological information learned, all routers within an area build a consistent graph of the network connections.

2) After all routers have a consistent topology view, they may calculate intra-area paths using the SPF algorithm and finally associate the network reachability information with the paths. This is where the “secondary”, leaf-level information comes into play. The leaves are attached to the paths and the routing table entries are calculated. So keep this in mind - first LSA flooding, then LSDB population, then SPF computations, and finally the RIB population.

3) After the intra-area paths have been calculated, inter-area routes are computed based on type-3/4 LSAs contents for other area information summarization. This process uses a quick and simple distance-vector computation algorithm, without the need for SPF computations. The router's routing table is used extensively during this process.
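
The distance-vector step in point 3 can be sketched in a few lines. This is a simplified model, not IOS code: border_router_cost stands in for the routers routing table (cost to each ABR), summaries stands in for received type-3 LSAs, and all names and numbers are made up for illustration.

```python
def best_inter_area_routes(border_router_cost, summaries):
    """Pick the lowest total cost per prefix: cost to the ABR + cost the ABR advertises."""
    best = {}
    for abr, prefix, advertised_cost in summaries:
        if abr not in border_router_cost:
            continue  # no path to this ABR, so its summary cannot be used
        total = border_router_cost[abr] + advertised_cost
        if prefix not in best or total < best[prefix][0]:
            best[prefix] = (total, abr)
    return best

routes = best_inter_area_routes(
    {"ABR1": 10, "ABR2": 5},
    [("ABR1", "192.168.1.0/24", 1), ("ABR2", "192.168.1.0/24", 20)],
)
print(routes)
```

No SPF run is involved here: the result is a simple addition and comparison per prefix, which is why this stage behaves like a distance-vector protocol.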

When you can REALLY filter LSAs

Now, back to the case of LSA filtering. Some may recall the commands ip ospf database-filter all out or neighbor <IP> database-filter all out. Good point, but what it does is prevent the OSPF process from sending any LSAs out of the particular adjacencies. This could be done only if you're sure that the LSAs will be flooded to all routers in the area by some other means. This is the special case I've been talking about previously. The purpose of this feature is to compensate for the absence of “mesh-groups” and limit the amount of flooding on NBMA subnets shared by many routers.

Overview of OSPF route-filtering

All right, so if you cannot really filter LSAs, how do you perform “route filtering” with OSPF? “Route filtering” is the accurate term for what is often called “LSA filtering”. You may apply route filtering to OSPF using the following two general methods:

1) Preventing optimal paths generated by OSPF from entering the RIB. This is what you do by applying the command distribute-list in under the OSPF process. This affects only the routes installed in the local RIB. There is one special extension of this method: filtering the FA (forwarding address) to block external OSPF routes.

2) Affecting the LSA generation process, by changing the “source” information used to generate the LSAs. There are several ways to do this, each specific to a particular LSA type. To understand this completely, we need to recall all the basic LSA types and their flooding scopes.

How OSPF generates different LSA types

LSA type 1. Describes an attached circuit, different link types. This is the fundamental building block of the topology graph. This LSA is flooded within the single area and never gets past the ABRs. The sources of information for LSA type-1 are the directly connected links. If you want to remove the network reachability information, just remove the link :)

LSA type 2. Generated only on the “shared” networks (BROADCAST/NON-BROADCAST network types) to minimize the amount of topological information generated. Imagine that you have 10 routers on a shared Ethernet segment. Normally, to fully describe the topology, every router would need to generate an LSA describing its connection to all other 9 routers. This would result in 90 LSAs. Using LSA type-2 and the Designated Router concept, every router needs to declare a connection to the DR, and the DR will generate a Type 2 LSA describing all routers on the segment (the “bunch”). In our example, this will result in 11 LSAs total. This LSA does not carry any real “network reachability” information with the exception of the netmask and the list of routers on the segment. It is used along with information from type 1 LSAs to describe the shared network. I like to think of it as a "glue" LSA. The flooding domain is one area as flooding stops at ABRs.
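
The arithmetic behind the 10-router example is easy to reproduce. In this sketch the function names are illustrative; the first figure counts one description per router per neighbor, which is how the 90 above is obtained.

```python
def pairwise_descriptions(n: int) -> int:
    """Each of n routers describes its connection to the other n - 1 routers."""
    return n * (n - 1)

def lsas_with_dr(n: int) -> int:
    """One type-1 LSA per router plus a single type-2 LSA from the DR."""
    return n + 1

print(pairwise_descriptions(10), lsas_with_dr(10))
```

The first count grows quadratically with the number of routers on the segment, while the DR-based count grows linearly.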

LSA type 3. Now this type is a bit tricky and brings in a lot of confusion. It is generated by an ABR to tell the routers in one area about the network in another area. Essentially, the router “pretends” like all the “foreign” networks are attached to it. From a topological perspective, this is true, because areas never know anything about another area's topology - this information is lost when crossing the area boundaries. How are Type 3 LSAs generated? First of all, keep in mind that OSPF generates those by walking the main routing table, not the LSDB. This is per RFC 2328 clause 12.4.3 and in full accordance with distance-vector protocol behavior. Every route in the table has additional OSPF information associated with it, such as area number, route-type (intra-area, inter-area, external) next hop, and so on.

1) The ABR goes over the network reachability information in the RIB associated with intra-area routes for the particular area X and summarizes them, honoring the area X range command settings. This results in Type-3 LSAs being generated and advertised into all other areas. Pay attention to the following important things:

1.1) Only intra-area routes are summarized. You cannot summarize inter-area routes installed by processing type-3 LSAs learned from Area 0. Those generate new type-3 LSAs at the ABR, which are propagated into non-backbone areas unmodified.

1.2) The intra-area routes are summarized PRIOR to applying the distribute-list filter and blocking the routes from entering the RIB. This is needed to allow for generation of a summary route, even if you don't want the specific prefixes in the local RIB and calculate the correct metric if needed. Thus, even though OSPF walks over the RIB to gather the intra-area prefixes for summarization, it does so BEFORE applying the filter. The ultimate goal is making summarization the highest priority task, in order to increase network stability.

1.3) The OSPF metric for the summarized route is taken as the minimal among all intra-area routes. To ensure better routing stability, it is usually recommended to set the metric manually, to prevent LSA re-flooding in case some component route flaps and affects the summary metric.

2) Now, for dealing with the inter-area routes learned by the ABR, first of all, keep in mind that an ABR ONLY accepts and processes type-3 LSAs received from the backbone area. This is the well-known loop prevention mechanism built into OSPF, since OSPF behaves as a distance-vector protocol when dealing with inter-area routing information. This is a short description of how an ABR processes type-3 LSAs:

2.1) Ignore the type-3 LSA if it is NOT from the backbone area (prevents routing loops).
2.2) Walk over the inter-area routes learned via Area 0 in the RIB and generate respective type-3 LSAs which are flooded into the attached non-backbone areas. Thus, LSAs are effectively being re-generated based on the RIB contents.

Next, consider an important aspect of this process. The re-generated summary LSAs are generated AFTER applying the OSPF filter associated with the routing-process via the distribute-list in command. Thus, if you filter some of the inter-area routes from entering the RIB, the respective new summary LSAs will NOT get generated. This will stop routing information propagation into the attached non-backbone areas.
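
The ordering difference between the two cases is the part that trips people up, so here is a deliberately simplified model of it. This is a sketch of the behavior described in this post, not of IOS internals: it handles a single area range, ignores metrics and flooding direction, and every name in it (type3_prefixes, rib_permits) is hypothetical.

```python
from ipaddress import ip_network

def type3_prefixes(intra_area, inter_area, area_range, rib_permits):
    """Return the prefixes an ABR would advertise in type-3 LSAs under this model."""
    summary = ip_network(area_range)
    advertised = []
    # Intra-area routes: summarized BEFORE the distribute-list is applied,
    # so a filtered component route still triggers the summary.
    if any(ip_network(p).subnet_of(summary) for p in intra_area):
        advertised.append(str(summary))
    advertised += [p for p in intra_area if not ip_network(p).subnet_of(summary)]
    # Inter-area routes: re-generated only if they passed the filter into the RIB.
    advertised += [p for p in inter_area if rib_permits(p)]
    return advertised

denied = {"10.1.1.0/24", "172.16.1.0/24"}
print(type3_prefixes(
    ["10.1.1.0/24", "10.1.2.0/24"],      # intra-area routes
    ["172.16.1.0/24", "172.16.2.0/24"],  # inter-area routes learned via Area 0
    "10.1.0.0/16",                       # area range
    lambda p: p not in denied,           # distribute-list in, modeled as a predicate
))
```

In this model the denied intra-area prefix still contributes to the 10.1.0.0/16 summary, while the denied inter-area prefix disappears from the advertisements entirely.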

This quickly summarizes the type-3 LSA generation process. Notice that filtering the routing information is not based on some “LSA” filtering procedure, but rather on the routes contained in the RIB. At this moment, some people may recall the command area X filter-list prefix NAME {in|out}. Good news here - this command applies after all summarization has been done and filters the routing information from being used for type-3 LSA generation. It applies to all three types of prefixes: intra-area routes, inter-area routes, and summaries generated as a result of the area X range command. All information is being learned from the router's RIB.

Oh, almost forgot - LSA type-3 flooding scope is one area, it never crosses ABR boundaries - it just gets re-generated when needed! Now, here is a summary of inter-area route filtering commands (applicable only at an ABR):

Method 1: Filter the inter-area routes generated at ABR
router ospf 1
area 10 filter-list prefix NAME in

Method 2: Filter out intra-area routes
router ospf 1
area 10 range 1.1.1.0 255.255.255.0 not-advertise

Method 3: Filter inter-area routes learned by ABR from Area 0
router ospf 1
distribute-list 1 in

LSA type 4

This type has always been confusing to many people. This LSA describes the metric that the ABR uses to reach the respective ASBR. This LSA contains the router-ID of the ASBR and the metric to reach it. ABRs generate type-4 LSAs based on the special “router routing” table, which is visible when you issue the command show ip ospf border-routers. This table is the essence of the distance-vector OSPF behavior. During the inter-area path calculations, the ABR populates this table with “host” routing entries for every ABR and ASBR detected, with the respective metrics. This table is never transferred to the main router routing table, but rather used for inter-area path computations and type-4 LSA generation. Effectively, the metrics in this table are used as metric offsets for the paths learned from ABRs and ASBRs.

The ABR generates type-4 summary LSAs into every normal attached area, to make sure the routers in there can reach the prefixes from the ASBR. You cannot really filter the contents of this LSA, as they are taken from the router routing table. The information from this LSA is used to populate the non-ABR routers' “special” router routing table. The flooding domain is one area, as flooding stops at the ABR.

LSA type 5

This LSA is originated by an ASBR (a router redistributing external routes) and flooded through the whole OSPF autonomous system. You cannot limit the way this LSA is generated except by controlling the routes redistributed into OSPF. When a type-5 LSA is generated, it uses the RIB contents and honors the summary-address commands configured on the ASBR. You may filter the redistributed routes by using the command distribute-list out under the OSPF process, naming the protocol that is the source of redistribution, or by simply applying a filter to the redistribute command itself.

Method 1:
router ospf 1
distribute-list 1 out rip
!
access-list 1 deny 1.1.1.0
access-list 1 permit any

Method 2:
router ospf 1
redistribute rip route-map RIP_TO_OSPF
!
route-map RIP_TO_OSPF
match ip address 1

Method 3:
router ospf 1
summary-address 10.0.0.0 255.255.255.0 not-advertise

There is yet one more way to filter the routing information found in type-5 LSAs. If the LSA contains a non-zero “FA” (forwarding address) field, the OSPF process checks that this address is reachable via the RIB before installing the actual prefix into the RIB (the RFC specifies that only OSPF routes should be considered, but IOS appears to be satisfied with any route). If the FA is not reachable, the corresponding external prefix is not installed into the global routing table. We will discuss this type of filtering a bit later, when we look into type-7 LSAs. However, keep in mind that the FA is not mandatory for a type-5 LSA and is only set under special conditions, outlined in the document Common Routing Problem with OSPF Forwarding Address. Here is the list of the conditions:

  • OSPF is enabled on the ASBR's next hop interface AND
  • ASBR's next hop interface is non-passive under OSPF AND
  • ASBR's next hop interface is not point-to-point AND
  • ASBR's next hop interface is not point-to-multipoint AND
  • ASBR's next hop interface address falls under the network range specified in the router ospf command.
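
Since all five conditions must hold at once, they can be read as a single boolean expression. The following sketch encodes exactly the list above; the NextHopInterface class, its field names and the network-type strings are illustrative assumptions, not an IOS data structure.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

@dataclass
class NextHopInterface:
    address: str        # interface IP address on the ASBR
    ospf_enabled: bool
    passive: bool
    network_type: str   # e.g. "broadcast", "point-to-point", "point-to-multipoint"

def sets_forwarding_address(iface, ospf_network_ranges):
    """True only if every condition from the list above is met."""
    in_range = any(ip_address(iface.address) in ip_network(r) for r in ospf_network_ranges)
    return (
        iface.ospf_enabled
        and not iface.passive
        and iface.network_type not in ("point-to-point", "point-to-multipoint")
        and in_range
    )

lan = NextHopInterface("150.1.7.7", True, False, "broadcast")
print(sets_forwarding_address(lan, ["150.1.7.0/24"]))
```

Failing any one condition, for example making the interface passive, leaves the FA at zero in this model.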

Please refer to the URL provided for a more detailed explanation.

There is only one more type of OSPF LSAs to discuss.

Type 7 LSA

These exist only in NSSA areas and have the flooding scope of a single area, as opposed to the whole domain for type-5 LSAs. The type-7 LSAs reaching ABRs are used to populate the local routing table and to re-generate new type-5 LSAs, now originated by the ABR. This is important - since the ABR becomes an ASBR and re-originates the routes, you may use the command summary-address ADDR MASK not-advertise to block the type-5 LSA generation. There is another, less obvious way to do things:

When an ABR generates type-5 LSAs, it adds the forwarding address (FA) based on the information learned in the type-7 LSA. This information is originally inserted by the ASBR to help in optimal exit point selection. The use of the forwarding address with type-7 LSAs is mandatory per the RFC, since there is just one translator, and it may lie on a sub-optimal path to the ASBR. Thus all routers in the OSPF autonomous system are supposed to rely on the FA for optimal routing to the translated prefixes. And here is the trick: if you filter the forwarding address IP from the routing table on the ABR, using the command distribute-list in, the type-5 LSA will not be generated! Look at the following output, where R2 is an ABR for NSSA area 27:

Rack1R2#show ip ospf database nssa-external 

OSPF Router with ID (150.1.2.2) (Process ID 1)

Type-7 AS External Link States (Area 27)

Routing Bit Set on this LSA
LS age: 11
Options: (No TOS-capability, Type 7/5 translation, DC)
LS Type: AS External Link
Link State ID: 162.1.7.0 (External Network Number )
Advertising Router: 162.1.7.7
LS Seq Number: 80000001
Checksum: 0xDEB5
Length: 36
Network Mask: /24
Metric Type: 2 (Larger than any link state path)
TOS: 0
Metric: 20
Forward Address: 150.1.7.7
External Route Tag: 0

Routing Bit Set on this LSA
LS age: 11
Options: (No TOS-capability, Type 7/5 translation, DC)
LS Type: AS External Link
Link State ID: 162.1.10.0 (External Network Number )
Advertising Router: 162.1.7.7
LS Seq Number: 80000001
Checksum: 0xBDD3
Length: 36
Network Mask: /24
Metric Type: 2 (Larger than any link state path)
TOS: 0
Metric: 20
Forward Address: 150.1.7.7
External Route Tag: 0

Rack1R2#show ip ospf database external

OSPF Router with ID (150.1.2.2) (Process ID 1)

Type-5 AS External Link States

...
LS age: 26
Options: (No TOS-capability, DC)
LS Type: AS External Link
Link State ID: 162.1.7.0 (External Network Number )
Advertising Router: 150.1.2.2
LS Seq Number: 80000001
Checksum: 0x2193
Length: 36
Network Mask: /24
Metric Type: 2 (Larger than any link state path)
TOS: 0
Metric: 20
Forward Address: 150.1.7.7
External Route Tag: 0

LS age: 26
Options: (No TOS-capability, DC)
LS Type: AS External Link
Link State ID: 162.1.10.0 (External Network Number )
Advertising Router: 150.1.2.2
LS Seq Number: 80000001
Checksum: 0xFFB1
Length: 36
Network Mask: /24
Metric Type: 2 (Larger than any link state path)
TOS: 0
Metric: 20
Forward Address: 150.1.7.7
External Route Tag: 0

As you can see, R2 generates type-5 LSA with the same forwarding address found in type-7 LSA - “150.1.7.7”. Now we filter this IP address from entering R2's routing table:

R2:
access-list 1 deny 150.1.7.7
access-list 1 permit any
!
router ospf 1
distribute-list 1 in

And apply our verification once again:

Rack1R2#show ip ospf database nssa-external 

OSPF Router with ID (150.1.2.2) (Process ID 1)

Type-7 AS External Link States (Area 27)

LS age: 222
Options: (No TOS-capability, Type 7/5 translation, DC)
LS Type: AS External Link
Link State ID: 162.1.7.0 (External Network Number )
Advertising Router: 162.1.7.7
LS Seq Number: 80000001
Checksum: 0xDEB5
Length: 36
Network Mask: /24
Metric Type: 2 (Larger than any link state path)
TOS: 0
Metric: 20
Forward Address: 150.1.7.7
External Route Tag: 0

LS age: 222
Options: (No TOS-capability, Type 7/5 translation, DC)
LS Type: AS External Link
Link State ID: 162.1.10.0 (External Network Number )
Advertising Router: 162.1.7.7
LS Seq Number: 80000001
Checksum: 0xBDD3
Length: 36
Network Mask: /24
Metric Type: 2 (Larger than any link state path)
TOS: 0
Metric: 20
Forward Address: 150.1.7.7
External Route Tag: 0

Rack1R2#show ip ospf database external

OSPF Router with ID (150.1.2.2) (Process ID 1)

Type-5 AS External Link States

And we can see that Type-7 to Type-5 translation is not working anymore, as the forwarding address is no longer reachable in the RIB. Notice that the forwarding address should be accessible via the main RIB, not the routers routing table of OSPF. This special behavior is unique to the routes learned by processing the type-7 LSAs. Now what if you have the following topology:

[Figure: OSPF-Filtering-NSSA]

And R3 is filtering the forwarding address for the type-5 LSAs originated at R4 using, say, the area X range not-advertise or area X filter-list prefix NAME {in|out} commands, so that R1 has no FA IP in its RIB. In this situation, R3 will have the route to the redistributed prefixes installed (it sees the FA!), but all other routers in the domain, with the exception of the NSSA area internal routers, will not. Even though they will receive the type-5 LSAs, they will not be able to process them and use the information for routing - the forwarding address will be unreachable. One way to overcome this issue is to use the command area X nssa translate type7 suppress-fa on R3, so that the other routers forward traffic for these prefixes toward R3 itself instead of the original FA.

Summary of the post

We went over almost all of the important route-filtering scenarios for OSPF. The key thing you should remember is that non-local route filtering for OSPF is only available at ABRs and ASBRs, the points where OSPF behaves as a distance-vector protocol with respect to inter-area routing information. Here is the list of points you need to remember:

1) You cannot filter LSAs directly; you can only manipulate the routing information used to generate LSAs.
2) The above routing information is taken directly from the local router's RIB.
2.1) In the case of intra-area routes, RIB filters are applied after the type-3 LSAs are generated (intra-area routes are summarizable).
2.2) In the case of inter-area routes, RIB filters are applied prior to type-3 LSA generation (inter-area routes are not summarizable).
3) External AS routes can only be filtered at the ASBR.
4) NSSA routes can be filtered at an ASBR and the ABR performing the translation.
5) If the FA for an external prefix is NOT reachable, the router will NOT install it into the route table nor will it translate type-7 LSAs to type-5.

Further Reading

1) RFC 2328. Don't skip this if you are serious about understanding OSPF :)
2) OSPF Design Guide by Sam Halabi. Excellent introductory reading on OSPF.
3) Cisco IP Routing by Alex Zinin. A must-read for anyone who wants an in-depth understanding of IP routing internals.

Jun
14

Flexible Packet Matching is a new feature that allows for granular packet inspection in Cisco IOS routers. Using FPM you can match any string, byte or even bit at any position in the IP (or theoretically non-IP) packet. This may greatly aid in identifying and blocking network attacks using static patterns found in the attack traffic. This feature has some limitations, though.

a) First, it is completely stateless, i.e. it does not track the state/history of the packet flow. Thus, FPM cannot discover dynamic protocol ports, such as those used by H.323 or FTP, nor can it detect patterns split across multiple packets. Essentially, you can apply inspection on a per-packet basis only.

b) Additionally, you cannot apply FPM to control-plane traffic, as the feature is implemented purely in the CEF switching layer. Fragmented traffic is not reassembled for matching, and the only inspected packet is the initial fragment of the IP packet flow.

c) IP packets with IP options are not matched by FPM as well, because they are punted to the route processor.

d) Lastly, this feature inspects only unicast packets and does not apply to MPLS encapsulated packets.
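
The stateless limitation from point (a) is worth internalizing before writing any filters. The toy model below (plain Python, not FPM; the function name and payloads are invented) inspects each packet in isolation, the way a per-packet matcher does:

```python
PATTERN = b"AAAA"

def per_packet_match(packets, pattern=PATTERN):
    """Inspect every packet on its own, with no memory of earlier packets."""
    return [pattern in payload for payload in packets]

whole = [b"xxAAAAxx"]          # pattern fits inside one packet
split = [b"xxAA", b"AAxx"]     # same bytes, split across two packets

print(per_packet_match(whole))
print(per_packet_match(split))
print(PATTERN in b"".join(split))  # only a reassembling (stateful) inspector sees it
```

The split case is missed even though the reassembled stream contains the pattern, which is exactly what stateless inspection implies.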

Configuring an FPM filter consists of a few steps.

(1) Loading protocol headers.
(2) Defining a protocol stack.
(3) Defining a traffic filter.
(4) Applying the policy & Verifying

Let’s look at each of these steps in depth.

Loading a PHDF (optional). PHDF stands for Packet Header Definition File. These files use XML syntax and define the structure of various packet headers, such as Ethernet, IP, TCP, UDP and so on. They are very helpful in making filtering more structured: with the PHDFs loaded, you may filter based on header field names and their values, instead of matching fixed offsets in the unstructured packet body. You may load the files into the router’s memory using the command load protocol. PHDF files can be created manually using simple XML syntax or downloaded from CCO. Since IOS version 12.4(15)T, four basic PHDFs are included in the IOS code and located at the virtual path system:fpm/phdf. You can load the files directly from there. Defining a custom PHDF requires an understanding of protocol header formats and field values along with basic XML formatting, which is beyond the scope of this document.

Defining a protocol stack (optional). This step uses the PHDFs loaded previously and allows specifying the protocol headers found in the traffic you want to inspect. Using the protocol stack definition induces structure in the packets being inspected. This allows for filtering based on header field values and specifying offsets in the packet relative to the header fields. Additionally, you may define various protocol stacks (e.g. UDP in IP, UDP in GRE in IP) and reuse the same access control policy with various stacks.

You define the protocol stack using the command class-map type stack match-all <NAME>. This class-map is always of type “match-all” and consists of a series of match entries. Every match entry should specify a protocol name defined in the loaded PHDFs, a header field value, and the name of the next protocol found in the stack. Look at the sample below – it defines the series of headers: TCP in an IPIP tunnel header (protocol 4), encapsulated in IP and in Ethernet.

[Figure: fpm-fig2]

The order of the match statements is important – it defines the actual header stacking along with the “layer x” keywords. The layers are counted bottom up starting at the stack bottom and “layer 1” is the default. The other important thing is that every next match statement should define the protocol specified in the next clause of the previous match statement. This is a basic consistency check illustrated using the arrows in the diagram. Notice that most of the time you could just use a single-protocol stack, as you commonly deal with TCP or UDP protocols encapsulated in IP headers.
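
The consistency rule (each match statement must define the protocol named in the previous statement's next clause) can be expressed as a short check. This is an illustrative sketch only: the (protocol, next_protocol) tuples are a made-up representation of the match entries, not something the router exposes.

```python
def stack_is_consistent(entries):
    """entries: list of (protocol, next_protocol) pairs in configuration order."""
    for (_, declared_next), (protocol, _) in zip(entries, entries[1:]):
        if declared_next != protocol:
            return False
    return True

# Ethernet -> IP -> IP (IPIP tunnel) -> ICMP
good = [("ETHER", "IP"), ("IP", "IP"), ("IP", "ICMP")]
# Second entry matches on TCP although the first one announced IP
bad = [("ETHER", "IP"), ("TCP", "UDP")]

print(stack_is_consistent(good), stack_is_consistent(bad))
```

Walking the list pairwise like this mirrors following the arrows from one match statement to the next.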

Now a few words about the match operator. This command uses the syntax match field <PROTO> <header-field> <operator> <value>. It is important to stress that the protocol and header fields are learned dynamically from the PHDFs, and this allows for high configuration flexibility. The operator could be “eq”, “neq”, “range”, “string”, “regex”, “lt” and “gt”. The operator names are self-explanatory, but the “eq” operator also supports a special “mask” parameter. The mask is interpreted as a bitmap, where “1” means the corresponding bit in the value could be any of “0” or “1” and “0” means the respective bit is “locked”.
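
The mask semantics described above (a 1 bit means "don't care", a 0 bit means "must match") behave like an ACL wildcard applied to a field value. Here is a minimal sketch of that interpretation, taking the description in this post at face value; eq_with_mask is a hypothetical helper and the sample values are chosen only for illustration.

```python
def eq_with_mask(value: int, pattern: int, mask: int, size_bytes: int = 1) -> bool:
    """Compare only the bits that are 0 ("locked") in the mask."""
    width = (1 << (8 * size_bytes)) - 1
    care = ~mask & width
    return (value & care) == (pattern & care)

# First byte of an IPv4 header is typically 0x45: version 4, header length 5.
print(eq_with_mask(0x45, 0x40, 0x0F))  # low nibble ignored, version nibble compared
print(eq_with_mask(0x65, 0x40, 0x0F))  # version nibble differs
print(eq_with_mask(0x11, 0x00, 0xFF))  # every bit is "don't care"
```

With a mask of 0xFF every bit is wildcarded, so under this reading the comparison succeeds for any value.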

By default, the protocol stack is matched starting AFTER the data-link level header. That is, the first header defined in the stack is matched against the header that follows the L2 header. In real life, this is commonly the IP header. However, sometimes you may want to match the layer 2 header fields as well, e.g. Ethernet addresses. To make the protocol stack match starting at the L2 level, use the command stack-start l2-start.

At this point, having just stack class-maps you may already apply traffic filtering to the packets matching the configured stack. To accomplish this, create a policy-map of type “access-control” and assign a stack class to the policy-map. For example:

class-map type stack TCP_IN_IP_IN_ETHER
stack-start l2-start
match field ETHER type eq 0x800 next IP
match field layer 2 IP protocol eq 0x6 next TCP
!
policy-map type access-control DROP_TCP_IN_IP_IN_ETHER
class TCP_IN_IP_IN_ETHER
drop

There are basically just a couple of actions available with the access-control policy-maps: drop and log. You may also use the send-response command under a class to make the router send an ICMP unreachable response.

Of course, using just stack class-maps might be inflexible, as you cannot match the packet payload. You may however use this for filtering based on addresses, protocol flags and so on.

Defining a traffic filter. A traffic filter is defined by means of a special class-map of type “access-control” and a respective policy-map of the same type. Using this type of class-map, you can match the protocol field values using the command match field <PROTO> if the respective protocol’s PHDF has been loaded. If you match the protocol fields, then the policy-map using the newly defined access-control class-map must be nested under the stack-type class-map defining this protocol. We’ll see an example later.

In addition to matching the protocol header fields, you can match the packet payload at a fixed offset against a pre-defined value, value range, string or regular expression. You may base the offset off the defined protocol header field (e.g. +10 bytes from TCP flags) using the command match start <PROTO> <FIELD-NAME> offset <OFFSET> size <SIZE>. For example:

match start TCP checksum offset 100 size 1 ...
match start IP version offset 0 size 1 ...

Of course, this type of offset “basing” is only possible if the respective protocol definition has been loaded and the containing stack-type class map has this protocol defined.

Irrespective of the protocols loaded/defined, you may base the offset from the L2 or L3 packet starts (absolute offsets). For example, if the packet is IP in Ethernet, then L2-start is the first bit of the Ethernet header and L3-start is the first bit of the IP header. The command syntax is match start {l2-start|l3-start} offset <OFFSET-BYTES> size <SIZE-BYTES> <operator> <value>. The command specifies the offset in bytes from the selected start. If the size is less than or equal to 4 bytes, you can use the “eq, neq, lt, gt” operators in addition to “regex” and “string”. This allows for per-bit matching using the “eq” and “mask” operators combination. For example:

match start l3-start offset 0 size 1 eq 0x2 mask 0x3
match start l2-start offset 36 size 5 string ABCDE
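
To make the absolute-offset idea concrete, here is a small sketch that slices a window out of a raw frame and looks for a string in it. It is only a model: the function name is invented, the 14-byte figure assumes an untagged Ethernet header, and the frame contents are dummy bytes.

```python
L2_START = 0    # first byte of the Ethernet header
L3_START = 14   # assumption: untagged Ethernet, so the IP header starts at byte 14

def match_start_string(frame: bytes, start: int, offset: int, size: int, expected: bytes) -> bool:
    """Look for `expected` inside the `size`-byte window at start + offset."""
    window = frame[start + offset : start + offset + size]
    return expected in window

frame = bytes(36) + b"ABCDE" + bytes(10)

print(match_start_string(frame, L2_START, 36, 5, b"ABCDE"))
print(match_start_string(frame, L3_START, 22, 5, b"ABCDE"))  # same bytes, counted from L3
print(match_start_string(frame, L3_START, 36, 5, b"ABCDE"))  # window lands elsewhere
```

The first two calls address the same five bytes from different starting points, which is the only difference between l2-start and l3-start in this model.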

Creating an access-control policy-map. You have two options here. You may create a simple access-control policy-map, which has just the basic access-control assigned, without nesting. For example:

class-map type access-control PASSWORD
match start l3-start offset 0 size 100 regex ".*[pP][aA][sS][sS][wW].*"
!
policy-map type access-control DROP_PASSWORD
class PASSWORD
drop

When using this simple non-nested syntax, you may only use the absolute offsets with the command match start {l2-start|l3-start} and cannot reference any protocol header fields.

You may use a more flexible approach, by nesting the filtering policy under a stack-type class-map configured in containing policy-map. This allows for using protocol headers in filtering policy or basing the offsets from the packet header fields. Notice that the nested filtering policy may only use the protocol headers defined in the containing stack class-map. Here is an example that looks for string “TEST” in TCP packets:

class-map type stack TCP_IN_IP
match field IP protocol eq 0x6 next TCP

!
! Protocol fields matched in the filtering policy
! must be for the protocols defined in the stack
! class-map
!
class-map type access-control match-any FILTER_CLASS
match start TCP payload offset 0 size 4 string TEST
!
policy-map type access-control FILTER_POLICY
class FILTER_CLASS
drop
!
policy-map type access-control STACK_POLICY
class TCP_IN_IP
service-policy FILTER_POLICY

Applying the traffic filtering policy. Finally, the access-control policy map should be applied to an interface either inbound or outbound using the interface-level command service-policy type access-control {input|output} <NAME>. This could be a simple non-nested policy using just stack-type class-maps or access-control class-maps, or a more advanced policy that combines a stack class with a nested access-control policy-map.

And now, let’s look at a sample task. We would like to filter ICMP echo packets that contain the string “AAAA” and are encapsulated inside an IPIP tunnel. Here is the diagram for the testbed scenario:

[Figure fpm-fig11: testbed diagram]

And now, the configuration to be applied to R6's Fa0/0:

load protocol system:fpm/phdf/ip.phdf
load protocol system:fpm/phdf/icmp.phdf
load protocol system:fpm/phdf/ether.phdf

!
! Three protocols in stack: Ether/IP/IP
!
class-map type stack match-all ICMP_IN_IPIP_IN_ETHER
stack-start l2-start
match field ETHER type eq 0x800 next IP
match field layer 2 IP protocol eq 4 next IP
match field layer 3 IP protocol eq 0 mask 0xFF next ICMP

!
! Access-Control Policy, matches on the ICMP payload
!
class-map type access-control match-all ICMP_ECHO_STRING_AAAA
match field ICMP type eq 8
match start ICMP payload-start offset 0 size 256 regex ".*[aA][aA][aA][aA].*"

!
! Access-Control Policy-Map
!
policy-map type access-control LOG_ICMP_ECHO
class ICMP_ECHO_STRING_AAAA
log

!
! Outer policy map, matches the protocol stack.
!
policy-map type access-control STACK_POLICY
class ICMP_IN_IPIP_IN_ETHER
service-policy LOG_ICMP_ECHO

Next, a basic verification of our configuration. We send ICMP packets loaded with “A” characters (ASCII code 0x41) from SW1 to SW2 and see if they match the policy-map on R6:

Rack1SW1#ping 192.168.0.8 data 4141          

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.0.8, timeout is 2 seconds:
Packet has data pattern 0x4141
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/3/4 ms
Rack1SW1#
SCRack9AS>6
[Resuming connection 6 to R6 ... ]

Rack1R6#show policy-map type access-control interface fastEthernet 0/0
FastEthernet0/0

Service-policy access-control input: STACK_POLICY

Class-map: ICMP_IN_IPIP_IN_ETHER (match-all)
5 packets, 670 bytes
5 minute offered rate 0 bps
Match: field ETHER type eq 0x800 next IP
Match: field layer 2 IP protocol eq 4 next IP
Match: field layer 3 IP protocol eq 0 mask 0xFF next ICMP

Service-policy access-control : LOG_ICMP_ECHO

Class-map: ICMP_ECHO_STRING_AAAA (match-all)
5 packets, 670 bytes
5 minute offered rate 0 bps
Match: field ICMP type eq 8
Match: start ICMP payload-start offset 0 size 256 regex ".*[aA][aA][aA][aA].*"
log

Class-map: class-default (match-any)
0 packets, 0 bytes
5 minute offered rate 0 bps, drop rate 0 bps
Match: any

Class-map: class-default (match-any)
0 packets, 0 bytes
5 minute offered rate 0 bps, drop rate 0 bps
Match: any
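To see why the test ping above hits the class, here is a minimal Python sketch of the payload check. It only illustrates the matching logic and is not how IOS implements FPM. It assumes the echo payload is simply the 2-byte data pattern repeated, and that a 100-byte datagram leaves 72 bytes of ICMP payload; the helper names echo_payload and payload_matches are hypothetical.

```python
import re

# The regex from the ICMP_ECHO_STRING_AAAA class-map, applied to raw bytes.
PATTERN = re.compile(rb".*[aA][aA][aA][aA].*", re.DOTALL)


def echo_payload(pattern_hex: str, length: int) -> bytes:
    """Build a payload by repeating the ping data pattern (assumed layout)."""
    unit = bytes.fromhex(pattern_hex)
    repeats = length // len(unit) + 1
    return (unit * repeats)[:length]


def payload_matches(payload: bytes, offset: int = 0, size: int = 256) -> bool:
    """Apply the regex to the window [offset, offset + size) of the payload."""
    window = payload[offset:offset + size]
    return PATTERN.match(window) is not None


if __name__ == "__main__":
    # "ping ... data 4141" yields a run of 'A' characters, so the class matches.
    print(payload_matches(echo_payload("4141", 72)))
    # A pattern of 0x4142 ("ABAB...") never contains four 'A's in a row.
    print(payload_matches(echo_payload("4142", 72)))
```

Under these assumptions the 0x4141 pattern matches and a pattern such as 0x4142 does not, which is a quick way to produce a negative test from the same ping command.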

A quick final note on using the log statement with access-control policy-maps. The rate-limiting mechanism for it is the same as the one used for access-lists, and is configured using the commands ip access-list log-update|logging.

Further Reading:

Flexible Packet Matching
FPM Deployment Guide

PS
The configuration in this post has been implemented and tested using IOS 12.4(24)T on a 2811 ISR. You may encounter bugs and errant behavior in earlier IOS releases, and I suggest you always use the latest image for FPM implementation.

Nov
04

The NBAR protocol classification feature has long supported enhanced HTTP URL matching. However, the Cisco documentation site has never provided a detailed description of the pattern language used for URL matching, nor has it explained how the engine matches client/server data streams. In this post we will give an overview of how NBAR works with URL filtering.

We will arrange this post in a FAQ manner as follows.

Q1: What is the syntax for matching the URLs?
A1: The syntax resembles regular expressions, but it is not actually regular expression syntax. Rather, it is more similar to globs or wildcard special characters. The pattern you configure is matched against the string found after the “GET”, “POST” or “PUT” method in the HTTP request packet. Note that NBAR is smart enough to remove the leading “/” in the file path. However, the HTTP request must end with “\r\n” or the NBAR StILE (stateful inspection language engine) will not recognize it (an easy way to fool the inspection engine)!

Here is the list of the available wildcards:

“*” - match any pattern, e.g. “aaa”, “abcd1234”. To match the substring “xyz” at the beginning of a string, use the pattern “xyz*”. To match “xyz” anywhere in the string, use the pattern “*xyz*”. Use the pattern “*xyz” to match the substring at the end. Note that the pattern “xyz” matches only the exact string “xyz”. You may also use more complicated patterns such as “ab*cd*ef”, probably at the expense of some CPU penalty.
“?” - match any single character. For example, the pattern “???” matches “xyz”, “abc”, “efg”, “123”. You can mix “*” and “?”, as in the pattern “tes*.????”.
“[]” - match a range of characters. E.g. “[abc]” will match any single character “a”, “b” or “c”.
“|” - alternative. Separate patterns with “|” in order to specify “OR” matching logic. For example, “xyz|abc” matches either the full “xyz” or “abc” strings. You may mix “|” with other globs like this: “*xyz*|*abc*|*pqr*”. Not sure about the overall limit on using the “|” symbol, but you’d better keep it to a minimum in order to make matching faster.
“()” - grouping. Denotes the boundaries of a sub-pattern. For example, instead of “*.txt|*.bin” you can write “*.(txt|bin)”.

All matching is case insensitive. So the pattern “text” matches “TEXT” as well. The engine matches your URL pattern against the directory path and the file name in the URL. E.g. If the URL is “http://www.cisco.com/pub/uploads/image.jpeg” the URL matching will only use the “pub/uploads/image.jpeg” part of the URL. As a matter of fact, when you submit request like the above URL, it translates into the following headers (there are actually more, but this is the bare minimum):

GET /pub/uploads/image.jpeg HTTP/1.1
Host: www.cisco.com
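The wildcard rules above can be approximated with a short Python sketch that translates a pattern into a regular expression and tests it against the request path. This is an illustration of the described behavior only, not NBAR's actual implementation; the function names nbar_glob_to_regex and url_matches are hypothetical, and characters inside “[]” are taken literally (no a-z style ranges).

```python
import re


def nbar_glob_to_regex(pattern: str) -> re.Pattern:
    """Translate the wildcards described above into a Python regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "*":
            parts.append(".*")            # any sequence of characters
        elif ch == "?":
            parts.append(".")             # exactly one character
        elif ch in "()|":
            parts.append(ch)              # grouping and alternation pass through
        elif ch == "[" and pattern.find("]", i + 2) != -1:
            end = pattern.find("]", i + 2)
            # Characters in the set are literal in this sketch.
            parts.append("[" + re.escape(pattern[i + 1:end]) + "]")
            i = end
        else:
            parts.append(re.escape(ch))   # everything else is literal
        i += 1
    # Matching is case insensitive, as stated in the text.
    return re.compile("(?:" + "".join(parts) + ")", re.IGNORECASE)


def url_matches(pattern: str, request_target: str) -> bool:
    """Match against the path with the leading "/" removed."""
    if request_target.startswith("/"):
        request_target = request_target[1:]
    return nbar_glob_to_regex(pattern).fullmatch(request_target) is not None


if __name__ == "__main__":
    print(url_matches("*.(txt|bin)", "/pub/uploads/image.jpeg"))
    print(url_matches("*.(txt|bin)", "/pub/uploads/NOTES.TXT"))
```

One detail the sketch makes visible: because “?” stands for exactly one character, a sub-pattern such as “t?xt” matches “text” but not “txt”.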

Q2: What if I want to match the host name used in the URL?
A2: You need to match the “Host” header field then. Use the command match protocol http host [pattern]. You can use the same glob patterns that you use for matching the filename in the URL.

Here is a good example: Using NBAR for application filtering

Q3: How does NBAR actually classify traffic flows?
A3: When you apply a policy-map that contains a class with a “match protocol” statement, the system starts the NBAR classification engine on the interface. Any packet, be it ingress or egress, passes through the NBAR inspection engine provided that it passes the basic filters, such as matching the port number assigned to the protocol. You can change the port map using the global command ip nbar port-map [protocol].

When the engine sees a TCP SYN packet for the matching session, it starts the internal state machine, trying to parse the packet flow. Every new packet in the flow (in any direction) is inspected. Note that the NBAR does not classify a flow instantly. It may take some packet exchange until the engine determines that the flow matches specified criteria. As soon as there is enough information about the flow to classify it, the engine “tags” the bi-directional flow with the corresponding class value and reports this decision to the policy-map. Note that this classification applies in both directions – that is to the packets belonging to the flow and heading either ingress or egress.

For example, if a user starts a web session and opens a URL matching any of your NBAR criteria, the engine will classify the flow as soon as it sees the packet with the URL string. After this, packets flowing from the client to the server and from the server to the client both match the respective class; that is, the policy map could be applied in either direction on the interface.

The engine will remove the “tag” when it encounters the flow “end” criteria, e.g. when it catches a TCP FIN or TCP RST flag. After this, the flow is no longer monitored by NBAR or reported as matching the respective class.

Consider the example below. In this scenario, Fa0/0 is the outside interface, facing the internet. User’s traffic flows out of this interface.

class-map match-all TEST
match protocol http url "*.(t?xt|ocx|ex[ea])"
!
policy-map TEST
class TEST
drop
!
interface FastEthernet0/0
ip address 155.1.37.3 255.255.255.0
service-policy input TEST

The policy is applied inbound, which is opposite to the direction of the users' GET requests. Even so, as soon as a user opens a URL matching the class-map specification, the engine will classify the flow as matching the class “TEST”. After this, all returning packets (server to client) for this flow will be dropped by the policy map.

Note that even though this works, it is not the recommended way of applying the URL filtering. This is because the actual GET requests will still get to the destination server and generate response traffic. In some situations, the client may send a FIN packet when server generates the response. At this point, NBAR will report the flow as destroyed and the policy-map will not match the packets. If the server response is slow enough, it will actually bypass the ingress filter! Thus, it is always recommended to apply the NBAR classification and policy action in the direction matching the direction of the protocol commands (e.g. apply the URL filtering in the direction of GET commands).
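The tagging behavior and the bypass described above can be modeled with a small Python sketch. It is a toy model under stated assumptions, not NBAR internals: the class name FlowTagger is hypothetical, a flow is keyed only by its two endpoints, and the packet that carries the FIN or RST is assumed to still be reported as matching before the tag is removed.

```python
class FlowTagger:
    """Toy model of bi-directional flow tagging."""

    def __init__(self, classify):
        self.classify = classify      # callable(payload: str) -> bool
        self.tagged = set()           # flows currently tagged with the class

    def packet(self, src, dst, payload="", flags=()):
        """Return True if this packet is reported as matching the class."""
        flow = frozenset((src, dst))  # same key for both directions
        if flow not in self.tagged and self.classify(payload):
            self.tagged.add(flow)     # enough information seen: tag the flow
        matched = flow in self.tagged
        if "FIN" in flags or "RST" in flags:
            self.tagged.discard(flow) # flow "end" criteria: remove the tag
        return matched


if __name__ == "__main__":
    client, server = ("10.0.0.5", 40000), ("192.0.2.10", 80)  # example endpoints
    nbar = FlowTagger(lambda payload: "setup.exe" in payload.lower())

    print(nbar.packet(client, server, "GET /setup.exe HTTP/1.1\r\n"))  # request tags the flow
    print(nbar.packet(client, server, flags=("FIN",)))                 # client closes early
    print(nbar.packet(server, client, "HTTP/1.1 200 OK"))              # late reply is no longer tagged
```

In this model the late server reply is not reported as matching, which is the reason for applying the policy in the direction of the GET commands.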

Q4: Are there any other matching criteria besides matching the URL or Host name?
A4: Yes, you can match a number of client/server HTTP headers. Better than that, you can match the content type in server replies, e.g.

match protocol http mime text/plain.

However this type of matching relies on the server to return the proper MIME encoding type in response using the “Content-Type” header. The benefit is the ability to block a whole class of files (e.g. image/jpeg) with a single match criterion.

Q5: Is NBAR good for production deployment?
A5: No. In fact, NBAR is OK if you need some quick and dirty hack, but it is not to be used as a content filtering engine. There is a bunch of reasons:

1) NBAR inspection consumes too much router CPU resources.
2) NBAR inspection is easy to trick. For example, by simply omitting the “\r\n” at the end of the string you can make the engine believe it is not a URL request.
3) The NBAR engine is buggy. Sometimes you may find that your class matches much more traffic than you wanted and drops really important packets. Ooops! This happens all the time.
4) NBAR does not perform true TCP stream reassembly, removing duplicates and assembling fragments. This makes it totally vulnerable to any kind of tricky TCP/IP attacks.

So if you want some real content security, use the specialized solutions :)

May
08

Hi Brian,

Can we use NBAR on the gateway router to prevent internal users from watching video streams from any video web site (like Youtube.com)?

Ahmed

Hi Ahmed,

Yes, NBAR can be used to apply application based filters such as blocking youtube.com traffic. To accomplish this we can categorize traffic based on the HTTP hostname. Next we will create a policy-map that matches the youtube.com class and drops the traffic. Lastly the policy is applied outbound to the Internet. Syntax-wise this would read:

R1#
class-map match-all YOUTUBE
match protocol http host "*youtube.com*"
!
policy-map DROP_YOUTUBE
class YOUTUBE
drop
!
interface FastEthernet0/0
description TO INTERNET
service-policy output DROP_YOUTUBE

NBAR for HTTP can also be used to match based on URL string or IANA MIME type. For more information see:

Network-Based Application Recognition and Distributed Network-Based Application Recognition

May
05

Hi Brian,

I'm having a problem with Workbook Volume 1 Version 4.1. ORF (Outbound Route Filtering) isn't working for me. Any help would be appreciated.

Thank you,

JoeT

Hi Joe,

First off let’s talk a little bit about what BGP ORF (Outbound Route Filtering) is designed to do for us, and then we’ll take a look at some implementation examples.

From a customer’s point of view there is typically a limited number of choices for what routes you can receive from your Service Provider via BGP. Usually the Service Provider will give the customer the option of sending them a full table view (currently about 260,000 prefixes), just a default route, or some specific subset of the table such as a default route and the Service Provider’s locally originated prefixes. In other words, a BGP Service Provider generally will not implement a complex outbound filtering policy for the customer. Instead, if the customer wants to receive just a subset view of the BGP table, the Customer Edge (CE) router has to filter prefixes inbound as they are received from the upstream Provider Edge (PE) router.

From the SP’s point of view this is the optimal design for administration. They don’t need to worry about change requests constantly coming from the customer about what routes they want to see and what routes they don’t want to see. Likewise, from the customer’s point of view this is the optimal administrative design, as they do not need to send change control requests to the provider, and can arbitrarily change their filtering design on the fly. However, from a device resource point of view this is not optimal for either the PE or the CE router. The SP’s PE router must still send the full BGP table to the customer, even if the CE router filters out 99% of it. Likewise, the CE router must still process all of the BGP UPDATE messages, even if the majority of them are ultimately filtered out.

Let’s take a look at the result of this in the context of the following design:

AS100 -- AS200
(PE) -– (CE)

AS 200 has an upstream peering to its Service Provider, AS 100. The BGP table of AS 200 appears as follows:

AS200_CE#show ip bgp
BGP table version is 12, local router ID is 10.0.0.200
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network Next Hop Metric LocPrf Weight Path
*> 0.0.0.0 10.0.0.100 0 0 100 i
*> 28.119.16.0/24 10.0.0.100 0 100 54 i
*> 28.119.17.0/24 10.0.0.100 0 100 54 i
*> 112.0.0.0 10.0.0.100 0 100 54 50 60 i
*> 113.0.0.0 10.0.0.100 0 100 54 50 60 i
*> 114.0.0.0 10.0.0.100 0 100 54 i
*> 115.0.0.0 10.0.0.100 0 100 54 i
*> 116.0.0.0 10.0.0.100 0 100 54 i
*> 117.0.0.0 10.0.0.100 0 100 54 i
*> 118.0.0.0 10.0.0.100 0 100 54 i
*> 119.0.0.0 10.0.0.100 0 100 54 i

Let’s suppose that from AS 200’s perspective the only routes that they want to receive from AS 100 are the default route plus the networks 28.119.16.0/24 and 28.119.17.0/24. Traditional filtering would dictate that on the CE router a prefix-list would be configured and applied as follows:

router bgp 200
neighbor 10.0.0.100 remote-as 100
neighbor 10.0.0.100 prefix-list AS_100_INBOUND in
!
ip prefix-list AS_100_INBOUND seq 5 permit 0.0.0.0/0
ip prefix-list AS_100_INBOUND seq 10 permit 28.119.16.0/24
ip prefix-list AS_100_INBOUND seq 15 permit 28.119.17.0/24

The result of this configuration in AS 200’s BGP table is as follows:

AS200_CE#show ip bgp
BGP table version is 4, local router ID is 10.0.0.200
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network Next Hop Metric LocPrf Weight Path
*> 0.0.0.0 10.0.0.100 0 0 100 i
*> 28.119.16.0/24 10.0.0.100 0 100 54 i
*> 28.119.17.0/24 10.0.0.100 0 100 54 i

Although the filtering goal is achieved, efficiency is not. From the below debug output we can see exactly how AS 200’s CE router processes the updates from the upstream PE and makes a decision on what to install:

AS200_CE#debug ip bgp updates
BGP updates debugging is on for address family: IPv4 Unicast
AS200_CE#clear ip bgp 100
%BGP-5-ADJCHANGE: neighbor 10.0.0.100 Down User reset
%BGP-5-ADJCHANGE: neighbor 10.0.0.100 Up
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, metric 0, path 100
BGP(0): 10.0.0.100 rcvd 0.0.0.0/0
BGP(0): Revise route installing 1 of 1 routes for 0.0.0.0/0 -> 10.0.0.100(main) to main IP table
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, path 100 54
BGP(0): 10.0.0.100 rcvd 115.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): 10.0.0.100 rcvd 114.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, path 100 54
BGP(0): 10.0.0.100 rcvd 119.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): 10.0.0.100 rcvd 118.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): 10.0.0.100 rcvd 117.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): 10.0.0.100 rcvd 116.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, path 100 54
BGP(0): 10.0.0.100 rcvd 28.119.17.0/24
BGP(0): 10.0.0.100 rcvd 28.119.16.0/24
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, path 100 54 50 60
BGP(0): 10.0.0.100 rcvd 113.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): 10.0.0.100 rcvd 112.0.0.0/8 -- DENIED due to: distribute/prefix-list;
BGP(0): Revise route installing 1 of 1 routes for 28.119.16.0/24 -> 10.0.0.100(main) to main IP table
BGP(0): Revise route installing 1 of 1 routes for 28.119.17.0/24 -> 10.0.0.100(main) to main IP table
AS200_CE#

Note that the AS200_CE router generates the log message “DENIED due to: distribute/prefix-list;” for every prefix that is filtered. This means that if this were the public BGP table of 260,000+ routes this router would have to process every update message just to discard it. This is where BGP Outbound Route Filtering (ORF) comes in.

Outbound Route Filtering Capability for BGP-4 is currently an IETF draft (http://www.ietf.org/internet-drafts/draft-ietf-idr-route-filter-16.txt) that describes an optimization on how prefix filtering can occur between a Customer Edge (CE) router and a Provider Edge (PE) router that are exchanging IPv4 unicast BGP prefixes. In the design we saw above the upstream PE router sent the full BGP table downstream to the CE router, and filtering was applied inbound on the downstream CE. With BGP ORF the downstream CE router dynamically tells the upstream PE router what routes to filter *outbound*. This means that the downstream CE router will only receive update messages about the prefixes that it wants.
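The difference between the two approaches can be sketched in a few lines of Python, using the prefixes from the example BGP table. This is only a model of where the filter is evaluated, not of the BGP protocol itself; the function names are hypothetical and the prefix-list is treated as exact-match entries (no ge/le).

```python
import ipaddress

# Prefixes the PE holds, taken from the example "show ip bgp" output.
PE_TABLE = [
    "0.0.0.0/0", "28.119.16.0/24", "28.119.17.0/24",
    "112.0.0.0/8", "113.0.0.0/8", "114.0.0.0/8", "115.0.0.0/8",
    "116.0.0.0/8", "117.0.0.0/8", "118.0.0.0/8", "119.0.0.0/8",
]

# The CE's AS_100_INBOUND prefix-list.
AS_100_INBOUND = ["0.0.0.0/0", "28.119.16.0/24", "28.119.17.0/24"]


def permitted(prefix, prefix_list):
    """Exact-match check of a prefix against a list of permit entries."""
    net = ipaddress.ip_network(prefix)
    return any(net == ipaddress.ip_network(entry) for entry in prefix_list)


def inbound_filtering(pe_table, prefix_list):
    """PE sends everything; the CE filters on receipt."""
    received = list(pe_table)
    installed = [p for p in received if permitted(p, prefix_list)]
    return received, installed


def outbound_route_filtering(pe_table, prefix_list):
    """CE pushes its prefix-list to the PE; the PE filters before sending."""
    received = [p for p in pe_table if permitted(p, prefix_list)]
    return received, list(received)


if __name__ == "__main__":
    for name, fn in (("inbound", inbound_filtering), ("ORF", outbound_route_filtering)):
        received, installed = fn(PE_TABLE, AS_100_INBOUND)
        print(name, "received:", len(received), "installed:", len(installed))
```

Both functions install the same three prefixes; the only difference is how many prefixes the CE has to receive and process to get there.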

Implementation wise the first step of this feature is for the BGP neighbors to negotiate that they both support the BGP ORF capability. Configuration-wise this looks as follows:

AS100_PE#
router bgp 100
neighbor 10.0.0.200 remote-as 200
!
address-family ipv4
neighbor 10.0.0.200 capability orf prefix-list receive
neighbor 204.12.25.254 activate
exit-address-family

AS200_CE#
router bgp 200
neighbor 10.0.0.100 remote-as 100
!
address-family ipv4
neighbor 10.0.0.100 capability orf prefix-list send
neighbor 10.0.0.100 prefix-list AS_100_INBOUND in
exit-address-family
!

The result of this configuration on AS 200’s CE is the same, however the behind the scenes mechanism by which it is accomplished is different. First, AS100_PE and AS200_CE negotiate the BGP ORF capability during initial BGP peering establishment. The success of this negotiation can be seen as follows.

AS100_PE#show ip bgp neighbors 10.0.0.200 | begin AF-dependant capabilities:
AF-dependant capabilities:
Outbound Route Filter (ORF) type (128) Prefix-list:
Send-mode: received
Receive-mode: advertised
Outbound Route Filter (ORF): received (3 entries)
Sent Rcvd
Prefix activity: ---- ----
Prefixes Current: 2 0
Prefixes Total: 2 0
Implicit Withdraw: 0 0
Explicit Withdraw: 0 0
Used as bestpath: n/a 0
Used as multipath: n/a 0

Outbound Inbound
Local Policy Denied Prefixes: -------- -------
ORF prefix-list: 8 n/a
Total: 8 0
Number of NLRIs in the update sent: max 4, min 2

*OUTPUT OMITTED*

AS200_CE#show ip bgp neighbors 10.0.0.100 | begin AF-dependant capabilities:
AF-dependant capabilities:
Outbound Route Filter (ORF) type (128) Prefix-list:
Send-mode: advertised
Receive-mode: received
Outbound Route Filter (ORF): sent;
Incoming update prefix filter list is AS_100_INBOUND
Sent Rcvd
Prefix activity: ---- ----
Prefixes Current: 0 3 (Consumes 156 bytes)
Prefixes Total: 0 4
Implicit Withdraw: 0 1
Explicit Withdraw: 0 0
Used as bestpath: n/a 3
Used as multipath: n/a 0

Outbound Inbound
Local Policy Denied Prefixes: -------- -------
Suppressed duplicate: 0 1
Bestpath from this peer: 3 n/a
Total: 3 1
Number of NLRIs in the update sent: max 0, min 0

*OUTPUT OMITTED*

Next, AS 200’s CE router tells AS 100’s PE router which prefixes it wants to receive. The logic of this configuration is that AS 200 is “sending” a prefix-list of what routes it wants, while AS 100 is “receiving” the prefix-list of what the downstream neighbor wants. The reception of the prefix-list by the upstream PE can be verified as follows.

AS100_PE#show ip bgp neighbors 10.0.0.200 received prefix-filter
Address family: IPv4 Unicast
ip prefix-list 10.0.0.200: 3 entries
seq 5 permit 0.0.0.0/0
seq 10 permit 28.119.16.0/24
seq 15 permit 28.119.17.0/24

AS100_PE#show ip prefix-list

Note that AS 100’s PE router received the list from AS 200’s CE, but the prefix-list does not show up locally in the running config. AS 100’s PE router then turns around and uses the prefix-list as an outbound filter towards the downstream CE. This can be verified two ways, by viewing the UPDATE messages on the downstream CE, and by looking at what the upstream PE is sending.

AS100_PE#show ip bgp neighbors 10.0.0.200 advertised-routes
BGP table version is 11, local router ID is 10.0.0.100
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Originating default network 0.0.0.0

Network Next Hop Metric LocPrf Weight Path
*> 28.119.16.0/24 204.12.25.254 0 0 54 i
*> 28.119.17.0/24 204.12.25.254 0 0 54 i

Total number of prefixes 2
AS100_PE#

AS200_CE#debug ip bgp updates
BGP updates debugging is on for address family: IPv4 Unicast
AS200_CE#clear ip bgp 100
AS200_CE#
BGP(0): no valid path for 0.0.0.0/0
BGP(0): no valid path for 28.119.16.0/24
BGP(0): no valid path for 28.119.17.0/24
%BGP-5-ADJCHANGE: neighbor 10.0.0.100 Down User reset
BGP(0): nettable_walker 0.0.0.0/0 no best path
BGP(0): nettable_walker 28.119.16.0/24 no best path
BGP(0): nettable_walker 28.119.17.0/24 no best path
%BGP-5-ADJCHANGE: neighbor 10.0.0.100 Up
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, metric 0, path 100
BGP(0): 10.0.0.100 rcvd 0.0.0.0/0
BGP(0): Revise route installing 1 of 1 routes for 0.0.0.0/0 -> 10.0.0.100(main) to main IP table
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, path 100 54
BGP(0): 10.0.0.100 rcvd 28.119.17.0/24
BGP(0): 10.0.0.100 rcvd 28.119.16.0/24
BGP(0): Revise route installing 1 of 1 routes for 28.119.16.0/24 -> 10.0.0.100(main) to main IP table
BGP(0): Revise route installing 1 of 1 routes for 28.119.17.0/24 -> 10.0.0.100(main) to main IP table
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, metric 0, path 100
BGP(0): 10.0.0.100 rcvd 0.0.0.0/0...duplicate ignored
AS200_CE#

Note that the above output is different from the previous debug of AS 200’s CE, because now it does not receive the extra update messages. AS 200 instead now receives only the routes that it has requested of the upstream PE.

If edits of the filter are required the downstream CE can change the prefix-list, and then through the BGP Route Refresh capability, advertise the new prefix-list upstream to the PE to be used as a new downstream filter. Configuration wise this is accomplished as follows.

AS200_CE#conf t
Enter configuration commands, one per line. End with CNTL/Z.
AS200_CE(config)#ip prefix-list AS_100_INBOUND permit 114.0.0.0/8
AS200_CE(config)#end
AS200_CE#
%SYS-5-CONFIG_I: Configured from console by console
AS200_CE#clear ip bgp 100 in prefix-filter
AS200_CE#
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, path 100 54
BGP(0): 10.0.0.100 rcvd 114.0.0.0/8
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, path 100 54
BGP(0): 10.0.0.100 rcvd 28.119.17.0/24...duplicate ignored
BGP(0): 10.0.0.100 rcvd 28.119.16.0/24...duplicate ignored
BGP(0): Revise route installing 1 of 1 routes for 114.0.0.0/8 -> 10.0.0.100(main) to main IP table
BGP(0): 10.0.0.100 rcvd UPDATE w/ attr: nexthop 10.0.0.100, origin i, metric 0, path 100
BGP(0): 10.0.0.100 rcvd 0.0.0.0/0...duplicate ignored
AS200_CE#

From the “debug ip bgp updates” output we can now see that the upstream PE added the update 114.0.0.0/8, in addition to the previous three prefixes that were installed. Upstream verification on the PE also indicates this.

AS100_PE#show ip bgp neighbors 10.0.0.200 received prefix-filter
Address family: IPv4 Unicast
ip prefix-list 10.0.0.200: 4 entries
seq 5 permit 0.0.0.0/0
seq 10 permit 28.119.16.0/24
seq 15 permit 28.119.17.0/24
seq 20 permit 114.0.0.0/8
AS100_PE#show ip bgp neighbors 10.0.0.200 advertised-routes
BGP table version is 11, local router ID is 10.0.0.100
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Originating default network 0.0.0.0

Network Next Hop Metric LocPrf Weight Path
*> 28.119.16.0/24 204.12.25.254 0 0 54 i
*> 28.119.17.0/24 204.12.25.254 0 0 54 i
*> 114.0.0.0 204.12.25.254 0 54 i

Total number of prefixes 3
AS100_PE#

For more information on this feature:

Outbound Route Filtering Capability for BGP-4
http://www.ietf.org/internet-drafts/draft-ietf-idr-route-filter-16.txt

BGP Prefix-Based Outbound Route Filtering
http://www.cisco.com/en/US/docs/ios/12_2t/12_2t11/feature/guide/ft11borf.html

Feb
19

UPDATE: For more information on Redistribution see the video series Understanding Route Redistribution – Excerpts from CCIE R&S ATC

Simple Redistribution Step-by-Step

We're going to take our basic topology from the previous post, Understanding Redistribution Part I, and configure it to provide full connectivity between all devices with the simplest possible configuration. Then we are going to tweak some settings and see how they affect redistribution and optimal routing. This is going to be an introductory example to illustrate the redistribution control techniques mentioned previously.

First, we configure IGP routing per the diagram, and advertise Loopback0 interfaces (150.X.Y.Y/24, where X is the rack number, and Y is the router number) into IGP. Specifically, R2, R3 and R4 Loopbacks are advertised into OSPF Area 0. R5 and R6 Loopbacks are advertised into EIGRP 356 and R1 Loopback interface is simply advertised into EIGRP 123.

Next, we propose OSPF to be the core routing domain, used as transit path to reach any other domains. All other domains, in effect, will be connected as stub (non-transit) to the core domain. We start with EIGRP 123 and OSPF domains, and enable mutual redistribution between them on R2 and R3 routers:

R2:
!
! Metric values are chosen just to be easy to type in
! No big secret rules of thumb behind this one
!
router eigrp 123
redistribute ospf 1 metric 1 1 1 1 1
!
router ospf 1
redistribute eigrp 123 subnets

R3:
router eigrp 123
redistribute ospf 1 metric 1 1 1 1 1
!
router ospf 1
redistribute eigrp 123 subnets

In effect, we would expect both routing domains to become transit for each other. However, EIGRP has a really nice feature of assigning default AD value of 170 to external routes. This, in turn, prevents circular redistribution - all OSPF routes injected into EIGRP 123 domain have AD of 170, and are effectively stopped from being advertised back into OSPF, since native OSPF routes preempt them. Let's see the routing table states:

Rack18R1#show ip route eigrp
174.18.0.0/24 is subnetted, 2 subnets
D EX 174.18.234.0
[170/2560002816] via 174.18.123.3, 00:02:03, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:02:03, FastEthernet0/0
150.18.0.0/24 is subnetted, 4 subnets
D EX 150.18.4.0
[170/2560002816] via 174.18.123.3, 00:02:03, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:02:03, FastEthernet0/0
D EX 150.18.2.0
[170/2560002816] via 174.18.123.3, 00:02:03, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:02:03, FastEthernet0/0
D EX 150.18.3.0
[170/2560002816] via 174.18.123.3, 00:02:03, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:02:03, FastEthernet0/0

Just as we would expect, R1 has two routes for each OSPF prefix, since both R2 and R3 assign the same seed metric to redistributed routes.

Rack18R2#show ip route | beg Gate
Gateway of last resort is not set

174.18.0.0/24 is subnetted, 2 subnets
C 174.18.234.0 is directly connected, Serial0/0
C 174.18.123.0 is directly connected, FastEthernet0/0
150.18.0.0/24 is subnetted, 4 subnets
O 150.18.4.0 [110/65] via 174.18.234.4, 00:00:19, Serial0/0
D 150.18.1.0 [90/156160] via 174.18.123.1, 00:00:13, FastEthernet0/0
C 150.18.2.0 is directly connected, Loopback0
O 150.18.3.0 [110/65] via 174.18.234.3, 00:00:19, Serial0/0

Rack18R3#show ip route | beg Gate
Gateway of last resort is not set

174.18.0.0/24 is subnetted, 3 subnets
C 174.18.234.0 is directly connected, Serial1/0
C 174.18.0.0 is directly connected, FastEthernet0/1
C 174.18.123.0 is directly connected, FastEthernet0/0
150.18.0.0/24 is subnetted, 6 subnets
O 150.18.4.0 [110/782] via 174.18.234.4, 00:00:54, Serial1/0
D 150.18.5.0 [90/156160] via 174.18.0.5, 00:17:51, FastEthernet0/1
D 150.18.6.0 [90/156160] via 174.18.0.6, 00:17:49, FastEthernet0/1
D 150.18.1.0 [90/156160] via 174.18.123.1, 00:00:48, FastEthernet0/0
O 150.18.2.0 [110/782] via 174.18.234.2, 00:00:54, Serial1/0
C 150.18.3.0 is directly connected, Loopback0

All seems to be fine here too; thanks to EIGRP AD value of 90, EIGRP "native" prefixes are reachable via EIGRP, and OSPF native subnets are reachable via OSPF (since EIGRP External AD value is 170). Next, we move a step further, and redistribute between OSPF and EIGRP 356 domains, i.e. attach the latter domain to the "core":

R3:
router eigrp 356
redistribute ospf 1 metric 1 1 1 1 1
!
router ospf 1
redistribute eigrp 123 subnets
redistribute eigrp 356 subnets

Since the EIGRP 356 domain has just one point of attachment to the core, there should be no big problems. Look at the routing table of R1:

Rack18R1#show ip route eigrp
174.18.0.0/24 is subnetted, 3 subnets
D EX 174.18.234.0
[170/2560002816] via 174.18.123.3, 00:05:52, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:05:52, FastEthernet0/0
D EX 174.18.0.0
[170/2560002816] via 174.18.123.2, 00:01:36, FastEthernet0/0
150.18.0.0/24 is subnetted, 6 subnets
D EX 150.18.4.0
[170/2560002816] via 174.18.123.3, 00:05:52, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:05:52, FastEthernet0/0
D EX 150.18.5.0
[170/2560002816] via 174.18.123.2, 00:01:36, FastEthernet0/0
D EX 150.18.6.0
[170/2560002816] via 174.18.123.2, 00:01:36, FastEthernet0/0
D EX 150.18.2.0
[170/2560002816] via 174.18.123.3, 00:05:52, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:05:52, FastEthernet0/0
D EX 150.18.3.0
[170/2560002816] via 174.18.123.3, 00:05:52, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:05:52, FastEthernet0/0

Ah, great. R1 sees the R5 and R6 loopbacks as advertised by R2 only. Naturally, redistribution between local routing processes on R3 is non-transitive: when we inject EIGRP 356 routes into OSPF, they are not further propagated into EIGRP 123, even though OSPF is redistributed into EIGRP 123. So R1's packets would have to transit the OSPF domain in order to reach R5 and R6.

R2 sees EIGRP 356 routes reachable via R3:

Rack18R2#show ip route | beg Gate
Gateway of last resort is not set

174.18.0.0/24 is subnetted, 3 subnets
C 174.18.234.0 is directly connected, Serial0/0
O E2 174.18.0.0 [110/20] via 174.18.234.3, 00:02:20, Serial0/0
C 174.18.123.0 is directly connected, FastEthernet0/0
150.18.0.0/24 is subnetted, 6 subnets
O 150.18.4.0 [110/65] via 174.18.234.4, 00:06:41, Serial0/0
O E2 150.18.5.0 [110/20] via 174.18.234.3, 00:02:20, Serial0/0
O E2 150.18.6.0 [110/20] via 174.18.234.3, 00:02:20, Serial0/0
D 150.18.1.0 [90/156160] via 174.18.123.1, 00:06:35, FastEthernet0/0
C 150.18.2.0 is directly connected, Loopback0
O 150.18.3.0 [110/65] via 174.18.234.3, 00:06:41, Serial0/0

And R3 in turn sees them as EIGRP 356 native ones:

Rack18R3#show ip route | beg Gate
Gateway of last resort is not set

174.18.0.0/24 is subnetted, 3 subnets
C 174.18.234.0 is directly connected, Serial1/0
C 174.18.0.0 is directly connected, FastEthernet0/1
C 174.18.123.0 is directly connected, FastEthernet0/0
150.18.0.0/24 is subnetted, 6 subnets
O 150.18.4.0 [110/782] via 174.18.234.4, 00:02:47, Serial1/0
D 150.18.5.0 [90/156160] via 174.18.0.5, 00:02:36, FastEthernet0/1
D 150.18.6.0 [90/156160] via 174.18.0.6, 00:02:36, FastEthernet0/1
D 150.18.1.0 [90/156160] via 174.18.123.1, 00:02:34, FastEthernet0/0
O 150.18.2.0 [110/782] via 174.18.234.2, 00:02:47, Serial1/0
C 150.18.3.0 is directly connected, Loopback0

Fine. Now, to finish the picture, we redistribute between RIP and OSPF on R4:

R4:
router ospf 1
redistribute rip subnets
!
! Assign some large metric value to redistributed routes
!
router rip
redistribute ospf 1 metric 8

Check to see the full routing table of R1:

Rack18R1#show ip route eigrp
D EX 222.22.2.0/24
[170/2560002816] via 174.18.123.3, 00:02:16, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:02:16, FastEthernet0/0
D EX 220.20.3.0/24
[170/2560002816] via 174.18.123.3, 00:02:16, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:02:16, FastEthernet0/0
174.18.0.0/24 is subnetted, 3 subnets
D EX 174.18.234.0
[170/2560002816] via 174.18.123.3, 00:05:17, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:05:17, FastEthernet0/0
D EX 174.18.0.0
[170/2560002816] via 174.18.123.2, 00:08:33, FastEthernet0/0
D EX 192.10.18.0/24
[170/2560002816] via 174.18.123.3, 00:02:16, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:02:16, FastEthernet0/0
150.18.0.0/24 is subnetted, 6 subnets
D EX 150.18.4.0
[170/2560002816] via 174.18.123.3, 00:05:17, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:05:17, FastEthernet0/0
D EX 150.18.5.0
[170/2560002816] via 174.18.123.2, 00:05:14, FastEthernet0/0
D EX 150.18.6.0
[170/2560002816] via 174.18.123.2, 00:05:15, FastEthernet0/0
D EX 150.18.2.0
[170/2560002816] via 174.18.123.3, 00:05:20, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:05:20, FastEthernet0/0
D EX 150.18.3.0
[170/2560002816] via 174.18.123.3, 00:05:20, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:05:20, FastEthernet0/0
D EX 205.90.31.0/24
[170/2560002816] via 174.18.123.3, 00:02:19, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:02:19, FastEthernet0/0

Seems like we have connectivity to all prefixes, even the ones injected by BB2 (205.90.31.0/24 etc). All of them, with the exception of the EIGRP 356 routes, are symmetrically reachable via R2 and R3. Now, a long snapshot of all other routing tables:

Rack18R2#show ip route | beg Gate
Gateway of last resort is not set

O E2 222.22.2.0/24 [110/20] via 174.18.234.4, 00:03:29, Serial0/0
O E2 220.20.3.0/24 [110/20] via 174.18.234.4, 00:03:29, Serial0/0
174.18.0.0/24 is subnetted, 3 subnets
C 174.18.234.0 is directly connected, Serial0/0
O E2 174.18.0.0 [110/20] via 174.18.234.3, 00:03:29, Serial0/0
C 174.18.123.0 is directly connected, FastEthernet0/0
O E2 192.10.18.0/24 [110/20] via 174.18.234.4, 00:03:29, Serial0/0
150.18.0.0/24 is subnetted, 6 subnets
O 150.18.4.0 [110/65] via 174.18.234.4, 00:03:29, Serial0/0
O E2 150.18.5.0 [110/20] via 174.18.234.3, 00:03:29, Serial0/0
O E2 150.18.6.0 [110/20] via 174.18.234.3, 00:03:29, Serial0/0
D 150.18.1.0 [90/156160] via 174.18.123.1, 00:19:36, FastEthernet0/0
C 150.18.2.0 is directly connected, Loopback0
O 150.18.3.0 [110/65] via 174.18.234.3, 00:03:29, Serial0/0
O E2 205.90.31.0/24 [110/20] via 174.18.234.4, 00:03:29, Serial0/0

Rack18R3#show ip route | beg Gate
Gateway of last resort is not set

O E2 222.22.2.0/24 [110/20] via 174.18.234.4, 00:03:56, Serial1/0
O E2 220.20.3.0/24 [110/20] via 174.18.234.4, 00:03:56, Serial1/0
174.18.0.0/24 is subnetted, 3 subnets
C 174.18.234.0 is directly connected, Serial1/0
C 174.18.0.0 is directly connected, FastEthernet0/1
C 174.18.123.0 is directly connected, FastEthernet0/0
O E2 192.10.18.0/24 [110/20] via 174.18.234.4, 00:03:56, Serial1/0
150.18.0.0/24 is subnetted, 6 subnets
O 150.18.4.0 [110/782] via 174.18.234.4, 00:03:56, Serial1/0
D 150.18.5.0 [90/156160] via 174.18.0.5, 00:06:58, FastEthernet0/1
D 150.18.6.0 [90/156160] via 174.18.0.6, 00:06:58, FastEthernet0/1
D 150.18.1.0 [90/156160] via 174.18.123.1, 00:06:57, FastEthernet0/0
O 150.18.2.0 [110/782] via 174.18.234.2, 00:03:56, Serial1/0
C 150.18.3.0 is directly connected, Loopback0
O E2 205.90.31.0/24 [110/20] via 174.18.234.4, 00:03:56, Serial1/0

Rack18R4#show ip route | beg Gate
Gateway of last resort is not set

R 222.22.2.0/24 [120/7] via 192.10.18.254, 00:00:19, FastEthernet0/0
R 220.20.3.0/24 [120/7] via 192.10.18.254, 00:00:19, FastEthernet0/0
174.18.0.0/24 is subnetted, 3 subnets
C 174.18.234.0 is directly connected, Serial0/0
O E2 174.18.0.0 [110/20] via 174.18.234.3, 00:05:11, Serial0/0
O E2 174.18.123.0 [110/20] via 174.18.234.3, 00:05:11, Serial0/0
[110/20] via 174.18.234.2, 00:05:11, Serial0/0
C 192.10.18.0/24 is directly connected, FastEthernet0/0
150.18.0.0/24 is subnetted, 6 subnets
C 150.18.4.0 is directly connected, Loopback0
O E2 150.18.5.0 [110/20] via 174.18.234.3, 00:05:11, Serial0/0
O E2 150.18.6.0 [110/20] via 174.18.234.3, 00:05:11, Serial0/0
O E2 150.18.1.0 [110/20] via 174.18.234.3, 00:05:11, Serial0/0
[110/20] via 174.18.234.2, 00:05:11, Serial0/0
O 150.18.2.0 [110/65] via 174.18.234.2, 00:05:11, Serial0/0
O 150.18.3.0 [110/65] via 174.18.234.3, 00:05:11, Serial0/0
R 205.90.31.0/24 [120/7] via 192.10.18.254, 00:00:20, FastEthernet0/0

A quick stop at R5. Since the RIPv2 AD is 120 and the EIGRP external AD is 170, R5 sees all non-EIGRP 356 routes via RIP. Not a big problem right now, but it creates "suboptimal" paths to EIGRP 123 routes, since it's more, well, effective to reach them over Ethernet links.

Rack18R5#sh ip route | beg Gate
Gateway of last resort is not set

R 222.22.2.0/24 [120/7] via 192.10.18.254, 00:00:25, FastEthernet0/1
R 220.20.3.0/24 [120/7] via 192.10.18.254, 00:00:25, FastEthernet0/1
174.18.0.0/24 is subnetted, 3 subnets
R 174.18.234.0 [120/8] via 192.10.18.4, 00:00:04, FastEthernet0/1
C 174.18.0.0 is directly connected, FastEthernet0/0
R 174.18.123.0 [120/8] via 192.10.18.4, 00:00:04, FastEthernet0/1
C 192.10.18.0/24 is directly connected, FastEthernet0/1
150.18.0.0/24 is subnetted, 6 subnets
R 150.18.4.0 [120/8] via 192.10.18.4, 00:00:04, FastEthernet0/1
C 150.18.5.0 is directly connected, Loopback0
D 150.18.6.0 [90/156160] via 174.18.0.6, 00:43:41, FastEthernet0/0
R 150.18.1.0 [120/8] via 192.10.18.4, 00:00:04, FastEthernet0/1
R 150.18.2.0 [120/8] via 192.10.18.4, 00:00:04, FastEthernet0/1
R 150.18.3.0 [120/8] via 192.10.18.4, 00:00:04, FastEthernet0/1
R 205.90.31.0/24 [120/7] via 192.10.18.254, 00:00:25, FastEthernet0/1

We may opt to create an access-list, select all "native" RIP routes, and pin their AD at 120. Next we can raise the global RIP distance to something bigger than 170 to fix this issue. Not a big deal, though! OK, so the only ones left are R6 and BB2:

Rack18R6#show ip route eigrp
D EX 222.22.2.0/24 [170/2560002816] via 174.18.0.3, 00:07:29, FastEthernet0/0
D EX 220.20.3.0/24 [170/2560002816] via 174.18.0.3, 00:07:29, FastEthernet0/0
174.18.0.0/24 is subnetted, 2 subnets
D EX 174.18.234.0
[170/2560002816] via 174.18.0.3, 00:10:31, FastEthernet0/0
D EX 192.10.18.0/24 [170/2560002816] via 174.18.0.3, 00:07:29, FastEthernet0/0
150.18.0.0/24 is subnetted, 5 subnets
D EX 150.18.4.0 [170/2560002816] via 174.18.0.3, 00:10:31, FastEthernet0/0
D 150.18.5.0 [90/156160] via 174.18.0.5, 00:41:30, FastEthernet0/0
D EX 150.18.2.0 [170/2560002816] via 174.18.0.3, 00:10:31, FastEthernet0/0
D EX 150.18.3.0 [170/2560002816] via 174.18.0.3, 00:10:31, FastEthernet0/0
D EX 205.90.31.0/24 [170/2560002816] via 174.18.0.3, 00:07:29, FastEthernet0/0

BB2>sh ip ro rip
174.18.0.0/24 is subnetted, 3 subnets
R 174.18.234.0 [120/8] via 192.10.18.4, 00:00:24, Ethernet0
R 174.18.0.0 [120/8] via 192.10.18.4, 00:00:24, Ethernet0
R 174.18.123.0 [120/8] via 192.10.18.4, 00:00:24, Ethernet0
150.18.0.0/24 is subnetted, 6 subnets
R 150.18.4.0 [120/8] via 192.10.18.4, 00:00:24, Ethernet0
R 150.18.5.0 [120/8] via 192.10.18.4, 00:00:24, Ethernet0
R 150.18.6.0 [120/8] via 192.10.18.4, 00:00:24, Ethernet0
R 150.18.1.0 [120/8] via 192.10.18.4, 00:00:24, Ethernet0
R 150.18.2.0 [120/8] via 192.10.18.4, 00:00:24, Ethernet0
R 150.18.3.0 [120/8] via 192.10.18.4, 00:00:24, Ethernet0

This seems to be an easy one. We're done: full connectivity attained, and no loops were introduced. Thanks to the EIGRP external AD there is no need to manually filter routes on the border routers between EIGRP 123 and OSPF. But what if we are asked to redistribute some "native" EIGRP 123 prefixes (e.g. R1 Loopback0) instead of advertising them?

R1:
router eigrp 123
redistribute connected
network 174.18.123.1 0.0.0.0

See what happens:

Rack18R2#show ip route | beg Gate
Gateway of last resort is not set

O E2 222.22.2.0/24 [110/20] via 174.18.234.4, 00:15:13, Serial0/0
O E2 220.20.3.0/24 [110/20] via 174.18.234.4, 00:15:13, Serial0/0
174.18.0.0/24 is subnetted, 3 subnets
C 174.18.234.0 is directly connected, Serial0/0
O E2 174.18.0.0 [110/20] via 174.18.234.3, 00:15:13, Serial0/0
C 174.18.123.0 is directly connected, FastEthernet0/0
O E2 192.10.18.0/24 [110/20] via 174.18.234.4, 00:15:13, Serial0/0
150.18.0.0/24 is subnetted, 6 subnets
O 150.18.4.0 [110/65] via 174.18.234.4, 00:15:13, Serial0/0
O E2 150.18.5.0 [110/20] via 174.18.234.3, 00:15:13, Serial0/0
O E2 150.18.6.0 [110/20] via 174.18.234.3, 00:15:13, Serial0/0
D EX 150.18.1.0 [170/156160] via 174.18.123.1, 00:01:49, FastEthernet0/0
C 150.18.2.0 is directly connected, Loopback0
O 150.18.3.0 [110/65] via 174.18.234.3, 00:15:13, Serial0/0
O E2 205.90.31.0/24 [110/20] via 174.18.234.4, 00:15:13, Serial0/0

R2 sees R1 Loopback0 as EIGRP external route reachable via R1. However, the situation is different on R3:

Rack18R3#show ip route | beg Gate
Gateway of last resort is not set

O E2 222.22.2.0/24 [110/20] via 174.18.234.4, 00:16:10, Serial1/0
O E2 220.20.3.0/24 [110/20] via 174.18.234.4, 00:16:10, Serial1/0
174.18.0.0/24 is subnetted, 3 subnets
C 174.18.234.0 is directly connected, Serial1/0
C 174.18.0.0 is directly connected, FastEthernet0/1
C 174.18.123.0 is directly connected, FastEthernet0/0
O E2 192.10.18.0/24 [110/20] via 174.18.234.4, 00:16:10, Serial1/0
150.18.0.0/24 is subnetted, 6 subnets
O 150.18.4.0 [110/782] via 174.18.234.4, 00:16:10, Serial1/0
D 150.18.5.0 [90/156160] via 174.18.0.5, 00:19:12, FastEthernet0/1
D 150.18.6.0 [90/156160] via 174.18.0.6, 00:19:12, FastEthernet0/1
O E2 150.18.1.0 [110/20] via 174.18.234.2, 00:02:46, Serial1/0
O 150.18.2.0 [110/782] via 174.18.234.2, 00:16:10, Serial1/0
C 150.18.3.0 is directly connected, Loopback0
O E2 205.90.31.0/24 [110/20] via 174.18.234.4, 00:16:10, Serial1/0

Because R1 Loopback0 is now redistributed into OSPF as well, R3 prefers the path via R2, not R1: the OSPF AD of 110 beats the EIGRP external AD of 170. Not a very big problem in this scenario, but in more complex cases this may lead to routing loops, since a suboptimal path may be chosen. In order to fix this, we may adjust the AD for the EIGRP prefix. However, there is a problem here: we can't change the EIGRP external AD selectively, only for all prefixes at once.

This is why we may choose to play with AD under OSPF. We may simply adjust the AD for the R1 Loopback0 prefix to a value of 171. However, we may go further and simulate the EIGRP feature with OSPF: specifically, make sure OSPF internal routes have an AD different from the external prefixes. For EIGRP the values are 90 and 170. What values should we pick for OSPF? Well, since OSPF is the core routing domain, it has more information available on what's going on. It's also the only transit domain, so we should make sure it has the highest preference among the others. So it makes sense to set the internal/external AD for OSPF to 80 and 160, lower than the EIGRP values. This ensures that internal routes are always preferred over external ones, while the core routing protocol is always the most trusted one.

R2 & R3:
access-list 1 permit 150.18.1.0
!
router ospf 1
distance ospf intra-area 80
distance ospf inter-area 80
distance ospf external 160
distance 171 0.0.0.0 255.255.255.255 1
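As a rough sketch of how these distances decide R3's choice for R1 Loopback0, the comparison can be modelled in a few lines. The helper names (`ospf_ad`, `winner`) are made up for illustration, and the assumption that the per-prefix `distance 171` entry overrides `distance ospf external 160` is inferred from the routing tables in this post rather than from any formal specification.

```python
# AD values configured on R2 and R3 in this post.
OSPF_AD = {"intra-area": 80, "inter-area": 80, "external": 160}
EIGRP_AD = {"internal": 90, "external": 170}
ACL_1 = {"150.18.1.0"}   # access-list 1 permit 150.18.1.0
ACL_1_AD = 171           # distance 171 0.0.0.0 255.255.255.255 1


def ospf_ad(prefix, route_type):
    """AD of an OSPF route: the per-prefix override wins over the route type."""
    if prefix in ACL_1:
        return ACL_1_AD
    return OSPF_AD[route_type]


def winner(candidates):
    """Return the protocol whose candidate has the lowest AD."""
    return min(candidates, key=lambda c: c[1])[0]


# R3's two candidates for R1 Loopback0 (150.18.1.0).
default_ads = [("ospf", 110), ("eigrp", EIGRP_AD["external"])]
tuned_ads = [("ospf", ospf_ad("150.18.1.0", "external")),
             ("eigrp", EIGRP_AD["external"])]

print(winner(default_ads))  # ospf: the suboptimal path via R2
print(winner(tuned_ads))    # eigrp: the direct path via R1
```

All other OSPF routes keep the 80/160 split, so only the one prefix named in access-list 1 loses to EIGRP.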

See how this works on R2 and R3:

Rack18R2#show ip route | beg Gate
Gateway of last resort is not set

O E2 222.22.2.0/24 [160/20] via 174.18.234.4, 00:00:07, Serial0/0
O E2 220.20.3.0/24 [160/20] via 174.18.234.4, 00:00:07, Serial0/0
174.18.0.0/24 is subnetted, 3 subnets
C 174.18.234.0 is directly connected, Serial0/0
O E2 174.18.0.0 [160/20] via 174.18.234.3, 00:00:35, Serial0/0
C 174.18.123.0 is directly connected, FastEthernet0/0
O E2 192.10.18.0/24 [160/20] via 174.18.234.4, 00:00:35, Serial0/0
150.18.0.0/24 is subnetted, 6 subnets
O 150.18.4.0 [80/65] via 174.18.234.4, 00:00:35, Serial0/0
O E2 150.18.5.0 [160/20] via 174.18.234.3, 00:00:35, Serial0/0
O E2 150.18.6.0 [160/20] via 174.18.234.3, 00:00:35, Serial0/0
D EX 150.18.1.0 [170/156160] via 174.18.123.1, 00:03:10, FastEthernet0/0
C 150.18.2.0 is directly connected, Loopback0
O 150.18.3.0 [80/65] via 174.18.234.3, 00:00:35, Serial0/0
O E2 205.90.31.0/24 [160/20] via 174.18.234.4, 00:00:07, Serial0/0

OK, just as it was before. Next comes R3:

Rack18R3#show ip route | beg Gate
Gateway of last resort is not set

O E2 222.22.2.0/24 [160/20] via 174.18.234.4, 00:00:55, Serial1/0
O E2 220.20.3.0/24 [160/20] via 174.18.234.4, 00:00:55, Serial1/0
174.18.0.0/24 is subnetted, 3 subnets
C 174.18.234.0 is directly connected, Serial1/0
C 174.18.0.0 is directly connected, FastEthernet0/1
C 174.18.123.0 is directly connected, FastEthernet0/0
O E2 192.10.18.0/24 [160/20] via 174.18.234.4, 00:01:14, Serial1/0
150.18.0.0/24 is subnetted, 6 subnets
O 150.18.4.0 [80/782] via 174.18.234.4, 00:01:14, Serial1/0
D 150.18.5.0 [90/156160] via 174.18.0.5, 00:37:14, FastEthernet0/1
D 150.18.6.0 [90/156160] via 174.18.0.6, 00:37:14, FastEthernet0/1
D EX 150.18.1.0 [170/156160] via 174.18.123.1, 00:03:58, FastEthernet0/0
O 150.18.2.0 [80/782] via 174.18.234.2, 00:01:14, Serial1/0
C 150.18.3.0 is directly connected, Loopback0
O E2 205.90.31.0/24 [160/20] via 174.18.234.4, 00:00:56, Serial1/0

All seems to look great, except for the fact that EIGRP 123 and EIGRP 356 have to transit the OSPF domain in order to reach each other. Though this is only true for R1, we still need to find a way to resolve this issue. What if we start exchanging routes on R3 between EIGRP 123 and EIGRP 356? This will not cause any new domain to become transit, because the EIGRP external AD will block the external routes from being re-injected into the core domain. This is why we may safely turn on mutual redistribution between the two domains.

R3:
router eigrp 123
redistribute eigrp 356
!
router eigrp 356
redistribute eigrp 123

Observe the effect on R1 now. Note the metrics assigned to the R5 and R6 Loopback prefixes: they are taken directly from the EIGRP 356 native prefix metric values.

Rack18R1#show ip route eigrp
D EX 222.22.2.0/24
[170/2560002816] via 174.18.123.3, 00:20:05, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:20:05, FastEthernet0/0
D EX 220.20.3.0/24
[170/2560002816] via 174.18.123.3, 00:20:05, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:20:05, FastEthernet0/0
174.18.0.0/24 is subnetted, 3 subnets
D EX 174.18.234.0
[170/2560002816] via 174.18.123.3, 00:35:22, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:35:22, FastEthernet0/0
D EX 174.18.0.0 [170/30720] via 174.18.123.3, 00:00:37, FastEthernet0/0
D EX 192.10.18.0/24
[170/2560002816] via 174.18.123.3, 00:20:33, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:20:33, FastEthernet0/0
150.18.0.0/24 is subnetted, 6 subnets
D EX 150.18.4.0
[170/2560002816] via 174.18.123.3, 00:20:33, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:20:33, FastEthernet0/0
D EX 150.18.5.0 [170/158720] via 174.18.123.3, 00:00:38, FastEthernet0/0
D EX 150.18.6.0 [170/158720] via 174.18.123.3, 00:00:38, FastEthernet0/0
D EX 150.18.2.0
[170/2560002816] via 174.18.123.3, 00:20:25, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:20:25, FastEthernet0/0
D EX 150.18.3.0
[170/2560002816] via 174.18.123.3, 00:20:37, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:20:37, FastEthernet0/0
D EX 205.90.31.0/24
[170/2560002816] via 174.18.123.3, 00:20:09, FastEthernet0/0
[170/2560002816] via 174.18.123.2, 00:20:09, FastEthernet0/0

R6 also learns EIGRP 123 routes via R3:

Rack18R6#show ip route eigrp
D EX 222.22.2.0/24 [170/2560002816] via 174.18.0.3, 00:23:59, FastEthernet0/0
D EX 220.20.3.0/24 [170/2560002816] via 174.18.0.3, 00:23:59, FastEthernet0/0
174.18.0.0/24 is subnetted, 3 subnets
D EX 174.18.234.0
[170/2560002816] via 174.18.0.3, 01:00:18, FastEthernet0/0
D EX 174.18.123.0 [170/30720] via 174.18.0.3, 00:04:15, FastEthernet0/0
D EX 192.10.18.0/24 [170/2560002816] via 174.18.0.3, 00:27:22, FastEthernet0/0
150.18.0.0/24 is subnetted, 6 subnets
D EX 150.18.4.0 [170/2560002816] via 174.18.0.3, 00:27:22, FastEthernet0/0
D 150.18.5.0 [90/156160] via 174.18.0.5, 01:31:17, FastEthernet0/0
D EX 150.18.1.0 [170/158720] via 174.18.0.3, 00:04:15, FastEthernet0/0
D EX 150.18.2.0 [170/2560002816] via 174.18.0.3, 00:24:18, FastEthernet0/0
D EX 150.18.3.0 [170/2560002816] via 174.18.0.3, 01:00:18, FastEthernet0/0
D EX 205.90.31.0/24 [170/2560002816] via 174.18.0.3, 00:23:59, FastEthernet0/0

R3 will not transit the OSPF backbone, due to our choice of administrative distances:

Rack18R3#show ip route | beg Gate
Gateway of last resort is not set

O E2 222.22.2.0/24 [160/20] via 174.18.234.4, 02:22:45, Serial1/0
O E2 220.20.3.0/24 [160/20] via 174.18.234.4, 02:22:45, Serial1/0
174.18.0.0/24 is subnetted, 3 subnets
C 174.18.234.0 is directly connected, Serial1/0
C 174.18.0.0 is directly connected, FastEthernet0/1
C 174.18.123.0 is directly connected, FastEthernet0/0
O E2 192.10.18.0/24 [160/20] via 174.18.234.4, 02:23:04, Serial1/0
150.18.0.0/24 is subnetted, 6 subnets
O 150.18.4.0 [80/782] via 174.18.234.4, 02:23:04, Serial1/0
D 150.18.5.0 [90/156160] via 174.18.0.5, 02:59:03, FastEthernet0/1
D 150.18.6.0 [90/156160] via 174.18.0.6, 02:59:04, FastEthernet0/1
D EX 150.18.1.0 [170/156160] via 174.18.123.1, 02:25:48, FastEthernet0/0
O 150.18.2.0 [80/782] via 174.18.234.2, 02:23:04, Serial1/0
C 150.18.3.0 is directly connected, Loopback0
O E2 205.90.31.0/24 [160/20] via 174.18.234.4, 02:22:45, Serial1/0

So we have come up with a (mostly) optimal redistribution configuration for our topology. What we have learned so far is that a split-horizon rule may also be implemented using administrative distances. EIGRP implements this extremely useful feature automatically, while OSPF has to be manually tuned for it. However, as we will learn in further examples, some additional work is needed for RIP. Also, we now remember that redistribution is non-transitive on the local router. Finally, we learned a way to introduce a kind of hierarchy of administrative distances for a star-like routing domain topology (core transit domain and stub edge domains). More complicated examples are to follow this post.

Feb
13

Cisco IOS has a special feature called local policy routing, which lets you apply a route-map to local (router-generated) traffic. The first way we can use this feature is to re-circulate local traffic (and force it to re-enter the router). Here's an example. By default, locally-generated packets are not inspected by outgoing access-lists. This may cause issues when local traffic is not reflected by reflexive access-list entries. Say, with a configuration like this:

!
! Reflect all "session-oriented" traffic
!
ip access-list extended EGRESS
permit tcp any any reflect MIRROR
permit icmp any any reflect MIRROR
permit udp any any reflect MIRROR
!
! Evaluate the reflected entries
!
ip access-list extended INGRESS
evaluate MIRROR
permit ospf any any
!
interface Serial 0/0
ip address 54.1.1.6 255.255.255.0
ip access-group INGRESS in
ip access-group EGRESS out

You would not be able to telnet out of the router to destinations behind the Serial interface, even though TCP sessions are reflected in the access-list. To fix the issue, we may use local policy routing to force the local traffic to re-enter the router and be inspected by the outgoing access-list:

!
! Redirect local telnet traffic via the Loopback interface
!
ip access-list extended LOCAL_TRAFFIC
permit tcp any any eq 23
!
route-map LOCAL_POLICY 10
match ip address LOCAL_TRAFFIC
set interface Loopback0
!
! Traffic sent to Loopback interface re-enters the router
!
interface Loopback0
ip address 150.1.6.6 255.255.255.0

!
! Apply the local-policy
!
ip local policy route-map LOCAL_POLICY

With this configuration, a local telnet session will re-enter the router and hit the outgoing access-list, thereby triggering a reflected entry. This same idea may be utilized to force CBAC inspection of locally-generated traffic, but since 12.3T there has been a special IOS feature to do this natively.
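To make the effect of re-circulation concrete, here is a toy model of the reflexive access-list pair. It is only a sketch: the `ReflexiveRouter` class is invented for illustration, the remote address 54.1.1.254 is a hypothetical peer on the Serial segment, and ports and protocols are ignored entirely.

```python
class ReflexiveRouter:
    """Toy model of the EGRESS/INGRESS reflexive ACL pair on Serial0/0."""

    def __init__(self, recirculate_local):
        # True models "ip local policy route-map LOCAL_POLICY" pushing local
        # traffic through Loopback0 so it is handled like transit traffic.
        self.recirculate_local = recirculate_local
        self.mirror = set()  # reflected entries: (remote, local) pairs

    def send(self, src, dst, locally_generated):
        # The outbound ACL only inspects transit or re-circulated packets.
        inspected = (not locally_generated) or self.recirculate_local
        if inspected:
            self.mirror.add((dst, src))  # "reflect MIRROR": permit the reply

    def reply_permitted(self, src, dst):
        # "evaluate MIRROR" on the inbound ACL.
        return (src, dst) in self.mirror


# Without local policy routing the reply to a local session is dropped.
plain = ReflexiveRouter(recirculate_local=False)
plain.send("54.1.1.6", "54.1.1.254", locally_generated=True)
print(plain.reply_permitted("54.1.1.254", "54.1.1.6"))

# With re-circulation the outgoing packet creates a reflected entry.
fixed = ReflexiveRouter(recirculate_local=True)
fixed.send("54.1.1.6", "54.1.1.254", locally_generated=True)
print(fixed.reply_permitted("54.1.1.254", "54.1.1.6"))
```

The first router has no reflected entry for the returning packet, while the second one does, which is exactly the difference the local policy makes.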

The other useful application of local policy routing is using it for traffic filtering. For example you may want to prohibit outgoing telnet sessions from local router to a certain destination:

ip access-list extended BLOCK_TELNET
permit tcp any host 150.1.1.1 eq 23
!
route-map LOCAL_POLICY 10
match ip address BLOCK_TELNET
set interface Null 0

!
! Apply the local-policy
!
ip local policy route-map LOCAL_POLICY

The syntax is somewhat similar to the vlan access-maps used on Catalyst switches, and similarly the route-map is applied "globally", i.e. to all router traffic, going out on any interface. Note that you may use the same idea to block incoming sessions, simply by reversing the entries in the access-list (e.g. "permit tcp any eq 23 host 150.1.1.1"). Best of all, with PBR you may apply additional criteria to incoming traffic, e.g. match packet sizes.

The last example is the use of local PBR to apply special treatment to management/control plane traffic - e.g. use different output interfaces for out-of-band management. With local PBR you may also apply special marking for control traffic, e.g. selectively assign IP precedence values.

ip access-list extended MANAGEMENT_TRAFFIC
permit tcp any eq 23 any
permit tcp any eq 22 any
!
route-map LOCAL_POLICY 10
match ip address MANAGEMENT_TRAFFIC
set interface Serial 0/1
set ip precedence 7
!
ip local policy route-map LOCAL_POLICY

Keep these simple features in mind while considering options for your CCIE lab task solution.

Feb
06

Guys,

I ran into a task in a lab to configure an ACL to allow BGP and the book has it configured like this:

Permit tcp host 150.1.5.5 eq bgp host 150.1.4.4
Permit tcp host 150.1.5.5 host 150.1.4.4 eq bgp

Is there a reason why it is configured like that instead of ‘Permit tcp host 150.1.5.5 eq bgp host 150.1.4.4 eq bgp’ ?

The only reason I could think of was to allow source/destination traffic from tcp 179 to a different port.

Ted

Hi Ted,

Your reasoning is correct; it is because only one of the two ports in a BGP session is 179, and which one depends on who originates the session. BGP is essentially a standard TCP-based protocol, which means that it is client and server based.

When a TCP client attempts to establish a connection to a TCP server, it first sends a TCP SYN packet to the server with the well known port as the destination port. This first SYN is essentially a request to open a session. If the server permits the session, it will respond with a TCP SYN ACK saying that it acknowledges the request to open the session, and that it also wants to open the session. In this SYN ACK response the server uses the well known port as the source port, and the client's randomly chosen source port as the destination port. The last step of the three way handshake is the client responding to the server with a TCP ACK, which acknowledges the server’s response and completes the connection establishment.

Now from the perspective of BGP specifically, the TCP clients and servers are routers. When the “client” router initiates the BGP session it sends a request to the server with a destination port of 179 and a random source port X. The server then responds with a source port of 179 and a destination port of X. Therefore all client to server traffic uses destination 179, while all server to client traffic uses source 179. We can also verify this from the debug output in IOS.

In the below topology R1 and R2 are directly connected BGP peers on an Ethernet segment with configurations as follows:

R1:
interface Ethernet0/0
ip address 10.0.0.1 255.255.255.0
!
interface Loopback0
ip address 1.1.1.1 255.255.255.255
!
router bgp 1
network 1.1.1.1 mask 255.255.255.255
neighbor 10.0.0.2 remote-as 1

R2:
interface Ethernet0/0
ip address 10.0.0.2 255.255.255.0
!
interface Loopback0
ip address 2.2.2.2 255.255.255.255
!
router bgp 1
network 2.2.2.2 mask 255.255.255.255
neighbor 10.0.0.1 remote-as 1

From the “debug ip packet detail” output we can see the initial TCP session establishment for BGP between these neighbors as follows:

R1:
IP: tableid=0, s=10.0.0.2 (Ethernet0/0), d=10.0.0.1 (Ethernet0/0), routed via RIB
IP: s=10.0.0.2 (Ethernet0/0), d=10.0.0.1 (Ethernet0/0), len 44, rcvd 3
TCP src=11000, dst=179, seq=3564860120, ack=0, win=16384 SYN

R2:
IP: tableid=0, s=10.0.0.1 (Ethernet0/0), d=10.0.0.2 (Ethernet0/0), routed via RIB
IP: s=10.0.0.1 (Ethernet0/0), d=10.0.0.2 (Ethernet0/0), len 44, rcvd 3
TCP src=179, dst=11000, seq=1977460611, ack=3564860121, win=16384 ACK SYN

R1:
IP: tableid=0, s=10.0.0.2 (Ethernet0/0), d=10.0.0.1 (Ethernet0/0), routed via RIB
IP: s=10.0.0.2 (Ethernet0/0), d=10.0.0.1 (Ethernet0/0), len 40, rcvd 3
TCP src=11000, dst=179, seq=3564860121, ack=1977460612, win=16384 ACK

In the first code snippet we can see that R1 has received a TCP packet from R2. This packet was sourced from 10.0.0.2 (R2), destined to 10.0.0.1 (R1), has a TCP source port of 11000 (the random port), a destination TCP port of 179 (the well known port), and is a SYN packet (1st step of three way handshake).

Next R2 receives the response from R1 that is a TCP SYN ACK, but in this case we can see that the port numbers are swapped. R1, the TCP server, now uses TCP port 179 as the source port, and the randomly negotiated port 11000 as the destination.

Further debugging of the BGP flow between these neighbors will show R2 using destination 179 (as the TCP client) and R1 using source 179 (as the TCP server).

The implication of this operation is that if BGP needs to be matched for some reason, e.g. to filter it in an access-list or to match it in a QoS policy, we must account for traffic that is both going to TCP port 179 and coming from TCP port 179.
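A small sketch makes the difference between the two ACL styles from Ted's question visible. The function names are hypothetical, only the port logic is modelled (host addresses are ignored), and the packets are the three handshake segments from the debug output above.

```python
BGP_PORT = 179


def two_line_acl(src_port, dst_port):
    """Models 'permit tcp ... eq bgp ...' plus 'permit tcp ... ... eq bgp'."""
    return src_port == BGP_PORT or dst_port == BGP_PORT


def one_line_acl(src_port, dst_port):
    """Models 'permit tcp ... eq bgp ... eq bgp': both ports must be 179."""
    return src_port == BGP_PORT and dst_port == BGP_PORT


# (source port, destination port) of the SYN, SYN ACK and ACK seen above.
handshake = [(11000, 179), (179, 11000), (11000, 179)]

print([two_line_acl(s, d) for s, d in handshake])
print([one_line_acl(s, d) for s, d in handshake])
```

The two-line form matches every segment of the handshake, while the single line with "eq bgp" on both sides matches none of them, because no packet in the session has 179 as both source and destination port.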

Jan
08

Prior to the support of prefix-lists in IOS, advanced filtering for BGP needed to be done using extended ACLs. The syntax for using extended ACLs is shown below:

access-list <ACL #> permit ip <network> <wildcard mask of network> <subnet mask> <wildcard mask of subnet mask>

The source portion of the extended ACL is used to match the network portion of the BGP route and the destination portion of the ACL is used to match the subnet mask of the BGP route.  Here are some examples:

access-list 100 permit ip 10.0.0.0 0.0.0.0 255.255.0.0 0.0.0.0
Matches 10.0.0.0/16 - Only

access-list 100 permit ip 10.0.0.0 0.0.0.0 255.255.255.0 0.0.0.0
Matches 10.0.0.0/24 - Only

access-list 100 permit ip 10.1.1.0 0.0.0.0 255.255.255.0 0.0.0.0
Matches 10.1.1.0/24 - Only

access-list 100 permit ip 10.0.0.0 0.0.255.0 255.255.255.0 0.0.0.0
Matches 10.0.X.0/24 - Any number in the 3rd octet of the network with a /24 subnet mask.

access-list 100 permit ip 10.0.0.0 0.255.255.0 255.255.255.0 0.0.0.0
Matches 10.X.X.0/24 - Any number in the 2nd & 3rd octet of the network with a /24 subnet mask.

access-list 100 permit ip 10.0.0.0 0.255.255.255 255.255.255.240 0.0.0.0
Matches 10.X.X.X/28 - Any number in the 2nd, 3rd & 4th octet of the network with a /28 subnet mask.

access-list 100 permit ip 10.0.0.0 0.255.255.255 255.255.255.0 0.0.0.255
Matches 10.X.X.X/24 to 10.X.X.X/32 - Any number in the 2nd, 3rd & 4th octet of the network with a /24 to /32 subnet mask.

access-list 100 permit ip 10.0.0.0 0.255.255.255 255.255.255.128 0.0.0.127
Matches 10.X.X.X/25 to 10.X.X.X/32 - Any number in the 2nd, 3rd & 4th octet of the network with a /25 to /32 subnet mask.
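The matching logic in these examples can be checked with a short sketch. It assumes the usual wildcard semantics (a 0 bit must match, a 1 bit is ignored) applied once to the network and once to the subnet mask; the helper names and the sample routes are made up for illustration.

```python
import ipaddress


def to_int(dotted):
    """Convert a dotted-quad string to a 32-bit integer."""
    return int(ipaddress.IPv4Address(dotted))


def wildcard_match(value, pattern, wildcard):
    """True when value equals pattern in every bit where wildcard is 0."""
    care = ~to_int(wildcard) & 0xFFFFFFFF
    return (to_int(value) & care) == (to_int(pattern) & care)


def acl_matches_route(ace, route):
    """ace = (network, network wildcard, mask, mask wildcard); route = 'a.b.c.d/len'."""
    network, net_wild, mask, mask_wild = ace
    prefix = ipaddress.IPv4Network(route)
    return (wildcard_match(str(prefix.network_address), network, net_wild)
            and wildcard_match(str(prefix.netmask), mask, mask_wild))


# access-list 100 permit ip 10.0.0.0 0.0.255.0 255.255.255.0 0.0.0.0
THIRD_OCTET_24 = ("10.0.0.0", "0.0.255.0", "255.255.255.0", "0.0.0.0")
# access-list 100 permit ip 10.0.0.0 0.255.255.255 255.255.255.0 0.0.0.255
ANY_24_TO_32 = ("10.0.0.0", "0.255.255.255", "255.255.255.0", "0.0.0.255")

print(acl_matches_route(THIRD_OCTET_24, "10.0.5.0/24"))   # True
print(acl_matches_route(THIRD_OCTET_24, "10.1.5.0/24"))   # False
print(acl_matches_route(ANY_24_TO_32, "10.9.8.128/25"))   # True
print(acl_matches_route(ANY_24_TO_32, "10.9.0.0/16"))     # False
```

The first half of each entry is compared against the route's network address and the second half against its subnet mask, which is why a wildcard of 0.0.0.255 on the mask opens the match up to every length from /24 through /32.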
