Nov 04

The NBAR protocol classification feature has long supported enhanced HTTP URL matching. However, the Cisco documentation site has never provided a detailed description of the pattern language used for URL matching, nor has it explained how the engine matches client/server data streams. In this post we give an overview of how NBAR works with URL filtering.

We will arrange this post as a FAQ, as follows.

Q1: What is the syntax for matching the URLs?
A1: The syntax resembles regular expressions, but it is not actually a regular expression language. Rather, it is closer to globs, i.e. wildcard special characters. The pattern you configure is matched against the string found after the “GET”, “POST” or “PUT” method in the HTTP request packet. Note that NBAR is smart enough to remove the leading “/” in the file path. However, the HTTP request must end with “\r\n” or the NBAR StILE (stateful inspection language engine) will not recognize it (an easy way to fool the inspection engine)!

Here is the list of the available wildcards:

“*” – match any sequence of characters, e.g. “aaa”, “abcd1234”. To match the substring “xyz” at the beginning of a string use the pattern “xyz*”. To match “xyz” anywhere in the string use the pattern “*xyz*”. Use the pattern “*xyz” to match the substring at the end. Note that the pattern “xyz” matches only the exact string “xyz”. You may also use more complicated patterns such as “ab*cd*ef”, probably at the expense of some CPU penalty.
“?” – match any single character. For example, the pattern “???” matches “xyz”, “abc”, “efg”, “123”. You can mix “*” and “?”, as in the pattern “tes*.????”.
“[]” – match a set of characters. E.g. “[abc]” will match any single character “a”, “b” or “c”.
“|” – alternative. Separate patterns with “|” in order to specify “OR” matching logic. For example, “xyz|abc” matches either the full string “xyz” or the full string “abc”. You may mix “|” with other globs like this: “*xyz*|*abc*|*pqr*”. I am not sure about the overall limit on using the “|” symbol, but you’d better keep it to a minimum in order to make matching faster.
“()” – grouping. Denotes the boundaries of a sub-pattern. For example, instead of “*.txt|*.bin” you can write “*.(txt|bin)”.

All matching is case insensitive, so the pattern “text” matches “TEXT” as well. The engine matches your URL pattern against the directory path and the file name in the URL. E.g. if the URL is “http://www.cisco.com/pub/uploads/image.jpeg”, the URL matching will only use the “pub/uploads/image.jpeg” part of the URL. As a matter of fact, when you submit a request for the above URL, it translates into the following headers (there are actually more, but this is the bare minimum):


GET /pub/uploads/image.jpeg HTTP/1.1
Host: www.cisco.com
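
As a rough illustration of the wildcards described above, here is a sketch of a class-map that should match the request shown. The class name URL-EXAMPLES is made up for this example, and the patterns are written without the leading “/” because NBAR strips it before matching:

```
! Hypothetical class name. With match-any, a hit on any one pattern is enough.
! 1st pattern: fixed directory prefix, anything after it.
! 2nd pattern: any path ending in .jpeg or .jpg (grouping plus alternation).
! 3rd pattern: "?" stands in for the single character "e" in "jpeg".
class-map match-any URL-EXAMPLES
 match protocol http url "pub/uploads/*"
 match protocol http url "*.(jpeg|jpg)"
 match protocol http url "*image.jp?g"
```

Since matching is case insensitive, the same patterns should also catch a request for “/PUB/UPLOADS/IMAGE.JPEG”.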

Q2: What if I want to match the host name used in the URL?
A2: You need to match the “Host” header field then. Use the match protocol http host [pattern] command. You can use the same glob patterns that you use for matching the filename in the URL.
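
A minimal sketch of host matching, assuming you want to catch requests for anything under “pub/uploads/” on hosts whose name ends in “cisco.com”. The class name is hypothetical, and match-all is used so that both the Host header and the URL must match:

```
! Hypothetical class name. Both conditions must hold (match-all).
class-map match-all CISCO-UPLOADS
 match protocol http host "*cisco.com"
 match protocol http url "pub/uploads/*"
```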

Here is a good example: Using NBAR for application filtering

Q3: How does NBAR actually classify traffic flows?
A3: When you apply a policy-map that contains a class with a “match protocol” statement, the system starts the NBAR classification engine on the interface. Any packet, be it ingress or egress, passes through the NBAR inspection engine provided that it passes the basic filters, such as matching the port number assigned to the protocol. You can change the port map using the global command ip nbar port-map [protocol].
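
For instance, if some of your web servers listen on a non-standard port, the port map could be extended roughly like this. Port 8080 is only an assumption for the sake of the example; the list replaces the ports currently mapped to the protocol, so port 80 is repeated:

```
! Global configuration: treat TCP 80 and TCP 8080 as HTTP for NBAR.
ip nbar port-map http tcp 80 8080
!
! From exec mode, to check the resulting mapping:
show ip nbar port-map http
```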

When the engine sees a TCP SYN packet for the matching session, it starts the internal state machine, trying to parse the packet flow. Every new packet in the flow (in any direction) is inspected. Note that NBAR does not classify a flow instantly. It may take some packet exchange until the engine determines that the flow matches the specified criteria. As soon as there is enough information about the flow to classify it, the engine “tags” the bi-directional flow with the corresponding class value and reports this decision to the policy-map. Note that this classification applies in both directions, that is, to the packets belonging to the flow whether they are heading ingress or egress.

For example, if a user starts a web session and opens a URL matching any of your NBAR criteria, the engine will classify the flow as soon as it sees the packet with the URL string. After this, packets flowing both from the client to the server and from the server to the client match the respective class – that is, the policy map can be applied in either direction on the interface.

The engine will remove the “tag” when it meets the flow “end” criteria, e.g. when it catches a TCP FIN or TCP RST flag. After this, the flow is no longer monitored by NBAR, nor is it reported as matching the respective class.

Consider the example below. In this scenario, Fa0/0 is the outside interface, facing the Internet. User traffic flows out of this interface.


class-map match-all TEST
 match protocol http url "*.(t?xt|ocx|ex[ea])"
!
policy-map TEST
 class TEST
   drop
!
interface FastEthernet0/0
 ip address 155.1.37.3 255.255.255.0
 service-policy input TEST

The service policy is applied inbound, so it does not act on the users' outbound requests. However, as soon as a user opens a URL matching the class-map specification, the engine will classify the flow as matching the class “TEST”. After this, all returning packets (server to client) for this flow will be dropped by the policy map.
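
To see whether the class is actually catching anything, the per-class counters of the attached policy can be checked from exec mode. This is just the standard MQC show command pointed at the interface and direction used in the example; no output is reproduced here, since it varies between IOS releases:

```
show policy-map interface FastEthernet0/0 input
```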

Note that even though this works, it is not the recommended way of applying URL filtering. This is because the actual GET requests will still get to the destination server and generate response traffic. In some situations, the client may send a FIN packet when the server generates the response. At this point, NBAR will report the flow as destroyed and the policy-map will no longer match the packets. If the server response is slow enough, it will actually bypass the ingress filter! Thus, it is always recommended to apply the NBAR classification and policy action in the direction matching the direction of the protocol commands (e.g. apply the URL filtering in the direction of the GET commands).
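
Applied to the example above, that recommendation would look roughly like this. The sketch reuses the same TEST class-map and policy-map and simply attaches the policy outbound on Fa0/0, which is the direction the GET requests travel in this scenario:

```
interface FastEthernet0/0
 no service-policy input TEST
 service-policy output TEST
```

With this placement the matching request itself is dropped before it leaves the router, so the server never gets a chance to respond.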

Q4: Are there any other matching criteria besides matching the URL or Host name?
A4: Yes, you can match a number of client/server HTTP headers. Better than that, you can match the content type in server replies, e.g.

match protocol http mime text/plain

However, this type of matching relies on the server returning the proper MIME type in the “Content-Type” header of its response. The benefit is the ability to block a whole class of files (e.g. image/jpeg) with a single match criterion.
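
As a sketch of how this could be put together, assuming the same topology as in Q3 (Fa0/0 facing the Internet) and made-up class and policy names. Since the Content-Type header travels from server to client, the policy is attached inbound on the outside interface, following the rule of filtering in the direction of the data being matched:

```
! Hypothetical names; drops server replies declared as JPEG images.
class-map match-all JPEG-REPLIES
 match protocol http mime "image/jpeg"
!
policy-map DROP-JPEG
 class JPEG-REPLIES
  drop
!
interface FastEthernet0/0
 service-policy input DROP-JPEG
```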

Q5: Is NBAR good for production deployment?
A5: No. NBAR is OK if you need a quick and dirty hack, but it is not to be used as a content filtering engine. There are several reasons:

1) NBAR inspection consumes too much router CPU.
2) NBAR inspection is easy to trick. For example, by simply omitting “\r\n” at the end of the string you can make the engine believe it is not a URL request.
3) The NBAR engine is buggy. Sometimes you may find that your class matches much more traffic than you wanted and drops really important packets. Ooops! This happens all the time.
4) NBAR does not perform true TCP stream reassembly, i.e. removing duplicates and assembling fragments. This makes it vulnerable to all kinds of tricky TCP/IP attacks.

So if you want some real content security, use the specialized solutions :)

About Petr Lapukhov, 4xCCIE/CCDE:

Petr Lapukhov's career in IT began in 1988 with a focus on computer programming, and progressed into networking with his first exposure to Novell NetWare in 1991. Initially involved with Kazan State University's campus network support and UNIX system administration, he went on to become a networking consultant, taking part in many network deployment projects. Petr currently has over 12 years of experience working in the Cisco networking field, and is the only person in the world to have obtained four CCIEs in under two years, passing each on his first attempt. Petr is an exceptional case in that he has been working with all of the technologies covered in his four CCIE tracks (R&S, Security, SP, and Voice) on a daily basis for many years. When not actively teaching classes, he is developing self-paced products, studying for the CCDE Practical & the CCIE Storage Lab Exam, and completing his PhD in Applied Mathematics.




6 Responses to “Using NBAR for HTTP URL filtering”

 
  1. apep says:

    Hello Petr,

    Another outstanding article :)

    At the moment NBAR is supporting several peer-to-peer applications, but how can it be configured to recognize the new versions of these protocols, or other currently not supported ones? These p2p applications are usually opening a lot of TCP and/or UDP ports, so the “ip nbar custom” command seems to be not capable due to its limitation. Even using the offset (byte location), format (ascii, decimal, hex) or variable options, the port numbers must be explicitly configured, what is not really feasible, and can conflict with the predefined protocols. Can be a PDLM file created or updated without involving Cisco?

  2. Unfortunately, Cisco never openly documented the PDLM format (StILE program), nor any other StILE details. So far, you can only use PDLM files downloaded from Cisco’s site, and those are not getting updated too often…

  3. Ahriakin says:

    Excellent article. I don’t respond to each and every one of these but I greatly appreciate them, many thanks.

    Derek.

  4. Sesano says:

    Excellent and detail explanation coming in at the right time for me as this I am about to review this into detail.

    But one more thing, what if I want to match video traffic in the http requests ?

    How do we handle this ?

  5. Haha ^^ nice, is there a section to follow the RSS feed

  6. John jinesh says:

    it worked on my router………

    is a beautiful article and helped me lot, i am new to QoS on the way to ccie ,thanks for your valuable informations..

 
