Tuesday, July 26, 2016

Cisco Intro to QoS and CoS, Part 4 -- Shape Average vs. Bandwidth in Policy Maps

If you've poked around at all with the policy-map statement, you've probably noticed that there are several options for configuring how much bandwidth to allocate to different classes within the QoS config. Let's look at two of the options, shape average and bandwidth.

We'll use the following network...:


...and the following configuration:
ip access-list extended PRIORITY-IN
 permit ip 192.168.1.0 0.0.0.255 any
 permit ip any 192.168.1.0 0.0.0.255
 deny ip any any
ip access-list extended SCAVENGER-IN
 permit ip 192.168.2.0 0.0.0.255 any
 permit ip any 192.168.2.0 0.0.0.255
 deny ip any any
!
class-map match-any PRIORITY
 match dscp af41
class-map match-any SCAVENGER
 match dscp cs1
!
class-map match-any PRIORITY-IN
 match access-group name PRIORITY-IN
class-map match-any SCAVENGER-IN
 match access-group name SCAVENGER-IN
!
policy-map REMARK_ALL
 class PRIORITY-IN
  set dscp af41
 class SCAVENGER-IN
  set dscp cs1
policy-map EGRESS-SHAPER
 class PRIORITY
  shape average percent 70
 class SCAVENGER
  shape average percent 5
policy-map EGRESS-BANDWIDTH
 class PRIORITY
  bandwidth percent 70
 class SCAVENGER
  bandwidth percent 5
!

For my first test, I set up the "EGRESS-SHAPER" policy map on interface eth3/0 on both routers, and applied the "REMARK_ALL" service policy to the ingress interfaces (fa0/0 on both routers, and fa1/0 on R1):
interface FastEthernet0/0
 service-policy input REMARK_ALL
!
interface FastEthernet1/0
 service-policy input REMARK_ALL
!
interface Ethernet3/0
 service-policy output EGRESS-SHAPER
!

Then, I set up iperf as a service on host "Knoppix" and ran iperf as a client on both Knoppix Clones, using the flags "-i 1 -t 120." Knoppix Clone 1 could send data at a little over 5Mbps, while Knoppix Clone 2 sent data at about 500Kbps. I then re-ran the test using the service policy "EGRESS-BANDWIDTH," and this time saw a transfer speed of just over 4Mbps for Knoppix Clone 1 and about 1 1/3Mbps for Knoppix Clone 2. That surprised me; I didn't really understand why my bandwidth decreased in class PRIORITY (Knoppix Clone 1) and increased in class SCAVENGER (Knoppix Clone 2).

To understand what was happening, I ran some more tests, using both TCP and UDP transfers ("-u -b 10M" for the flags, to set up a UDP test at up to 10Mbps), both with the two hosts in contention for bandwidth and with the hosts transferring data one at a time (i.e., running iperf on Knoppix Clone 1, and then running it on Knoppix Clone 2 after the first test completed).
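If you want to reproduce the tests, the iperf invocations looked roughly like this; the "Knoppix" server's IP address isn't shown in the diagram, so the address below is just a placeholder:
$ iperf -s                                      # on "Knoppix": TCP server
$ iperf -s -u                                   # on "Knoppix": UDP server
$ iperf -c <knoppix-ip> -i 1 -t 120             # on each clone: TCP, 1-second reports, 120 seconds
$ iperf -c <knoppix-ip> -u -b 10M -i 1 -t 120   # on each clone: UDP, offering up to 10Mbps

Here are the results from all of the tests: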

Congested:
  TCP:
    Using EGRESS-SHAPER:
      192.168.1.x: 5.15Mbps
      192.168.2.x: 0.48Mbps

    Using EGRESS-BANDWIDTH:
      192.168.1.x: 4.16Mbps
      192.168.2.x: 1.36Mbps


  UDP:
    Using EGRESS-SHAPER:
      192.168.1.x: 3.07Mbps
      192.168.2.x: 0.16Mbps

    Using EGRESS-BANDWIDTH:
      192.168.1.x: 2.8Mbps
      192.168.2.x: 1.26Mbps


Uncongested:
  TCP:
    Using EGRESS-SHAPER:
      192.168.1.x: 5.76Mbps
      192.168.2.x: 0.50Mbps

    Using EGRESS-BANDWIDTH:
      192.168.1.x: 5.52Mbps
      192.168.2.x: 5.51Mbps

  UDP:
    Using EGRESS-SHAPER:
      192.168.1.x: 5.82Mbps
      192.168.2.x: 0.47Mbps

    Using EGRESS-BANDWIDTH:
      192.168.1.x: 6.03Mbps
      192.168.2.x: 6.14Mbps

What's happening here is that "shape average" tells the router to cap each traffic class at the specified rate -- it's a maximum. Even if more bandwidth is available on the interface, the shaped rate is all that the traffic class will receive. The "bandwidth" statement, by contrast, sets a minimum guarantee rather than a ceiling. Since class PRIORITY takes, well, priority over class SCAVENGER, when there is traffic in both traffic classes, class SCAVENGER only receives its allocated bandwidth. However, when there is little or no priority traffic on the interface, the router allows class SCAVENGER to "borrow" bandwidth from the other traffic classes in the service policy (in this case, class PRIORITY). Consequently, when there was no contention for bandwidth, class SCAVENGER could get the full data rate of the interface*.

*Q: Wait a minute! An Ethernet port runs at 10Mbps, but these results are showing about 6Mbps. What happened to the other 4Mbps?
A: Keep in mind that these (legacy 10Mbps) Ethernet ports run at half duplex, meaning that the same copper wire is used for both transmit and receive. Therefore, the 10Mbps of bandwidth available on the Ethernet port has to be shared between transmit and receive.**

**Q: Okay, so that explains why it's not 10Mbps, but in that case, shouldn't the data rate be lower, like 5Mbps?
A: If both sides were transmitting data at line rate, then yes. However, in this case the flow of data is mostly one-sided, especially with the UDP traffic, where there is no ACK being sent back to the two Knoppix Clones.***

***Q: In that case, should the data rate be higher?
A: The devil is in the details, and there are a lot of details to consider ;) "Customer" data isn't all that is flowing between R1 and R2. R1 and R2 are also sending their own data back and forth -- OSPF updates, CDP updates, etc. Consequently, while the flow of data between R1 and R2 is mostly one-sided, it's not completely one-sided. If you REALLY want a good understanding of what's happening, replace R2 with the Knoppix host and run Wireshark to capture the packets during a test. I'll leave that as an exercise for the reader ;)
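One last tip for this lab: you don't have to infer everything from iperf. The "show policy-map interface" command displays per-class packet counts, offered rates, drops, and the current shaping state, so you can confirm which class your traffic actually landed in. I'll spare you the (lengthy) output here:
R1#show policy-map interface Ethernet3/0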

Tuesday, July 5, 2016

Cisco Intro to QoS and CoS, Part 3 -- Classifying Traffic with NBAR

In our QoS labs so far, we've had to find some way to identify traffic, usually by creating an ACL to match packets against criteria such as source/destination port, source/destination IP address, or protocol (TCP/UDP/ICMP). However, Cisco offers another way to match traffic: NBAR, or "Network Based Application Recognition." I created the following network in GNS3 to start playing with NBAR:



To enable NBAR on a router, you first need to turn on CEF, and then you need to enable NBAR on any interface that will have a service policy to mark and classify traffic. In this network, I am using an NM-16ESW switchport module in slot 0, so while Fa0/0 is technically the ingress port, I would enable NBAR on my VLAN interface (int VLAN10). If you are using a plain FastEthernet port rather than a switch module, then you would enable NBAR on the Fa port. In either case, here is how to do it:
R4(config)#ip cef
R4(config)#int vlan 10
R4(config-if)#ip nbar protocol-discovery
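Once protocol discovery is running, NBAR collects per-protocol statistics on that interface, and you can ask the router what it already sees on the wire (output omitted here):
R4#show ip nbar protocol-discovery top-n 5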

After enabling NBAR, we'll need to design our QoS schema. Let's start by identifying the types of traffic on our network, and by deciding what traffic will take priority over other traffic. I came up with the following (admittedly hokey) schema, ordered by priority:
  1. EIGRP
  2. HTTP (simulating voice/video with a streaming mp4 file)
  3. Telnet
  4. SNMP
  5. SSH
  6. ICMP


I then mapped this traffic to the following traffic classes:
Traffic   DSCP Value
-------   ----------
EIGRP     CS6
HTTP      EF
Telnet    AF41
SNMP      CS3
SSH       AF21
ICMP      CS2


Cool! Let's start configuring the class-maps on R4, R5 and R6:
R4(config)#class-map match-all EIGRP
R4(config-cmap)#match protocol eigrp
R4(config-cmap)#class-map match-all HTTP
R4(config-cmap)# match protocol http
R4(config-cmap)#class-map match-all SSH
R4(config-cmap)# match protocol ssh
R4(config-cmap)#class-map match-all TELNET
R4(config-cmap)# match protocol telnet
R4(config-cmap)#class-map match-all SNMP
R4(config-cmap)# match protocol snmp
R4(config-cmap)#class-map match-all ICMP
R4(config-cmap)#match protocol icmp

...and now, the policy-map to use these classes:
R4(config-cmap)#policy-map REMARK_ALL
R4(config-pmap)#description policy-map to place traffic into the appropriate traffic class
R4(config-pmap)# class EIGRP
R4(config-pmap-c)# set ip dscp cs6
R4(config-pmap-c)# class HTTP
R4(config-pmap-c)# set ip dscp ef
R4(config-pmap-c)# class SSH
R4(config-pmap-c)# set ip dscp af21
R4(config-pmap-c)# class TELNET
R4(config-pmap-c)# set ip dscp af41
R4(config-pmap-c)# class SNMP
R4(config-pmap-c)# set ip dscp cs3
R4(config-pmap-c)# class ICMP
R4(config-pmap-c)# set ip dscp cs2

Now, just apply the policy map to your ingress interface (VLAN10 in this lab), and you are done:
R4(config-if)#int vlan10
R4(config-if)# service-policy input REMARK_ALL

Pretty easy, huh? To use NBAR to identify traffic from various protocols, all you have to do is put a "match protocol <protocol>" statement inside a class-map, and NBAR will identify the traffic for you! Keep in mind, however, that NBAR performs deep packet inspection -- it isn't just a simple match on port number and IP address -- and that functionality comes at a cost in CPU cycles (unless you are using a more recent model switch/router that offloads the packet inspection to a separate processor). Consequently, if a simple ACL will meet your needs, it MAY be less resource-intensive to use the ACL, as we've done in earlier labs. However, if you have a higher-powered router, NBAR certainly makes identifying and classifying traffic much easier on the network admin.

Note:
This is just a really simple example of NBAR, and doesn't even begin to scratch the surface of what NBAR can do for you. For example, when matching the HTTP protocol, you can create multiple classes based upon the host in the HTTP request and you can even use a regex to pattern-match within the URL. You can also extend NBAR with external files to add support for protocols that are not already built in.
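To give you a taste, here's a hypothetical example -- the class names, URL pattern, host pattern, and PDLM file name are all made up for illustration -- matching on the URL, matching on the HTTP host header, and loading an external PDLM file:
R4(config)#class-map match-any HTTP-VIDEO
R4(config-cmap)# match protocol http url "*.mp4"
R4(config-cmap)#class-map match-any HTTP-INTRANET
R4(config-cmap)# match protocol http host "intranet*"
R4(config-cmap)#exit
R4(config)#ip nbar pdlm flash:bittorrent.pdlm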

For the sake of completeness, here is the rest of the QoS configuration on R4:
R4(config)#class-map match-any PRIORITY
R4(config-cmap)# match ip dscp ef
R4(config-cmap)#class-map match-any CONTROL
R4(config-cmap)# match ip dscp cs6
R4(config-cmap)#class-map match-any CRITICAL
R4(config-cmap)# match ip dscp af41
R4(config-cmap)# match ip dscp cs3
R4(config-cmap)#class-map match-any ROUTINE
R4(config-cmap)# match ip dscp cs2
R4(config-cmap)# match ip dscp af21
R4(config-cmap)#class-map match-any SCAVENGER
R4(config-cmap)# match ip dscp cs1
R4(config)#policy-map EDGE_CHILD
R4(config-pmap)# description Core links
R4(config-pmap)# class PRIORITY
R4(config-pmap-c)# priority percent 25
R4(config-pmap-c)# class CONTROL
R4(config-pmap-c)# bandwidth percent 10
R4(config-pmap-c)# class CRITICAL
R4(config-pmap-c)# bandwidth percent 15
R4(config-pmap-c)# random-detect dscp-based
R4(config-pmap-c)# class ROUTINE
R4(config-pmap-c)# bandwidth percent 10
R4(config-pmap-c)# random-detect dscp-based
R4(config-pmap-c)# class SCAVENGER
R4(config-pmap-c)# bandwidth percent 1
R4(config-pmap-c)# random-detect dscp-based
R4(config-pmap-c)# class class-default
R4(config-pmap-c)# fair-queue
R4(config-pmap-c)# random-detect dscp-based
R4(config-pmap-c)#!
R4(config-pmap-c)#policy-map EDGE2CORE
R4(config-pmap)# description Parent policy for Edge-to-Core links
R4(config-pmap)# class class-default
R4(config-pmap-c)# shape average 10000000
R4(config-pmap-c)# service-policy EDGE_CHILD
R4(config-pmap-c)#!
R4(config-if)#int fa1/0
R4(config-if)# service-policy output EDGE2CORE
R4(config-if)#!
R4(config-if)#int fa3/0
R4(config-if)# service-policy output EDGE2CORE
R4(config-if)#!

Cisco Multicast Lesson 1: Introduction to Multicast through an Easy Example

In a lot of my QoS testing, I have used Gnome M-Player to play an MP4 file, served off of a web server. The cool thing about this approach is that I can easily increase the bandwidth coming across the network by connecting additional clients and using M-Player to download new streams.

In the real world, however, you typically want to conserve bandwidth, and so having multiple clients downloading separate instances of the same data is usually a bad thing. Consider the following network as an example:



In this very simple example, we have a CentOS 6 server (which will be streaming the file), and three Knoppix clients (which will be playing the file with M-Player). When I start up one of the Knoppix clients and begin a unicast stream, I see 270Kbps through Fa2/0:
R1#sho int fa2/0
...
5 minute output rate 270000 bits/sec, 32 packets/sec
...

(yes, I waited 5 minutes after beginning the stream).

What happens if we clear the interface counters and start up a second video stream from another client (and wait 5 minutes for the average to stabilize)? Will we see 540Kbps (2 streams x 270Kbps per stream)?
R1#sho int fa2/0
...
5 minute output rate 650000 bits/sec, 63 packets/sec
...

Huh...a little over the estimate. What if we fire up a third client downloading the same file?

R1#sho int fa2/0
...
5 minute output rate 907000 bits/sec, 92 packets/sec
...

Again, pretty close to what we would expect (3 x 270Kbps = 810Kbps). I'll chalk up the additional bandwidth to OSPF routing updates, CDP, and other network chatter, but the important thing is that this method of streaming data across a large network doesn't scale well.

Fortunately, there is a solution for this problem. What if there were a way to reduce the number of data streams that each router had to propagate? With unicast, the router forwards a separate stream of data for each host that connects to the CentOS server. With multicast, however, the router forwards only one copy of the stream per outbound path. For example, the route from the CentOS server to Knoppix and Knoppix Clone 2 is identical up to R3 (and if Fa1/0 on R1 is shut down, as it was during my testing, it's identical for Knoppix Clone 1 as well). With multicast, R1 would forward a single data stream to R3; R3 would then split that stream, sending one copy to R5 and one to R4. With Fa1/0 shut down on R1, R4 would split its incoming stream as well, sending one copy to Knoppix Clone 2 and one to R6. Now, suppose that instead of one client each on R4, R5 and R6 there were ten...it's not hard to see how multicast can quickly reduce bandwidth consumption on a large network.

So...how do you configure multicast on a Cisco router?

This is actually very easy to do. First, you globally enable multicast on your router:
R1(config)#ip multicast-routing
R1(config)#

Then, you enable PIM on the interfaces that will be participating in multicast streams:
R1(config)#int fa1/0
R1(config-if)#ip pim sparse-dense-mode
R1(config-if)#int fa2/0
R1(config-if)#ip pim sparse-dense-mode
R1(config-if)#int vlan10
R1(config-if)#ip pim sparse-dense-mode

There's a lot more to multicast -- which we'll get to in later labs -- but for a bare-bones configuration, that's it! Pretty easy, isn't it?
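Before we move on, two show commands are handy for confirming that multicast is actually working once a client joins a stream: "show ip pim neighbor" lists the PIM adjacencies between your routers, and "show ip mroute" displays the multicast routing table, with an entry and an outgoing interface list for each group (output omitted here):
R1#show ip pim neighbor
R1#show ip mroute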

At this point, I'm going to chase down a rabbit trail for a few minutes. This is primarily intended as a Cisco lab, but unlike most of the other labs on this blog, this one requires some additional work on the server side in order to test. I started by installing VLC on the CentOS host. For CentOS6, you can install a binary package like this:
# rpm -Uvh http://dl.fedoraproject.org/pub/epel/6/i386/epel-release-6-8.noarch.rpm
# rpm -Uvh http://li.nux.ro/download/nux/dextop/el6/i386/nux-dextop-release-0-3.el6.nux.noarch.rpm
# yum install vlc

For CentOS7, the install goes like this:
# rpm -Uvh http://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-8.noarch.rpm
# rpm -Uvh http://li.nux.ro/download/nux/dextop/el7/x86_64/nux-dextop-release-0-5.el7.nux.noarch.rpm
# yum install vlc

Once VLC is installed, you will need to start the multicast stream:
$ cvlc -vvv /var/www/localhost/htdocs/ZNKR_Iaido.mp4 --sout '#rtp{mux=ts,dst=239.255.255.1}' --ttl 12

Then on the Knoppix clients, start up Gnome M-Player (the command line version would probably work, but I used the GUI), click "File," click "Open Location" and in the text window, type:
rtp://@239.255.255.1:5004/
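Since Gnome M-Player is a front-end for the command-line mplayer, the equivalent one-liner should be something like this (untested here, as noted above, so your mileage may vary):
$ mplayer rtp://@239.255.255.1:5004/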

So...what did we just do?

On the CentOS server, we added the repositories that allow you to install VLC through the CentOS package manager, and then we installed VLC. Next, as a NON-ROOT user (VLC will not run as root), we started the command-line version of VLC, telling it to use the RTP protocol and the multicast address 239.255.255.1 (a full discussion of multicast addressing is outside the scope of this lab, but for now, feel free to substitute any address in the subnet 239.255.0.0/16).

On the client side, we are again saying to use the RTP protocol, we are referencing the same multicast address, and specifying the port 5004 in the URL.

</rabbit-trail>Back to networking...

Now that we have multicast running, what does that do to our bandwidth usage? Let's find out! Start up two Knoppix clients, start the multicast stream on the CentOS server, clear the counters on Fa2/0 on R1, and let's see:

R1#sho int fa2/0
...
5 minute output rate 449000 bits/sec, 43 packets/sec
...

That's much better! There's a little overhead for the multicast protocol itself, but it's still around 30% less than the unicast bandwidth utilization with two clients.

One interesting aspect of multicast is that, from the client's perspective, multicast is much more like broadcast television than video-on-demand. That is, with a unicast stream, when you request a resource, you start at the beginning of the video (or audio or...) stream. With multicast, you are joining a stream already in progress, which means you will be watching (or listening to, or...) exactly the same portion of the stream as every other user.

Look at the two following screenshots. In the first screenshot, the videos are obviously playing at different locations in the video stream:


However, in the second screenshot, you can see that the stream on the right has been playing for 13 minutes and 45 seconds and the one on the left for only 7 minutes and 50 seconds...but both are at exactly the same point in the video stream:


With that, we'll close out this lab, even though we've only scratched the surface of multicast. More to come!