Over the last few days here at the Circus we have been playing around with testing our service level agreements (SLAs). It came about because one of our off-campus sites was having connectivity issues and was extremely vocal in its complaints; squeaky wheel and all that.
The problem was that a vendor was blaming their poor performance on the site's connectivity. Of course we set up Nagios to poll every minute, but that wasn’t good enough: we needed to be able to graph response time. Eventually I wrote a Perl script to feed data to MRTG, but before we did that we played around with the IOS rtr and ip sla commands.
I was working on something else when my counterpart began playing with ip sla and rtr, so I decided to lab it even though it is not on the ONT exam.
A short note about IP SLA and responders. Depending upon the IOS version and platform, different operations are available. It is interesting that Cisco SLA monitoring is very careful regarding time stamps. This is so that you measure the actual network delay as opposed to application or processing delays. From the Cisco documentation: IP SLA test packets use time stamping to minimize the processing delays. When the IP SLA responder is enabled, it allows the target device to take time stamps when the packet arrives on the interface at interrupt level and again just as it is leaving, eliminating the processing time. This time stamping is made with a granularity of sub-milliseconds.
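To make the time stamp idea concrete, here is a small sketch of the arithmetic as I understand it from that description. The names t1 through t4 and the sample values are my own, made up for illustration; nothing here is read from a router.

```python
def network_rtt_ms(t1, t2, t3, t4):
    """Round-trip time with the target's processing delay removed.

    t1: probe leaves the source        (source clock, ms)
    t2: probe arrives at the target    (target clock, ms)
    t3: reply leaves the target        (target clock, ms)
    t4: reply arrives at the source    (source clock, ms)
    """
    processing_delay = t3 - t2      # time spent inside the target
    return (t4 - t1) - processing_delay


# Hypothetical values: the target's clock is about a second off from the
# source's, which does not matter because each difference uses one clock.
print(network_rtt_ms(t1=0, t2=1004, t3=1009, t4=14))  # prints 9
```

Because t2 and t3 are only ever subtracted from each other, the two routers do not need synchronized clocks for a round-trip measurement.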
Below is an image of the lab setup I am using. Once again it is the general cabling diagram from the ONT Lab book because I am not changing the wiring until I have to.
Goals for the lab:
- Have R1 download the index page from the web server 192.168.24.234 and report its statistics for a 24-hour period under the tag HTTP.234.
- Have R1 measure tcpConnect to R2, acting as the responder, for a one-hour period.
How I did the lab:
The server at 192.168.24.234 is running a web server and we want to have the IOS HTTP SLA measure performance. Let’s test downloading a file from the server:
R1#copy http://192.168.24.234/index.html null:
Loading http://192.168.24.234/index.html
55 bytes copied in 0.060 secs (917 bytes/sec)
Now let’s see what operations sla supports on our router.
R1#sh ip sla monitor application
<omitted>
Supported Operation Types
  Type of Operation to Perform: dhcp
  Type of Operation to Perform: dns
  Type of Operation to Perform: echo
  Type of Operation to Perform: frameRelay
  Type of Operation to Perform: ftp
  Type of Operation to Perform: http
  Type of Operation to Perform: jitter
  Type of Operation to Perform: pathEcho
  Type of Operation to Perform: pathJitter
  Type of Operation to Perform: tcpConnect
  Type of Operation to Perform: udpEcho
  Type of Operation to Perform: voip
You might as well configure SNMP on the router. I used the ubiquitous public community string; I would recommend changing that:
snmp-server community public RO
Now to configure a test of the sla in the lab:
ip sla monitor 1
 type http operation get url http://192.168.24.234/index.html
 tag HTTP.234
ip sla monitor schedule 1 life 86400 start-time now
Notice that we scheduled it to run for only a day (86,400 seconds) with a start-time of now. If you wanted to run this test indefinitely you would configure life forever.
Now to show what is going on:
R1#sh ip sla monitor collection-statistics
Entry number: 1
Start Time Index: *15:43:14.400 UTC Sun Mar 31 2002
Number of successful operations: 5
Number of operations over threshold: 0
Number of failed operations due to a Disconnect: 0
Number of failed operations due to a Timeout: 0
Number of failed operations due to a Busy: 0
Number of failed operations due to a No Connection: 0
Number of failed operations due to an Internal Error: 0
Number of failed operations due to a Sequence Error: 0
Number of failed operations due to a Verify Error: 0
DNS RTT: 0
TCP Connection RTT: 57
HTTP Transaction RTT: 44
HTTP time to first byte: 86
DNS TimeOut: 0
TCP TimeOut: 0
Transaction TimeOut: 0
DNS Error: 0
TCP Error: 0
Transaction Error: 0
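If you capture that output to a file or paste it into a script, pulling out the timing fields is straightforward. The following is only a sketch: the function name and the field list are my own choices, and it assumes the labels appear exactly as in the output above.

```python
import re

# The four timing labels from the HTTP operation output shown above.
TIMING_FIELDS = (
    "DNS RTT",
    "TCP Connection RTT",
    "HTTP Transaction RTT",
    "HTTP time to first byte",
)

# Pasted from the router output above (timing lines only).
SAMPLE = """\
DNS RTT: 0
TCP Connection RTT: 57
HTTP Transaction RTT: 44
HTTP time to first byte: 86
"""


def parse_http_timings(text):
    """Return {label: integer value} for each timing label found in text."""
    timings = {}
    for label in TIMING_FIELDS:
        match = re.search(re.escape(label) + r":\s*(\d+)", text)
        if match:
            timings[label] = int(match.group(1))
    return timings


for label, value in parse_http_timings(SAMPLE).items():
    print(f"{label}: {value}")
```

Searching for each label by name means the parser does not care whether the router prints one field per line or several on the same line.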
I also wanted to test the IP SLA tcpConnect operation. Here is the command to set up R2 as the responder:
ip sla monitor responder
And the commands to enable it on R1 as the source of the tcpConnect:
ip sla monitor 2
 type tcpConnect dest-ipaddr 192.168.12.2 dest-port 5000 source-ipaddr 192.168.12.1 source-port 5000
 timeout 1000
 frequency 10
ip sla monitor schedule 2 start-time now
And to confirm that it is working on R1:
R1#sh ip sla monitor collection-statistics 2
Entry number: 2
Start Time Index: *10:14:13.723 UTC Mon Apr 1 2002
Number of successful operations: 6
Number of operations over threshold: 0
Number of failed operations due to a Disconnect: 0
Number of failed operations due to a Timeout: 4
Number of failed operations due to a Busy: 0
Number of failed operations due to a No Connection: 0
Number of failed operations due to an Internal Error: 1
Number of failed operations due to a Sequence Error: 0
Number of failed operations due to a Verify Error: 0
Now to confirm that it is working on R2:
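Those counters are easier to reason about as a single success percentage. Here is a minimal sketch using the numbers from the output above; the function name and the dictionary layout are my own, not anything IOS provides.

```python
def success_ratio(successes, failures):
    """Fraction of operations that succeeded, or None if nothing ran yet.

    failures maps a failure reason to its count.
    """
    total = successes + sum(failures.values())
    if total == 0:
        return None
    return successes / total


# Counts copied from the collection-statistics output for entry 2.
failures = {"Timeout": 4, "Internal Error": 1}
ratio = success_ratio(6, failures)
print(f"{ratio:.1%} of 11 operations succeeded")  # prints 54.5% of 11 operations succeeded
```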
R2#sh ip sla monitor responder
IP SLA Monitor Responder is: Enabled
Number of control message received: 93
Number of errors: 0
Recent sources:
  192.168.12.1 [01:21:55.972 UTC Fri Mar 29 2002]
  192.168.12.1 [01:21:45.968 UTC Fri Mar 29 2002]
  192.168.12.1 [01:21:35.972 UTC Fri Mar 29 2002]
  192.168.12.1 [01:21:25.972 UTC Fri Mar 29 2002]
  192.168.12.1 [01:21:15.968 UTC Fri Mar 29 2002]
Recent error sources:
You can look up Cisco MIBs with Cisco's SNMP Object Navigator.
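The recent sources on the responder should line up with the frequency 10 configured on R1. A quick sketch to check the spacing between the time stamps; the function name is mine, and it assumes an English locale so the day and month abbreviations parse.

```python
import re
from datetime import datetime

# Pasted from the "Recent sources" section of the responder output above.
SAMPLE = """\
192.168.12.1 [01:21:55.972 UTC Fri Mar 29 2002]
192.168.12.1 [01:21:45.968 UTC Fri Mar 29 2002]
192.168.12.1 [01:21:35.972 UTC Fri Mar 29 2002]
192.168.12.1 [01:21:25.972 UTC Fri Mar 29 2002]
192.168.12.1 [01:21:15.968 UTC Fri Mar 29 2002]
"""


def probe_intervals(text):
    """Seconds between consecutive bracketed time stamps, oldest first."""
    stamps = []
    for raw in re.findall(r"\[([^\]]+)\]", text):
        # Drop the zone name; every stamp in the sample is UTC.
        cleaned = raw.replace(" UTC ", " ")
        stamps.append(datetime.strptime(cleaned, "%H:%M:%S.%f %a %b %d %Y"))
    stamps.sort()
    return [
        (later - earlier).total_seconds()
        for earlier, later in zip(stamps, stamps[1:])
    ]


for seconds in probe_intervals(SAMPLE):
    print(f"{seconds:.3f} s")
```

Every gap comes out within a few milliseconds of ten seconds, which matches the configured frequency.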
This is how you would download SNMP data from your router:
# snmpwalk -v 2c -c public 192.168.12.1 1.3.6.1.4.1.9.9.42.1.3.4.1.11.1
SNMPv2-SMI::enterprises.9.9.42.1.3.4.1.11.1.104057532 = Counter32: 329
And to make it more MRTG friendly:
# snmpwalk -v 2c -c public 192.168.12.1 1.3.6.1.4.1.9.9.42.1.3.4.1.11.1 | cut -d \: -f 4 | sed -e 's/ //g'
357
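The same extraction can be done in a few lines of Python, along with the four-line report that MRTG expects from an external script (two values, an uptime string, and a target name). This is a sketch, not the Perl script mentioned earlier; the function names, the placeholder uptime, and the target name are made up for illustration.

```python
def counter_value(snmpwalk_line):
    """Return the integer after the last colon of an snmpwalk output line."""
    _, _, value = snmpwalk_line.rpartition(":")
    return int(value.strip())


def mrtg_report(value, uptime="unknown", name="R1 HTTP.234"):
    """Build the four lines MRTG reads from an external script.

    The same value is reported for both variables because there is only
    one counter here; uptime and name are placeholder strings.
    """
    return "\n".join([str(value), str(value), uptime, name])


# Line copied from the snmpwalk output above.
line = "SNMPv2-SMI::enterprises.9.9.42.1.3.4.1.11.1.104057532 = Counter32: 329"
print(mrtg_report(counter_value(line)))
```

Splitting on the last colon, rather than counting fields the way cut does, keeps working if the OID is printed in a different form.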
I used the IP SLA documentation to help me configure SLA; it is also the source of the quote above.