Fighting Multipath

2013-11-19 08:44:59

[root@chevelle ~]# cat /etc/multipath/bindings
# Multipath bindings, Version : 1.0
# NOTE: this file is automatically maintained by the multipath program.
# You should not need to edit this file in normal circumstances.
#
# Format:
# alias wwid
#
mpath0 36a4badb021d20600133389a784a85226
mpath1 36a4badb000291e140000064f3aa78999
mpath2 36a4badb0002b75c6000006334bea77a3
mpath3 36a4badb000291e14000006983aa7b02e
mpath4 36a4badb0002b75c6000006364bea78a5
mpath5 36a4badb021d32c00132d9598938212dc
mpath6 36a4badb000291e1400001704411a6737
mpath7 36a4badb000291e1400001700411a636d
mpath8 36a4badb000291e1400001706411a67a7
mpath9 36a4badb000291e1400001702411a6421
[root@chevelle ~]# multipath -ll
mpath9 (36a4badb000291e1400001702411a6421) dm-4 DELL,MD3000
[size=136G][features=3 queue_if_no_path pg_init_retries 50][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=100][active]
\_ 2:0:0:3 sdj 8:144 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:0:3 sde 8:64 [active][ghost]
mpath8 (36a4badb000291e1400001706411a67a7) dm-3 DELL,MD3000
[size=136G][features=3 queue_if_no_path pg_init_retries 50][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=100][active]
\_ 2:0:0:2 sdi 8:128 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:0:2 sdd 8:48 [active][ghost]
mpath7 (36a4badb000291e1400001700411a636d) dm-2 DELL,MD3000
[size=10M][features=3 queue_if_no_path pg_init_retries 50][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=100][active]
\_ 2:0:0:1 sdh 8:112 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:0:1 sdc 8:32 [active][ghost]
mpath6 (36a4badb000291e1400001704411a6737) dm-1 DELL,MD3000
[size=10M][features=3 queue_if_no_path pg_init_retries 50][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=100][active]
\_ 2:0:0:0 sdg 8:96 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:0:0 sdb 8:16 [active][ghost]
[root@chevelle ~]# man mkqdisk
[root@chevelle ~]# mkqdisk -L
mkqdisk v0.6.0
/dev/dm-5:
/dev/mapper/mpath6p1:
/dev/mpath/mpath6p1:
Magic: eb7a62c2
Label: qdisk
Created: Thu Jun 3 18:40:33 2010
Host: chevelle
Kernel Sector Size: 512
Recorded Sector Size: 512

[root@chevelle ~]# fdisk -l

Disk /dev/sda: 146.1 GB, 146163105792 bytes
255 heads, 63 sectors/track, 17769 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sda1 * 1 128 1020127+ 83 Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2 128 651 4200997+ 82 Linux swap / Solaris
Partition 2 does not end on cylinder boundary.
/dev/sda3 651 17769 137500335 8e Linux LVM

Disk /dev/sdf: 20 MB, 20971520 bytes
64 heads, 32 sectors/track, 20 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes

Disk /dev/sdf doesn’t contain a valid partition table

Disk /dev/sdg: 10 MB, 10485760 bytes
255 heads, 63 sectors/track, 1 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdg1 1 1 8001 83 Linux

Disk /dev/sdh: 10 MB, 10485760 bytes
64 heads, 32 sectors/track, 10 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes

Disk /dev/sdh doesn’t contain a valid partition table

Disk /dev/sdi: 146.2 GB, 146267963392 bytes
255 heads, 63 sectors/track, 17782 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdi doesn’t contain a valid partition table

Disk /dev/sdj: 146.2 GB, 146267963392 bytes
255 heads, 63 sectors/track, 17782 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdj doesn’t contain a valid partition table

Disk /dev/sdk: 20 MB, 20971520 bytes
64 heads, 32 sectors/track, 20 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes

Disk /dev/sdk doesn’t contain a valid partition table

Disk /dev/dm-1: 10 MB, 10485760 bytes
255 heads, 63 sectors/track, 1 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/dm-1p1 1 1 8001 83 Linux

Disk /dev/dm-2: 10 MB, 10485760 bytes
255 heads, 63 sectors/track, 1 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/dm-2 doesn’t contain a valid partition table

Disk /dev/dm-3: 146.2 GB, 146267963392 bytes
255 heads, 63 sectors/track, 17782 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/dm-3 doesn’t contain a valid partition table

Disk /dev/dm-4: 146.2 GB, 146267963392 bytes
255 heads, 63 sectors/track, 17782 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/dm-4 doesn’t contain a valid partition table

Disk /dev/dm-5: 8 MB, 8193024 bytes
255 heads, 63 sectors/track, 0 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/dm-5 doesn’t contain a valid partition table
[root@chevelle ~]# mount
/dev/mapper/vg00-root on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/mapper/vg00-var on /var type ext3 (rw)
/dev/mapper/vg00-usr on /usr type ext3 (rw)
/dev/mapper/vg00-usrlocal on /usr/local type ext3 (rw)
/dev/mapper/vg00-home on /home type ext3 (rw)
/dev/mapper/vg00-opt on /opt type ext3 (rw)
/dev/mapper/vg00-tmp on /tmp type ext3 (rw)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
/dev/mapper/vg00-lvpatrol on /patrol type ext3 (rw)
/dev/mapper/vg00-clusterlv on /Cluster_Scripts type ext3 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
//172.22.225.73/nwapex/data on /opt/nwapex type cifs (rw,mand)
//172.22.100.127/mirth on /opt/mirth type cifs (rw,mand)
//172.22.111.87/data on /opt/bn1 type cifs (rw,mand)
//172.22.225.130/kronos/InterfaceDesigner/Interface Source Files on /opt/kronos type cifs (rw,mand)
//172.22.41.201/Company on /opt/proscript type cifs (rw,mand)
//172.22.100.244/ASD on /opt/murphy type cifs (rw,mand)
//172.22.100.252/StarData on /opt/epsi type cifs (rw,mand)
nfsd on /proc/fs/nfsd type nfsd (rw)
none on /sys/kernel/config type configfs (rw)
/dev/mapper/hbovg-hbo on /hbo type ext3 (rw)
/dev/mapper/hbovg-hboc on /hboc type ext3 (rw)
/dev/mapper/hbovg-mis on /mis type ext3 (rw)
/dev/mapper/hbovg-temphbo on /temphbo type ext3 (rw)
[root@chevelle ~]# ls -l /dev/disk/by-uuid
total 0
lrwxrwxrwx 1 root root 10 Nov 13 03:47 8ebe73bc-0939-401d-b3e4-1d193e433abe -> ../../sda1
[root@chevelle ~]# ls -l /dev/disk/by-
by-id/ by-label/ by-path/ by-uuid/
[root@chevelle ~]# ls -l /dev/disk/by-path/
total 0
lrwxrwxrwx 1 root root 9 Nov 13 03:47 pci-0000:00:1f.2-scsi-0:0:0:0 -> ../../sr0
lrwxrwxrwx 1 root root 9 Nov 13 03:47 pci-0000:03:00.0-scsi-0:2:0:0 -> ../../sda
lrwxrwxrwx 1 root root 10 Nov 13 03:47 pci-0000:03:00.0-scsi-0:2:0:0-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Nov 13 03:47 pci-0000:03:00.0-scsi-0:2:0:0-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 Nov 13 03:47 pci-0000:03:00.0-scsi-0:2:0:0-part3 -> ../../sda3
lrwxrwxrwx 1 root root 9 Nov 13 03:47 pci-0000:08:08.0-sas-0x50026b9139522b00:4:0-0x5a4badb42b75c60c:0 -> ../../sdc
lrwxrwxrwx 1 root root 9 Nov 13 03:47 pci-0000:0a:08.0-sas-0x50026b9139525800:4:0-0x5a4badb4291e140c:0 -> ../../sdg
lrwxrwxrwx 1 root root 10 Nov 13 03:47 pci-0000:0a:08.0-sas-0x50026b9139525800:4:0-0x5a4badb4291e140c:0-part1 -> ../../sdg1
[root@chevelle ~]# ls -l /dev/disk/by-
by-id/ by-label/ by-path/ by-uuid/
[root@chevelle ~]# ls -l /dev/disk/by-label/
total 0
lrwxrwxrwx 1 root root 10 Nov 13 03:47 boot -> ../../sda1
[root@chevelle ~]# ls -l /dev/disk/by-id/
total 0
lrwxrwxrwx 1 root root 9 Nov 13 03:47 scsi-36a4badb000291e1400001700411a636d -> ../../sdc
lrwxrwxrwx 1 root root 9 Nov 13 03:47 scsi-36a4badb000291e1400001702411a6421 -> ../../sde
lrwxrwxrwx 1 root root 9 Nov 13 03:47 scsi-36a4badb000291e1400001704411a6737 -> ../../sdb
lrwxrwxrwx 1 root root 10 Nov 13 03:47 scsi-36a4badb000291e1400001704411a6737-part1 -> ../../sdg1
lrwxrwxrwx 1 root root 9 Nov 13 03:47 scsi-36a4badb000291e1400001706411a67a7 -> ../../sdd
lrwxrwxrwx 1 root root 9 Nov 13 03:47 scsi-36a4badb0002b75c6000015fa525d18aa -> ../../sdf
lrwxrwxrwx 1 root root 9 Nov 13 03:47 scsi-36a4badb021d32c00132d9598938212dc -> ../../sda
lrwxrwxrwx 1 root root 10 Nov 13 03:47 scsi-36a4badb021d32c00132d9598938212dc-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Nov 13 03:47 scsi-36a4badb021d32c00132d9598938212dc-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 Nov 13 03:47 scsi-36a4badb021d32c00132d9598938212dc-part3 -> ../../sda3
[root@chevelle ~]# ls -l /dev/disk/by-label/
total 0
lrwxrwxrwx 1 root root 10 Nov 13 03:47 boot -> ../../sda1
[root@chevelle ~]# ls -l /dev/disk/by-path/
total 0
lrwxrwxrwx 1 root root 9 Nov 13 03:47 pci-0000:00:1f.2-scsi-0:0:0:0 -> ../../sr0
lrwxrwxrwx 1 root root 9 Nov 13 03:47 pci-0000:03:00.0-scsi-0:2:0:0 -> ../../sda
lrwxrwxrwx 1 root root 10 Nov 13 03:47 pci-0000:03:00.0-scsi-0:2:0:0-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Nov 13 03:47 pci-0000:03:00.0-scsi-0:2:0:0-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 Nov 13 03:47 pci-0000:03:00.0-scsi-0:2:0:0-part3 -> ../../sda3
lrwxrwxrwx 1 root root 9 Nov 13 03:47 pci-0000:08:08.0-sas-0x50026b9139522b00:4:0-0x5a4badb42b75c60c:0 -> ../../sdc
lrwxrwxrwx 1 root root 9 Nov 13 03:47 pci-0000:0a:08.0-sas-0x50026b9139525800:4:0-0x5a4badb4291e140c:0 -> ../../sdg
lrwxrwxrwx 1 root root 10 Nov 13 03:47 pci-0000:0a:08.0-sas-0x50026b9139525800:4:0-0x5a4badb4291e140c:0-part1 -> ../../sdg1
[root@chevelle ~]# ls -l /dev/disk/by-uuid/
total 0
lrwxrwxrwx 1 root root 10 Nov 13 03:47 8ebe73bc-0939-401d-b3e4-1d193e433abe -> ../../sda1
[root@chevelle ~]# ls /dev/mapper/
control hbovg-h1686n1.v01a hbovg-h1686n1.v06a hbovg-h1686n1.v11a hbovg-hbo mpath6p1 vg00-home vg00-usr
hbovg-h1686n1.bila hbovg-h1686n1.v02a hbovg-h1686n1.v07a hbovg-h1686n1.v12a hbovg-hboc mpath7 vg00-lvpatrol vg00-usrlocal
hbovg-h1686n1.jn1a hbovg-h1686n1.v03a hbovg-h1686n1.v08a hbovg-h1686n1.v13a hbovg-mis mpath8 vg00-opt vg00-var
hbovg-h1686n1.jn2a hbovg-h1686n1.v04a hbovg-h1686n1.v09a hbovg-h1686n1.v14a hbovg-temphbo mpath9 vg00-root
hbovg-h1686n1.v00a hbovg-h1686n1.v05a hbovg-h1686n1.v10a hbovg-h1686n1.v15a mpath6 vg00-clusterlv vg00-tmp
[root@chevelle ~]# blkid
/dev/mapper/vg00-tmp: LABEL="/tmp" UUID="7f389f25-cd20-4b24-ac68-04e9af0ebd04" TYPE="ext3"
/dev/mapper/vg00-opt: LABEL="/opt" UUID="97e5e00c-0ac4-4821-8248-1cba50920e9b" TYPE="ext3"
/dev/mapper/vg00-home: LABEL="/home" UUID="849f2a65-05a6-42b1-af8c-7ead1b33fb9f" TYPE="ext3"
/dev/mapper/vg00-usrlocal: LABEL="/usr/local" UUID="d0c46c4c-2a2c-4c4b-a1b8-d8a83498b5d9" TYPE="ext3"
/dev/mapper/vg00-usr: LABEL="/usr" UUID="64887754-c0bc-442b-9b48-f785aa5a0c5c" TYPE="ext3"
/dev/mapper/vg00-var: LABEL="/var" UUID="fed3d412-6b77-4014-b8a9-17471c922399" TYPE="ext3"
/dev/mapper/vg00-root: LABEL="/" UUID="19e1edd4-ac66-4c2e-8c26-2a8555539b65" TYPE="ext3"
/dev/sda2: TYPE="swap"
/dev/sda1: LABEL="/boot" UUID="8ebe73bc-0939-401d-b3e4-1d193e433abe" TYPE="ext3"
/dev/vg00/root: UUID="19e1edd4-ac66-4c2e-8c26-2a8555539b65" TYPE="ext3" LABEL="/"
/dev/scd0: LABEL="MD3000_2.2.0.17" TYPE="iso9660"
/dev/mapper/vg00-clusterlv: UUID="485aea89-b699-49d5-8d87-bae50f80c9e7" TYPE="ext3"
/dev/mapper/vg00-lvpatrol: LABEL="/patrol" UUID="86ff22dd-d895-4a2f-beca-f3cc8b5e7bd0" TYPE="ext3"
/dev/dvd: LABEL="MD3000_2.2.0.17" TYPE="iso9660"
/dev/sr0: LABEL="MD3000_2.2.0.17" TYPE="iso9660"
/dev/mapper/hbovg-hbo: UUID="cf90e0da-30b7-41b6-a73d-5e39bdddd013" TYPE="ext3"
/dev/mapper/hbovg-hboc: UUID="43cade5f-ab81-4a7d-b134-ad0917c999e3" TYPE="ext3"
/dev/mapper/hbovg-mis: UUID="14265200-9022-496b-bc6f-8ae3e00c3f13" TYPE="ext3"
/dev/mapper/hbovg-temphbo: UUID="8f1af44c-2207-49ed-91f9-c44283794713" TYPE="ext3"
[root@chevelle ~]#
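
The transcript above shows the same WWIDs in two places: the bindings file pairs each one with an mpath alias, and /dev/disk/by-id names each path scsi-<WWID>. A minimal sketch of a lookup that ties the two together follows. The function name alias_for_wwid is hypothetical, and the demo uses a temporary file holding two lines copied from the bindings file rather than the live /etc/multipath/bindings.

```shell
#!/bin/sh
# Hypothetical helper: print the multipath alias bound to a WWID.
# $1 = WWID, $2 = bindings file (defaults to the path shown in the transcript)
alias_for_wwid() {
    awk -v wwid="$1" '
        /^#/ || NF < 2 { next }                      # skip comments and blank lines
        $2 == wwid     { print $1; found = 1; exit } # first match wins
        END            { exit !found }               # non-zero status when nothing matched
    ' "${2:-/etc/multipath/bindings}"
}

# Demo against two lines copied from the bindings file above.
demo=$(mktemp)
printf '%s\n' '# alias wwid' \
    'mpath6 36a4badb000291e1400001704411a6737' \
    'mpath9 36a4badb000291e1400001702411a6421' > "$demo"
alias_for_wwid 36a4badb000291e1400001704411a6737 "$demo"
rm -f "$demo"
```

To look up a by-id entry, strip the scsi- prefix from the symlink name and pass the remainder as the WWID.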

Posted in Linux | Leave a comment

Upgrading ISC Bind and DHCP

Our secondary DNS and DHCP server died last Sunday. Apart from a few people noticing that some services were slower on the network, it was a non-event, and that is a good thing. Rather than just restoring the old server, we decided to upgrade the OS to the latest version of Red Hat and move DHCP and DNS to whatever versions were supported on that Red Hat release. I realize that is the easy way out, but we used to run a hand-compiled version and I just did not see the advantage. I am going to take the time to document the upgrade process for those planning their own upgrade.

After NS2 died, the primary DHCP server started to run out of leases because the peer held all of the free leases, so we told the primary that its peer was down. Make sure you have an omapi port defined in your dhcpd.conf file:

# This is for omshell
omapi-port 7911

From this site we got the basics for the following script:

omshell << EOF
connect
new failover-state
set name = "dhcp-failover"
open
set local-state = 2
update
EOF

Here are the options for setting the failover state in omshell:

/* A failover peer's running state. */
enum failover_state {
unknown_state			=  0, /* XXX: Not a standard state. */
startup				=  1,
normal				=  2,
communications_interrupted	=  3,
partner_down			=  4,
potential_conflict		=  5,
recover				=  6,
paused				=  7,
shut_down			=  8,
recover_done			=  9,
resolution_interrupted		= 10,
conflict_done			= 11,
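
To avoid hand-editing the state number each time, the omshell input can be generated from a state name. This is a minimal sketch, not the script from the site mentioned above: the function names are hypothetical, the numbers are copied from the listing above and may differ between DHCP releases, and the peer name "dhcp-failover" and port 7911 are the ones used earlier in this post. It only prints the commands; nothing is sent to the server.

```shell
#!/bin/sh
# Map a failover state name to its number, using the values from the listing above.
# Treat these numbers as an assumption for any other DHCP release.
failover_state_number() {
    case "$1" in
        startup)                    echo 1 ;;
        normal)                     echo 2 ;;
        communications_interrupted) echo 3 ;;
        partner_down)               echo 4 ;;
        potential_conflict)         echo 5 ;;
        recover)                    echo 6 ;;
        paused)                     echo 7 ;;
        shut_down)                  echo 8 ;;
        recover_done)               echo 9 ;;
        resolution_interrupted)     echo 10 ;;
        conflict_done)              echo 11 ;;
        *) echo "unknown failover state: $1" >&2; return 1 ;;
    esac
}

# Print the omshell commands for one state change.
# $1 = failover peer name from dhcpd.conf, $2 = state name, $3 = omapi port
failover_commands() {
    number=$(failover_state_number "$2") || return 1
    cat <<EOF
port ${3:-7911}
connect
new failover-state
set name = "$1"
open
set local-state = $number
update
EOF
}

# Review the generated commands first.
failover_commands dhcp-failover partner_down 7911
```

Once the output looks right, it can be piped into omshell on the DHCP server in place of the here-document.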

Here are all of the DNS/DHCP servers we built for the upgrade:
NS1 — Primary server that needed to be upgraded, physical machine.
NS2 — Secondary server, DOA physical machine.
NS3 — Temporary secondary server, virtual machine.
NS4 — New primary DNS/DHCP server, physical machine.
NS5 — New test primary DNS/DHCP server, virtual machine.
NS6 — New test secondary DNS/DHCP server, virtual machine.

The plan was to test the upgrade process on NS5 and NS6 while one of the other team members built NS4. This may look like overkill, but let me explain the rationale behind each server. After the failure of NS2, the first thing we did was stand up a third DNS server, NS3, as a new secondary so that we had a live copy of all of our zones should something happen to our primary DNS. Initially we turned on DHCP for this server as well, but because the failover protocol versions differed between the servers, we left only DNS running. The failover protocols in versions 3.0 and 3.1 are different enough that they are not compatible. This server was not actually being queried by end users but was there as a failsafe should we need one. It has been left running as an immediate option for the future.

Once we had a secondary server that would maintain current state, we started building servers for the upgrade process. NS4 would eventually become the new primary DNS/DHCP server and is a physical machine. When it was brought online it was first a secondary server to NS1 so that it had a complete DNS database; it was then promoted to be the new NS1. Because a physical machine takes so much longer to build, we quickly spun up NS5 and NS6 as test servers. The plan was to test on NS5 and NS6, promote NS6 to be the new NS2 and convert NS4 to the new NS1. The reason we didn’t just build and move was that we did not want to have to change our IP helper addresses throughout our network.

Here is a step-by-step outline of the actual go live.

1. Build NS4 as secondary DNS server to NS1 so that it has a copy of the DNS database and we don’t have to copy files from NS1.

2. Secure shell into each of the servers to be worked on during this time.
ssh into ns1 on the backup NIC.
ssh into ns2 on the backup NIC.
ssh into ns4 on the backup NIC.
We have a dedicated network for backup traffic. I connected through the backup NIC so that I could manipulate the primary addresses without losing connectivity to the servers.

3. Stop DHCP on NS1 and copy the lease database to the other servers.
service dhcpd stop
scp /var/state/dhcp/dhcpd.ad.leases root@ns2.chainringcircus.org:/var/state/dhcp/dhcpd.leases
scp /var/state/dhcp/dhcpd.ad.leases root@ns4.chainringcircus.org:/var/state/dhcp/dhcpd.leases

4. Shut the interfaces on NS1 before taking down DNS.
ifconfig eth0 down
ifconfig eth1 down

5. Start DHCP on NS2 so that clients can still get leases while NS1 is down.
service dhcpd start

6. Shut down DNS on NS1.
rndc freeze    # then make sure there are no .jnl files left
service named stop

7. Convert NS4 to NS1. We left the old NS1 up for now in case we needed to copy files or bring this server back online.
Change the hostname in /etc/sysconfig/network to ns1.chainringcircus.org

Change the addresses from NS4 to NS1
cp ~/DNS-Primary/ifcfg-eth0 /etc/sysconfig/network-scripts/
cp ~/DNS-Primary/ifcfg-eth1 /etc/sysconfig/network-scripts/
cp ~/DNS-Primary/named.conf.primary /etc/named.conf

8. We tested to make sure everything was running correctly and then rebooted the new NS1 to make sure it came up correctly.
shutdown -r now

9. Shut down the old NS1 for the last time.
shutdown -h now
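
The DNS shutdown step asks you to make sure no .jnl files are left before stopping named. A minimal sketch of that check is below. The function names are hypothetical, and /var/named is only an assumed default for the zone directory; it is not taken from this post. The demo runs against a throwaway directory so it is safe to try anywhere.

```shell
#!/bin/sh
# Hypothetical helper: list leftover BIND journal files under a zone directory.
# $1 = zone directory; /var/named is an assumed default, adjust for your layout.
leftover_journals() {
    find "${1:-/var/named}" -type f -name '*.jnl' 2>/dev/null
}

# Succeeds only when no journal files remain, so it can gate "service named stop".
journals_are_clean() {
    [ -z "$(leftover_journals "$1")" ]
}

# Demo against a throwaway directory holding one zone file and one journal.
zones=$(mktemp -d)
touch "$zones/example.org.db" "$zones/example.org.db.jnl"
journals_are_clean "$zones" || echo "journal files still present:"
leftover_journals "$zones"
rm -rf "$zones"
```

On the real server the check would sit between the two commands of that step, for example: rndc freeze && journals_are_clean /path/to/zones && service named stop.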

 


Storage Pod

At the Circus we just built our first Backblaze storage pod and I would like to take the time to document it. We rebuilt the server a number of times for testing and verification with different numbers of disks so output may differ throughout this post.

The cost per terabyte is right up our alley as we are a non-profit hospital. We tried to set ours up as a Windows server so it would have direct attached storage, but changed direction and decided to make it a Linux-based iSCSI target.

Disk Mapping
The first problem is mapping out the port multiplier backplanes. If you follow this link it shows the way the pod is supposed to be built; however, our drives did not map out accordingly. We mapped out our drives by literally shutting down, pulling a disk and turning the server back on to find the layout. If you don’t take the time to do this, I feel for you when a disk dies and you try to figure out which one to replace.

Boot Drives.
sd 0:0:0:0: [sda]
sd 1:0:0:0: [sdb]

First row from right.
sd 7:0:0:0: [sdh]
sd 7:1:0:0: [sdi]
sd 7:2:0:0: [sdj]
sd 7:3:0:0: [sdk]
sd 7:4:0:0: [sdl]

sd 6:0:0:0: [sdc]
sd 6:1:0:0: [sdd]
sd 6:2:0:0: [sde]
sd 6:3:0:0: [sdf]
sd 6:4:0:0: [sdg]

sd 8:0:0:0: [sdm]
sd 8:1:0:0: [sdn]
sd 8:2:0:0: [sdo]
sd 8:3:0:0: [sdp]
sd 8:4:0:0: [sdq]

Second row from right.
sd 11:0:0:0: [sdw]
sd 11:1:0:0: [sdx]
sd 11:2:0:0: [sdy]
sd 11:3:0:0: [sdz]
sd 11:4:0:0: [sdaa]

sd 10:0:0:0: [sdr]
sd 10:1:0:0: [sds]
sd 10:2:0:0: [sdt]
sd 10:3:0:0: [sdu]
sd 10:4:0:0: [sdv]

sd 12:0:0:0: [sdab]
sd 12:1:0:0: [sdac]
sd 12:2:0:0: [sdad]
sd 12:3:0:0: [sdae]
sd 12:4:0:0: [sdaf]

Third row from right.
sd 14:0:0:0: [sdag]
sd 14:1:0:0: [sdah]
sd 14:2:0:0: [sdai]
sd 14:3:0:0: [sdaj]
sd 14:4:0:0: [sdak]

sd 15:0:0:0: [sdal]
sd 15:1:0:0: [sdam]
sd 15:2:0:0: [sdan]
sd 15:3:0:0: [sdao]
sd 15:4:0:0: [sdap]

sd 16:0:0:0: [sdaq]
sd 16:1:0:0: [sdar]
sd 16:2:0:0: [sdas]
sd 16:3:0:0: [sdat]
sd 16:4:0:0: [sdau]
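
A listing like the one above can be pulled out of the kernel log instead of being typed by hand. The sketch below is one way to do it, assuming dmesg lines start with "sd H:C:T:L: [sdX]" as they do in this post (no timestamp prefix). The function name scsi_map is hypothetical, and the demo feeds it three sample lines rather than real dmesg output.

```shell
#!/bin/sh
# Hypothetical helper: turn "sd H:C:T:L: [sdX] ..." kernel lines read from stdin
# into one line per drive, sorted numerically by SCSI host and then channel.
scsi_map() {
    grep -E '^sd [0-9]+:[0-9]+:[0-9]+:[0-9]+: \[sd[a-z]+\]' |
        sed -E 's/^sd ([0-9]+):([0-9]+):([0-9]+):([0-9]+): \[(sd[a-z]+)\].*/\1 \2 \5/' |
        sort -u |
        sort -k1,1n -k2,2n |
        awk '{ printf "host %s channel %s -> /dev/%s\n", $1, $2, $3 }'
}

# Demo with three sample lines; on a live system use: dmesg | scsi_map
printf '%s\n' \
    'sd 7:1:0:0: [sdi] Attached SCSI disk' \
    'sd 6:0:0:0: [sdc] Attached SCSI disk' \
    'sd 7:0:0:0: [sdh] Attached SCSI disk' | scsi_map
```

The output still has to be matched to physical slots by pulling disks, as described above; the script only saves the transcription step.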

Disk Setup
The next problem is that fdisk will not handle partitions larger than 2TB, so parted comes to the rescue. Because there were forty-five 4TB disks in the server, I did not want to partition them manually. The other problem was that we had also tested the server as a Windows server, so it already had partitions on the disks. As a result we had to remove the old partitions, then create a new one. Luckily you can script parted. Please note that parts of the script are commented out because we ran the script multiple times for different setups.

for I in `dmesg|grep ^sd|cut -d \  -f 1,2,3|grep -v Attach |sort -u | cut -d [ -f 2 | cut -d ] -f 1 `; do echo /dev/${I}\ ; done >>devices-list.txt

cat /usr/local/bin/parted-script.sh 
#!/bin/sh
for i in `cat devices-list.txt`
do
# delete previous partitions
#parted $i --script -- rm 1
#parted $i --script -- rm 2
#parted $i --script -- rm 3

# create partition to take whole disk
parted $i --script -- mkpart primary ext4 1 -1

# set type lvm for jbod
# parted $i --script -- set 1 lvm on

# set type RAID for RAID 6.
parted $i --script -- set 1 raid on

parted $i --script print
done
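
The 2TB limit mentioned above comes from the MBR partition table, which records sector counts in 32 bits. A short worked calculation shows where the number comes from. The function names are hypothetical, and the 512-byte sector size and the nominal 4,000,000,000,000-byte disk size are assumptions for illustration.

```shell
#!/bin/sh
# Largest size an MBR partition table can describe, in bytes.
# $1 = logical sector size in bytes (512 is assumed here).
mbr_limit_bytes() {
    # MBR stores sector counts in 32 bits, so at most 2^32 sectors.
    echo $(( 4294967296 * ${1:-512} ))
}

# True when a disk of $1 bytes is too large for MBR and needs a GPT label.
needs_gpt() {
    [ "$1" -gt "$(mbr_limit_bytes "${2:-512}")" ]
}

mbr_limit_bytes 512   # 2199023255552 bytes, i.e. 2 TiB
needs_gpt 4000000000000 && echo "4TB disk: needs a GPT label (parted mklabel gpt)"
```

Disks that already carry a GPT label keep it when partitions are removed and recreated; a blank disk would need the label written before mkpart.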

Create the RAID
The first time through we made all of the disks a JBOD to play, but long term that did not make sense. As a result I am only going to document creating a RAID 6 iSCSI target for Windows servers as this is the purpose of our storage pod.

I try not to do many tasks manually, so here is the workaround to avoid typing 45 disk names.

dmesg|grep ^sd|cut -d \  -f 1,2,3|grep -v Attach |sort -u | cut -d [ -f 2 | cut -d ] -f 1 >>devices.txt
for I in `cat devices.txt`; do  echo -n /dev/${I}1\ ; done >devices1.txt

This creates a file with all of the disk names.

cat devices1.txt 
/dev/sda1 /dev/sdc1 /dev/sdb1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdi1 /dev/sdj1 /dev/sdk1 /dev/sdl1 /dev/sdm1 /dev/sdn1 /dev/sdo1 /dev/sdp1 /dev/sdq1 /dev/sdr1 /dev/sds1 /dev/sdt1 /dev/sdu1 /dev/sdv1 /dev/sdw1 /dev/sdx1 /dev/sdy1 /dev/sdz1 /dev/sdaa1 /dev/sdab1 /dev/sdac1 /dev/sdad1 /dev/sdae1 /dev/sdaf1 

Create the different software RAID configurations. I created three RAID devices, md0, md1 and md2.

This mdadm command creates a RAID 6 container with 14 physical disks and one spare. We were being cautious with our data.

mdadm --create --verbose /dev/md1 --level=6 --chunk=512 --raid-devices=14 --spare-devices=1 /dev/sdr1 /dev/sds1 /dev/sdt1 /dev/sdu1 /dev/sdv1 /dev/sdw1 /dev/sdx1 /dev/sdy1 /dev/sdz1 /dev/sdaa1 /dev/sdab1 /dev/sdac1 /dev/sdad1 /dev/sdae1 /dev/sdaf1

These mdadm commands create a RAID 6 container from all 15 physical disks in each row. I used this configuration for testing the throughput later.


mdadm --create --verbose /dev/md0 --level=6 --chunk=512 --raid-devices=15 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdh1 /dev/sdg1 /dev/sdi1 /dev/sdk1 /dev/sdj1 /dev/sdl1 /dev/sdm1 /dev/sdn1 /dev/sdo1 /dev/sdp1 /dev/sdq1 

mdadm --create --verbose /dev/md1 --level=6 --chunk=512 --raid-devices=15 /dev/sdr1 /dev/sds1 /dev/sdt1 /dev/sdu1 /dev/sdv1 /dev/sdw1 /dev/sdx1 /dev/sdy1 /dev/sdz1 /dev/sdaa1 /dev/sdab1 /dev/sdad1 /dev/sdac1 /dev/sdae1 /dev/sdaf1

mdadm --create --verbose /dev/md2 --level=6 --chunk=512 --raid-devices=15 /dev/sdag1 /dev/sdah1 /dev/sdai1 /dev/sdaj1 /dev/sdak1 /dev/sdal1 /dev/sdam1 /dev/sdan1 /dev/sdao1 /dev/sdap1 /dev/sdaq1 /dev/sdar1 /dev/sdas1 /dev/sdat1 /dev/sdau1

If you are truly just building an iSCSI target, the next steps are pointless. I wanted to do a throughput test, so I had to lay down a file system, but once again there were problems. The mke2fs that ships with Red Hat has a 16TB size limit, so you need to build a newer version of e2fsprogs.

git clone git://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git
cd e2fsprogs
mkdir build ; cd build/
../configure
make
make install

mke2fs -O 64bit,has_journal,extents,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize -i 4194304 /dev/md0
mke2fs 1.43-WIP (22-Sep-2012)

Warning: the fs_type huge is not defined in mke2fs.conf

Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
11446336 inodes, 11721045504 blocks
586052275 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=13870563328
357698 block groups
32768 blocks per group, 32768 fragments per group
32 inodes per group
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
	4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968, 
	102400000, 214990848, 512000000, 550731776, 644972544, 1934917632, 
	2560000000, 3855122432, 5804752896

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done         

Next, mount it and test.

mount -t ext4 /dev/md0 /backup0

mount
/dev/mapper/vg_leroy-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/mapper/ddf1_Rootp1 on /boot type ext4 (rw)
/dev/mapper/vg_leroy-lv_home on /home type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/md0 on /backup0 type ext4 (rw)

watch cat /proc/mdstat 

Every 2.0s: cat /proc/mdstat                                                                                                                                                              Tue Nov 13 14:50:58 2012
md2 : active raid6 sdau1[14] sdat1[13] sdas1[12] sdar1[11] sdaq1[10] sdap1[9] sdao1[8] sdan1[7] sdam1[6] sdal1[5] sdak1[4] sdaj1[3] sdai1[2] sdah1[1] sdag1[0]
      50791197184 blocks super 1.2 level 6, 512k chunk, algorithm 2 [15/15] [UUUUUUUUUUUUUUU]
      [>....................]  resync =  0.0% (72704/3907015168) finish=5372.5min speed=12117K/sec
      
md1 : active raid6 sdaf1[14] sdae1[13] sdac1[12] sdad1[11] sdab1[10] sdaa1[9] sdz1[8] sdy1[7] sdx1[6] sdw1[5] sdv1[4] sdu1[3] sdt1[2] sds1[1] sdr1[0]
      50791197184 blocks super 1.2 level 6, 512k chunk, algorithm 2 [15/15] [UUUUUUUUUUUUUUU]
      [>....................]  resync =  0.0% (2583680/3907015168) finish=4776.4min speed=13623K/sec
      
md0 : active raid6 sdq1[14] sdp1[13] sdo1[12] sdn1[11] sdm1[10] sdl1[9] sdj1[8] sdk1[7] sdi1[6] sdg1[5] sdh1[4] sdf1[3] sde1[2] sdd1[1] sdc1[0]
      50791197184 blocks super 1.2 level 6, 512k chunk, algorithm 2 [15/15] [UUUUUUUUUUUUUUU]
      [>....................]  resync =  0.0% (3255892/3907015168) finish=5886.7min speed=11052K/sec
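
The block counts in /proc/mdstat can be checked by hand: RAID 6 gives up two members' worth of space to parity, so the usable size is the member count minus two, multiplied by the size of one member. A small sketch of that arithmetic follows, using the per-member size of 3907015168 1K blocks from the resync lines above. The function name is hypothetical.

```shell
#!/bin/sh
# Usable size of a RAID 6 array in 1K blocks.
# $1 = number of active member devices, $2 = size of one member in 1K blocks
raid6_usable_blocks() {
    echo $(( ($1 - 2) * $2 ))
}

raid6_usable_blocks 15 3907015168   # 50791197184, the figure mdstat reports for each array
raid6_usable_blocks 14 3907015168   # 46884182016 for the 14-disk layout with one spare
```

The spare in the 14-disk layout contributes no capacity until it replaces a failed member, which is why it is left out of the member count.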

Finally you need to save the software raid configuration.

mdadm --detail --scan >> /etc/mdadm.conf

Testing
I wanted to try a throughput test so I copied a CD over to the server. We just weren’t getting enough throughput with the reads and writes so I decided to create a ramdisk, read from it and write to the filesystem.

Create the ramdisk.

ls -alh /dev/ram*
mknod -m 660 /dev/ramdisk b 1 1
chown root.disk /dev/ramdisk
dd if=/dev/zero of=/dev/ramdisk bs=1k count=4194304
/sbin/mkfs -t ext2 -m 0 /dev/ramdisk 16384
mkdir /ramdisk
mount -t ext2 /dev/ramdisk /ramdisk
dd if=/dev/urandom of=/ramdisk/file.txt bs=1k count=15k
ls -alh /ramdisk/

Now copy the 15 MB file from the ramdisk 500,000 times. I ran this script for /backup0, /backup1 and /backup2.

for i in `jot -s 1 -e 500000`; do  cp /ramdisk/file.txt /backup0/test0-${i}; done

And the test output: in one minute each array grew by about 4GB, roughly 12GB written in total.

date && df -h
Sat Nov 10 16:29:49 CST 2012
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_leroy-lv_root
                       50G  3.2G   44G   7% /
tmpfs                 3.9G     0  3.9G   0% /dev/shm
/dev/mapper/ddf1_Rootp1
                      485M   37M  423M   8% /boot
/dev/mapper/vg_leroy-lv_home
                      236G  188M  224G   1% /home
/dev/md0               48T   78G   45T   1% /backup0
/dev/md1               48T   84G   45T   1% /backup1
/dev/md2               48T   78G   45T   1% /backup2
/dev/ramdisk           16M   16M  302K  99% /ramdisk


date && df -h
Sat Nov 10 16:30:49 CST 2012
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_leroy-lv_root
                       50G  3.2G   44G   7% /
tmpfs                 3.9G     0  3.9G   0% /dev/shm
/dev/mapper/ddf1_Rootp1
                      485M   37M  423M   8% /boot
/dev/mapper/vg_leroy-lv_home
                      236G  188M  224G   1% /home
/dev/md0               48T   82G   45T   1% /backup0
/dev/md1               48T   88G   45T   1% /backup1
/dev/md2               48T   82G   45T   1% /backup2
/dev/ramdisk           16M   16M  302K  99% /ramdisk
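
The two df snapshots above are 60 seconds apart, so the growth in the Used column gives a rough write rate. The sketch below does that arithmetic with the values for /backup0, /backup1 and /backup2. The function names are hypothetical, and because df -h rounds to whole gigabytes the result is only approximate.

```shell
#!/bin/sh
# Sum the growth across arrays; arguments are "before after" pairs in GB.
total_growth_gb() {
    total=0
    while [ "$#" -ge 2 ]; do
        total=$(( total + $2 - $1 ))
        shift 2
    done
    echo "$total"
}

# Average rate in MB/s for $1 GB written over $2 seconds (integer division).
rate_mb_per_s() {
    echo $(( $1 * 1024 / $2 ))
}

written=$(total_growth_gb 78 82 84 88 78 82)   # md0, md1, md2
echo "written: ${written}G in 60s, about $(rate_mb_per_s "$written" 60) MB/s"
```

Keep in mind the arrays were still resyncing during this run, so the figure includes that background load.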

And the iostat output while it was writing.

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.22    0.00    6.40   55.14    0.00   38.23

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               2.18        73.97       117.05    6527092   10328224
sdb               0.79         0.06       117.05       5536   10328224
sdc              87.69       882.70     82847.18   77890479 7310556800
sdd              86.17       715.18     82822.85   63108394 7308410360
sde              11.11       714.80      6402.75   63074967  564987674
sdf               4.92       714.24       135.85   63025858   11987866
sdh               5.16       714.32       134.02   63032209   11825930
sdg               5.10       714.65       135.50   63062197   11956714
sdi               4.98       714.30       133.72   63030809   11799450
sdk               4.92       714.45       133.54   63044265   11784026
sdj               4.95       714.24       133.88   63025249   11813618
sdl               5.02       714.18       134.05   63020313   11828514
sdm               5.08       714.06       133.96   63009609   11821122
sdn               5.00       714.15       133.74   63017213   11801082
sdo               4.97       714.16       133.85   63018757   11811058
sdp               4.62       714.38       130.34   63038351   11501106
sdq               4.59       714.04       128.39   63007996   11329122
sdr               4.82       784.45        45.74   69221236    4036394
sds               4.78       784.46        46.01   69221608    4060330
sdt               4.79       784.55        47.90   69229964    4226866
sdu               4.79       784.75        46.05   69247648    4063482
sdv               4.75       784.68        45.86   69241532    4046386
sdw               4.81       784.77        45.80   69249556    4041530
sdx               4.78       784.75        45.90   69247718    4050298
sdy               4.76       784.88        45.84   69259062    4045058
sdz               4.77       784.79        45.92   69251000    4052066
sdaa              4.75       784.69        45.99   69242304    4058370
sdab              4.47       784.89        45.81   69259548    4042074
sdad              4.40       784.83        45.80   69254484    4041794
sdac              4.32       784.92        47.64   69262304    4203442
sdae              4.23       784.84        45.64   69255316    4027730
sdaf              4.12       784.88        45.60   69258620    4024146
sdag              4.42       702.93        41.19   62027358    3634226
sdah              4.37       702.73        41.38   62009962    3651746
sdai              4.37       702.79        41.67   62015092    3677450
sdaj              4.35       702.87        41.50   62022040    3661962
sdak              4.35       703.19        41.26   62050556    3640690
sdal              4.37       703.53        40.96   62080556    3614570
sdam              4.34       703.60        40.85   62086828    3604554
sdan              4.33       703.42        41.02   62070532    3620082
sdao              4.34       703.41        41.22   62069532    3637226
sdap              4.32       703.41        41.15   62069548    3631570
sdaq              4.08       703.40        41.29   62069444    3643514
sdar              4.01       703.10        41.58   62042804    3669194
sdas              3.94       702.87        41.55   62021960    3666570
sdat              3.85       703.25        40.92   62055866    3611258
sdau              3.77       703.06        40.93   62039220    3611930
dm-0             16.60        73.91       117.05    6521508   10328224
dm-1              0.05         0.37         0.00      32976        168
dm-2             16.51        73.25       117.04    6463516   10328056
dm-3              5.79        72.21        32.65    6372058    2880648
dm-4             10.60         0.36        84.40      32112    7447384
dm-5              0.04         0.31         0.00      26938         24
md0              54.27         0.56       433.70      49578   38270384
md1              67.95         0.56       543.17      49602   47930328
md2              60.73         0.56       485.41      49594   42832904

Create an iSCSI target
Once you create the iSCSI target and format the drive with a Windows file system, any data on the file system you created earlier is lost. Remember that with iSCSI you are presenting a target “physical” drive.

Install the iSCSI target utilities.

yum install scsi-target-utils

The iSCSI configuration file.

cat /etc/tgt/targets.conf
default-driver iscsi

# Parameters below are only global. They can't be configured per LUN.
# Only allow connections from 192.168.100.1 and 192.168.200.5
initiator-address 192.168.100.1
initiator-address 192.168.200.5

<target iqn.2012-11.org.eamc:leroy.target0>
	backing-store /dev/md0	
	write-cache off
	lun 11
</target>
<target iqn.2012-11.org.eamc:leroy.target1>
	backing-store /dev/md1
	write-cache off
	lun 12
</target>

Turn on tgtd.

chkconfig iptables off
chkconfig tgtd on
chkconfig tgtd --list

SMARTD
One of the guys on the team brought up that we should be doing some hard drive monitoring to make sure we knew if we were having trouble with a drive. As a result I installed smartmontools and configured the daemon to email when a drive starts to fail.

Install smartmontools.

yum install smartmontools

Edit the configuration file so smartd sends email. The first time, include -M test so that a test email is sent and you know mail delivery works.

cat /etc/smartd.conf
DEVICESCAN -a -I 194 -W 4,45,55 -R 5 -m jud@circus.org -M test

Start the smartd daemon.

chkconfig smartd on
service smartd start

Now go back and remove the -M test from the configuration file to make sure you don’t get emails every time the smartd daemon restarts. There are a number of configuration options, so read the /etc/smartd.conf file for a better understanding.

Some random commands:

mdadm --stop /dev/md124
mdadm --remove /dev/md124
mdadm --query --detail /dev/md1
mdadm --detail-platform
mdadm --monitor
mdadm --explain /dev/md0

Jud’s Rules of the Road

Each year, if not each new semester, we get a crop of new cyclists that come
out to ride. Some of them are “One Hit Wonders” and some ride with our group
for four years. They all learn a lot about the road and this is my attempt to
shorten the learning curve.

Pace line technique.

Rule 1: Never stop pedaling. I say that only half jokingly. When you are at
the front of a pace line of any size, two abreast or single file, never stop
pedaling, especially down hill. There are exceptions, long or steep descents,
but they are rare in Auburn. When you stop pedaling you slow down and riders
behind you have to put on their brakes. When was the last time you rode
someone off your wheel on a descent?

Rule 2: Pull through. When you are not as strong as the other riders in the
group you should still pull through even if your pull is only thirty seconds.
The reason is that when rider(s) pull off the front the progression needs to
be preserved. If you pull off with the rider(s) in front it confuses others
in the pace line. Pull through, do a short pull and then tell the riders
around you that you are pulling off. The riders around you can then adjust.

Rule 3: Pull through at the same pace as the person in front of you. On long
rides it’s better to pull at the prevailing pace of the group for ten minutes
than it is to bump up the speed 3 to 5 mph and only pull for 2 minutes. By
pulling longer you are getting a better work out and giving weaker riders a
longer break. At the faster speed you end up dropping someone and then you
have to wait at the next turn or stop sign. It didn’t get you anywhere any
faster. The group can only go the pace of the slowest rider and if you
protect that rider from the wind you will get to your destination faster.

Rule 4: Never half wheel in a two abreast pace line. I’ve never understood
why people have to try and be a half wheel ahead of me in a pace line,
especially when we’re trying to carry on a conversation. I’ve decided it has
to do with either lack of self confidence or lack of cycling etiquette. Which
one is your problem?

Rule 5: In a pace line never ride more than two feet behind someone. This
rule has caveats for speed and noobs. The point of a pace line is to keep a
tight formation for drafting, conversation amongst the group and to allow cars
to get around faster and safer. If you get too strung out you lose all three
of the benefits.

Rule 6: Never ride more than two abreast. If nothing else it’s the law, but
it makes you look like you don’t know the rules of the road or that you are a
rookie. You don’t have to ride next to someone to carry on a conversation or
listen in on one. You just have to ride a tight pace line.

Rule 7: If you consistently get buzzed by cars you are riding too far to the
right. I know it’s counter intuitive but a good friend named Mike Munk told
me that when we rode together in Montgomery, and he is right. If you take up
less of the road cars have little incentive to give you room and will pass you
with oncoming traffic, which is not safe for anyone involved. By moving to
the left you force cars to take you and oncoming traffic into account when
trying to pass.

Rule 8: We only drop our friends. That saying is on the back of a club
jersey from North Carolina and is not always right. Friendships are more
important than your average speed for a ride. Ten years ago I might have
argued this point with you but today I realize its truth. Some day you will
need help and others are more likely to help you if you are nice to them as
well.

Rule 9: Wait for riders in your group. This seems like a rehash of the rule
above but it’s not. Groups naturally form on the road, people of similar
ability find it more enjoyable and easier to ride together. Often it is
better to keep these groups rather than one large group. Don’t be afraid to
split a group into similar abilities, just make sure everyone is accounted for
and has a buddy.

Rule 10: Always carry a spare tube that fits the wheels you are riding. If
you need a long stem tube, make sure you have one.

Riding Hills:

Rule 11: Ride into the hill. Riders see a hill coming and slow down trying
to prepare for it. You will only get stronger riding into the hill rather
than being intimidated by it.

Rule 12: If you’re going to get dropped on a hill, move to the back of the
group or the far left before the hill starts. You will not disrupt the pace
line and if the group has read this list of rules you won’t get left either.

Rule 13: If you have to stand on a hill keep constant pressure on your pedals
when you stand. I have nearly been wrecked countless times by some rookie
throwing their bike when they stand on a hill because they just don’t know any
better. You can tell what caliber of rider someone is by how well they stand
on a hill.

Rule 14: If you need to shift while sitting on a grinding hill give two hard
pedal strokes, then soft pedal and shift. Less tension on the chain makes it
easier to shift.

Rule 15: If you need to shift while standing on a hill, shift on the apex of
the rocking motion. When standing I shift when my bike has been rocked to the
far right and my left pedal is not fully weighted. There is less tension on
the chain and it shifts easier.

Snot rockets:

Rule 16: Snot only at the back of the pace line or into your glove. I’ve
been snotted on enough to make me sick, don’t give someone a reason to tell
you off or get even.

Rule 17: Spit only at the back of the pace line or onto your shoe. You can
actually spit in a pace line if you’re careful about the wind and where you
spit, but most people don’t think hard enough about it. If you aim for your
shoe when you spit, traveling fast enough you will neither spit on yourself
nor on the person behind you.


Hacking the 7926G MIDlet Deployment

I really struggled with the deployment of MIDlets to the phone. What should have been simple became convoluted because I did not change a MIME type on the web server. As a result I emailed back and forth with my Cisco team to finally get it all worked out.

At first I developed my MIDlet on the emulator and tried to deploy it. When I ran into problems I was not sure if the errors were from the MIDlet or the environment, so I stepped back and tried to deploy a Cisco sample MIDlet. As a result I am going to walk through deploying the Cisco supplied DeviceSpecifics MIDlet. This is because you know it will work correctly on a 7926G so any errors or problems are of your own creation.

Phone Trace Settings
First you have to enable access to the phone from its web server. In CUCM select the phone from the phone list and under “Product Specific Configuration Layout” set “Web Access” to “Full” and “Phone Book Web Access” to “Allow Admin.” Your login will be username admin and password Cisco.

7926g-product-specific-configuration-layout

Next let’s set up the phone to log to your favorite logging server. Bring up your phone in a browser and choose Trace Settings. I set the “Java Module Trace Level” to “Debug” and checked the box to “Enable Remote Syslog.”

7926g-remote-logging
Web Server Settings
Page 37 of the Developer’s Guide explains how to set up different web servers. I am using Apache on RHEL 5.6, so I made a change to /etc/mime.types. When I first added the stanza to the file I did not notice that there was a second definition for “jar” and it wreaked havoc on my trouble shooting for days.

Add the following and be sure to comment out or remove the second definition.

text/vnd.sun.j2me.app-descriptor jad
application/java-archive jar
#application/x-java-archive     jar
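
A duplicate extension in /etc/mime.types is easy to miss by eye. As a rough sketch (the function name is hypothetical, and the parsing assumes the usual "type ext ext ..." layout with # comments), a few lines of Python can list every extension that is claimed by more than one MIME type:

```python
from collections import defaultdict


def find_duplicate_extensions(mime_types_text):
    """Return {extension: [mime types]} for extensions mapped more than once."""
    seen = defaultdict(list)
    for line in mime_types_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        mime_type, *extensions = line.split()
        for ext in extensions:
            if mime_type not in seen[ext]:
                seen[ext].append(mime_type)
    return {ext: types for ext, types in seen.items() if len(types) > 1}


# The stanza above, before the second "jar" definition is commented out.
sample = """\
text/vnd.sun.j2me.app-descriptor jad
application/java-archive jar
application/x-java-archive     jar
"""
print(find_duplicate_extensions(sample))
```

To check a real server, pass it the contents of the file, for example open("/etc/mime.types").read().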

If you do not remove that definition you will get an error similar to the one below in the logs from the phone if you set the logging up right. I did not have the logging set to a high enough setting to catch these errors.

May  2 12:39:12 172.22.16.10  SEP-java: *********************Installer - xmlVendor = Cisco Systems, Inc.
May  2 12:39:12 172.22.16.10  SEP-java: Installer - info.suiteVendor = Cisco Systems, Inc.
May  2 12:39:12 172.22.16.10  SEP-java: 1. Vendor Name are equal
May  2 12:39:12 172.22.16.10  SEP-java: Installer - info.suiteVersion = 0.0.1
May  2 12:39:12 172.22.16.10  SEP-java: TEST - Installer - checkPreviousVersion - there is no previous version - RETURN
May  2 12:39:12 172.22.16.10  SEP-java: Installer step:2
May  2 12:39:12 172.22.16.10  SEP-java: Installer step:3
May  2 12:39:12 172.22.16.10  SEP-java: Installer step:4
May  2 12:39:12 172.22.16.10  SEP-java: Installer step:5
May  2 12:39:12 172.22.16.10  SEP-java: microedition.profiles is Native System Property
May  2 12:39:12 172.22.16.10  SEP-java: microedition.configuration is Native System Property
May  2 12:39:13 172.22.16.10  SEP-java: MIDletProxyList:setForegroundMIDlet()
May  2 12:39:13 172.22.16.10  SEP-java: MIDletProxyList:setForegroundMIDlet()
May  2 12:39:13 172.22.16.10  SEP-java: MIDletProxyList:setForegroundMIDlet()
May  2 12:39:13 172.22.16.10  SEP-java: REPORT: <level:3> <channel:1000> ** Error installing suite (38): JAR did not have the correct media type, it had application/x-java-archive

And even though it does not like the JAR file it goes ahead and adds it to the MIDlet list, as seen from below. That is the reason I thought it was being installed, because you see it in the list on the phone, and see it access the .jar on the web server.

May  2 12:39:21 172.22.16.10  SEP-java: Provisioning=0
May  2 12:39:21 172.22.16.10  SEP-java: dump_svc_cfg: name[0]:Missed Calls
May  2 12:39:21 172.22.16.10  SEP-java: dump_svc_cfg: name[1]:Voicemail
May  2 12:39:21 172.22.16.10  SEP-java: dump_svc_cfg: name[2]:Received Calls
May  2 12:39:21 172.22.16.10  SEP-java: dump_svc_cfg: name[3]:Placed Calls
May  2 12:39:21 172.22.16.10  SEP-java: dump_svc_cfg: name[4]:Personal Directory
May  2 12:39:21 172.22.16.10  SEP-java: dump_svc_cfg: name[5]:Corporate Directory
May  2 12:39:21 172.22.16.10  SEP-java: dump_svc_cfg: name[6]:DeviceSpecifics

And from the web server.

172.22.16.10 - - [02/May/2012:12:43:14 -0500] "GET /7926G/DeviceSpecifics/DeviceSpecifics.jad HTTP/1.1" 200 279 "-" "Profile/MIDP-2.0 Configuration/CLDC-1.1"
172.22.16.10 - - [02/May/2012:12:43:16 -0500] "GET /7926G/DeviceSpecifics/DeviceSpecifics.jar HTTP/1.1" 200 6326 "-" "Profile/MIDP-2.0 Configuration/CLDC-1.1"

.JAD File
Page 32 of the Developer’s Guide walks you through setting up a .jad for distribution. It will be located in the “dist” directory of the DeviceSpecifics NetBeans project. According to the Cisco documentation this is how the .jad file matches up to the IP Phone Service Configuration found under
Device
–> Device Settings
—–> Phone Services

Service Name and MIDlet-Name must match exactly, including case and white space.

Service Version must match MIDlet-Version, or it can be left blank. If left empty the phone will attempt to download the JAD file every time it registers with CUCM as well as every time the MIDlet launches. If there is a version number, it will only download if the version number changes.

Service Type is set to Standard IP Phone Service.

Service Category is set to Java MIDlet.

Service URL is the URL where the .jad file is hosted.

ASCII Service Name and Service Description do not impact whether a MIDlet will run on a phone.

# cat DeviceSpecifics.jad
MIDlet-1: DeviceSpecifics,,com.cisco.sdk.specifics.PlatformMIDlet
MIDlet-Jar-Size: 6326
MIDlet-Jar-URL: DeviceSpecifics.jar
MIDlet-Name: DeviceSpecifics
MIDlet-Vendor: Cisco Systems, Inc.
MIDlet-Version: 0.0.1
MicroEdition-Configuration: CLDC-1.1
MicroEdition-Profile: MIDP-2.0
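
Since the Service Name has to match MIDlet-Name exactly, it can help to check the .jad before touching CUCM. The following is only a sketch: parse_jad and check_jad are hypothetical helper names, and the jar size check is my own addition, based on the assumption that the installer expects MIDlet-Jar-Size to equal the real size of the .jar (it does match the 6326 bytes in the web server log above).

```python
def parse_jad(text):
    """Parse 'Key: value' lines of a JAD descriptor into a dict."""
    attrs = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        attrs[key.strip()] = value.strip()
    return attrs


def check_jad(attrs, service_name, jar_size):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    if attrs.get("MIDlet-Name") != service_name:
        problems.append("Service Name does not match MIDlet-Name exactly")
    if attrs.get("MIDlet-Jar-Size") != str(jar_size):
        problems.append("MIDlet-Jar-Size does not match the size of the .jar")
    return problems


jad = """\
MIDlet-1: DeviceSpecifics,,com.cisco.sdk.specifics.PlatformMIDlet
MIDlet-Jar-Size: 6326
MIDlet-Jar-URL: DeviceSpecifics.jar
MIDlet-Name: DeviceSpecifics
MIDlet-Vendor: Cisco Systems, Inc.
MIDlet-Version: 0.0.1
"""
attrs = parse_jad(jad)
print(check_jad(attrs, "DeviceSpecifics", 6326))  # matching name and size
print(check_jad(attrs, "devicespecifics", 6326))  # case mismatch is reported
```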

 

7926g-service-configuration

Install MIDlet
Finally we tell the phone to download the MIDlet. From the upper right of the Phone Configuration page, choose:
Related Links
–> Subscribe Unsubscribe Services
—–> Go

7926g-subscribed-ip-phone

You should now have the DeviceSpecifics MIDlet installed on your 7925 or 7926 IP phone.

Posted in Code, Routing | Leave a comment

Hacking the 7926G MIDlet Development

Requirements
The hospital is forming our plan to handle closed loop medication administration. What that means to a patient is that their armband will be scanned to verify who they are, then the medications will be scanned to verify what they are, then the medications will be administered. We are debating what hardware to purchase and how many items. This fall and winter we are due to upgrade our nurse call phones and the 7926G has become the front runner. The nurses carry it with them everywhere they go, and they are well cared for devices.

My boss came up with the idea of how to use the phones with our current crop of tablets. There will be a barcode on every computer with its name. The nurse will scan the barcode of their computer to associate their phone to the computer, then any scanned data will be sent to that associated computer. My boss wrote the listener for the Windows tablets and I wrote the code for the 7926G.

Below is an approximation of what I drew out on a sticky note. The goal was for the PC team to be able to trouble shoot the phone scanning program quickly and easily without intervention from third level help desk.
7926g-requirements

Test Server
In order to test I built this simple server to communicate with the phone while my boss worked on the Windows receiver.

cat SimpleServer.pl
#!/usr/bin/perl

use strict;
use warnings;

package SimpleServer;

use base qw(Net::Server);

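sub process_request
{
    # Net::Server attaches STDIN and STDOUT to the client connection.
    while (<STDIN>) {
        s/\r?\n$//;                      # strip the line ending
        print STDERR "Received [$_]\n";  # log what the client sent
        print qq([OK]\n);                # acknowledge every line
        last if /quit/i;                 # hang up when the client sends "quit"
    }
}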

SimpleServer->run(port => 3000);

Scanner MIDlet
You can download my source code here.

Creating a MIDlet project in NetBeans is left as an exercise for the reader, however, in order to be able to test the scanner on the emulator you need to add the shim to the project resources.

7926g-resources

In order to deploy the MIDlet, do not include the libraries for the scanner shim.

7926g-properties-libraries

Below are screen shots from the Scanner MIDlet.

The opening screen, where a nurse scans the computer that they want to send data to.
7926g-initial-screenshot

The result of a scanned barcoded IP address.
7926g-scan-result

The Help screen.
7926g-help-screen

For trouble shooting purposes I added a form for manual entry of the target PC IP address or hostname.
7926g-manual-entry

Also for trouble shooting purposes here is an alert with the phone’s IP address.
7926g-phone-ip-address

More trouble shooting, this time the target PC IP address.
7926g-pc-ip-address

Finally, there is checking to make sure that the target PC is actually listening. I catch the error and display this message for the end user.
7926g-pc-error

This was a fun project, so much so that I spent more than a couple weekends working on it. I would like to thank a number of people from Cisco and our partner. Thanks to David Staudt, Riley Marsh, Louis Bell, Tony Godwin, Jim Stewart, Mike Hamblett, Jim Hooker and Conrad Price. They worked as a team to make sure I was able to develop this application quickly.


Congestion Management

First a comment on the structuring of my notes. In order to make them more legible I have started using headings for each section and then bolding the subsections. I believe it makes it easier to read and I do go back and use these notes to study for the actual test.

Queuing Concepts

A queue organizes packets waiting to exit an interface. The size of the queue affects delay, jitter and loss.

  • A longer queue decreases the chance of tail drop but increases average delay and typically increases jitter as well.
  • A shorter queue increases the chance of tail drop but decreases the average delay and typically decreases jitter.
  • If the congestion is sustained for long periods of time drops will be just as likely no matter the queue length.
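
The trade-off between drops and delay can be seen in a toy simulation. This is a sketch under simplifying assumptions that are mine, not the book's: time moves in ticks, exactly one packet is sent per tick, and simulate_fifo is a hypothetical name.

```python
from collections import deque


def simulate_fifo(arrivals, max_len):
    """Simulate a FIFO queue with tail drop, one packet sent per tick.

    arrivals[t] is the number of packets arriving at tick t.
    Returns (dropped, average_delay_in_ticks).
    """
    queue = deque()
    dropped = 0
    delays = []
    tick = 0
    while tick < len(arrivals) or queue:
        for _ in range(arrivals[tick] if tick < len(arrivals) else 0):
            if len(queue) < max_len:
                queue.append(tick)      # remember the arrival tick
            else:
                dropped += 1            # tail drop: the queue is full
        if queue:
            delays.append(tick - queue.popleft())
        tick += 1
    average = sum(delays) / len(delays) if delays else 0.0
    return dropped, average


burst = [8, 0, 0, 0, 0, 0, 0, 0]        # one burst of 8 packets, then silence
print(simulate_fifo(burst, max_len=4))  # short queue: (4, 1.5)
print(simulate_fifo(burst, max_len=8))  # long queue:  (0, 3.5)
```

The short queue drops half of the burst but the packets that survive wait 1.5 ticks on average. The long queue drops nothing and the average wait rises to 3.5 ticks.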

Hardware Queue or TX Ring

If space is available in the hardware queue no output queuing is performed on a packet. It is only with congestion on the hardware queue that software queues are used.

  • Hardware queues always perform FIFO scheduling and cannot be changed.
  • The hardware queue uses one single queue per interface.
  • IOS shortens the hardware queue automatically when a software queue is applied.
  • The hardware queue length can be configured to a different value.

The command show controllers interface shows information about the hardware queue.

R4#sh controllers s0/0/0
Interface Serial0/0/0
Hardware is GT96K
DTE V.35 TX and RX clocks detected.
idb at 0x65EF56B4, driver data structure at 0x65EFCE60
wic_info 0x65EFD484
Physical Port 0, SCC Num 0
MPSC Registers:
MMCR_L=0x000304C0, MMCR_H=0x00000000, MPCR=0x00000000
CHR1=0x00FE007E, CHR2=0x00000000, CHR3=0x0000064A, CHR4=0x00000000
CHR5=0x00000000, CHR6=0x00000000, CHR7=0x00000000, CHR8=0x00000000
CHR9=0x00000000, CHR10=0x00003008
SDMA Registers:
SDC=0x00002201, SDCM=0x00000080, SGC=0x0000C000
CRDP=0x160FEA50, CTDP=0x160FECD0, FTDB=0x160FECD0
Main Routing Register=0x0003FFC0 BRG Conf Register=0x00480000
Rx Clk Routing Register=0x76543288 Tx Clk Routing Register=0x76543219
GPP Registers:
Conf=0x43030002, Io=0x46064250, Data=0x7B7BBDA9, Level=0x180000  
Conf0=0x43030002, Io0=0x46064250, Data0=0x7B7BBDA9, Level0=0x180000  
0 input aborts on receiving flag sequence
0 throttles, 0 enables
0 overruns
0 transmitter underruns
0 transmitter CTS losts
23 rxintr, 28 txintr, 0 rxerr, 0 txerr
52 mpsc_rx, 0 mpsc_rxerr, 0 mpsc_rlsc, 6 mpsc_rhnt, 47 mpsc_rfsc
6 mpsc_rcsc, 0 mpsc_rovr, 0 mpsc_rcdl, 0 mpsc_rckg, 0 mpsc_bper
0 mpsc_txerr, 29 mpsc_teidl, 0 mpsc_tudr, 0 mpsc_tctsl, 0 mpsc_tckg
0 sdma_rx_sf, 0 sdma_rx_mfl, 0 sdma_rx_or, 0 sdma_rx_abr, 0 sdma_rx_no
0 sdma_rx_de, 0 sdma_rx_cdl, 0 sdma_rx_ce, 0 sdma_tx_rl, 0 sdma_tx_ur, 0 sdma_tx_ctsl
0 sdma_rx_reserr, 0 sdma_tx_reserr
0 rx_bogus_pkts, rx_bogus_flag FALSE 
0 sdma_tx_ur_processed

tx_limited = 0(128), errata19 count1 - 0, count2 - 0

In the above listing, see the line tx_limited = 0(128), errata19 count1 – 0, count2 – 0 at the bottom of the output. This hardware queue holds 128 packets and the 0 means the queue size is not limited by a queuing tool on this interface.

Enable priority queuing to change the hardware queue length.

R4#   
R4#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
R4(config)#int s0/0/0
R4(config-if)#priority-group 1
R4(config-if)#do sh controllers s0/0/0
Interface Serial0/0/0
Hardware is GT96K

Output removed for brevity.

tx_limited = 1(2), errata19 count1 - 0, count2 - 0

After enabling priority queuing with the priority-group command you can see that the new length of the hardware queue is (2) and the 1 means the length is limited as a result of queuing being configured.

The hardware queue length can be changed with the tx-ring-limit x command as seen below, this was done with the priority queue still active.

R4(config-if)#tx-ring-limit 50
R4(config-if)#do sh controllers s0/0/0
Interface Serial0/0/0
Hardware is GT96K

Output removed for brevity.

tx_limited = 1(50), errata19 count1 - 0, count2 - 0
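
If you collect show controllers output from several routers, the tx_limited field is easy to pull out with a regular expression. A minimal sketch, assuming the line always has the flag(length) shape seen in the three listings above (parse_tx_limited is a hypothetical helper, not a Cisco tool):

```python
import re

TX_LIMITED = re.compile(r"tx_limited\s*=\s*(\d)\((\d+)\)")


def parse_tx_limited(show_controllers_output):
    """Return (limited_by_queuing_tool, tx_ring_length), or None if not found."""
    match = TX_LIMITED.search(show_controllers_output)
    if match is None:
        return None
    return match.group(1) == "1", int(match.group(2))


print(parse_tx_limited("tx_limited = 0(128), errata19 count1 - 0, count2 - 0"))
print(parse_tx_limited("tx_limited = 1(2), errata19 count1 - 0, count2 - 0"))
print(parse_tx_limited("tx_limited = 1(50), errata19 count1 - 0, count2 - 0"))
```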

Queuing on Interfaces, Subinterfaces and Virtual Circuits

Traffic is not even placed in a software queue unless the hardware queue is full. However, traffic shaping can cause shaping queues to fill even when there is no congestion on the physical interface. In effect traffic shaping on the subinterfaces creates congestion between the shaping queues and the physical interface software queues. On a physical interface traffic can only leave at the speed of the physical clock rate; similarly, packets can only leave a shaping queue at the traffic-shaping rate.
QoS p.263

Scheduling Concepts

For the test we just need to know the basic concepts of FIFO, PQ, CQ and MDRR.

FIFO
FIFO uses tail drop to decide when to drop or enqueue packets. As above, the same holds true for FIFO, a longer queue decreases the chance of tail drop but increases average delay and typically increases jitter as well. A shorter queue increases the chance of tail drop but decreases the average delay and typically decreases jitter.

Configuring FIFO actually requires you to turn off all other types of queuing.

The interface is still using priority queuing from my example above.

R4(config-if)#do sh int s0/0/0
Serial0/0/0 is up, line protocol is up 
  Hardware is GT96K Serial
  MTU 1500 bytes, BW 1544 Kbit/sec, DLY 20000 usec, 
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation FRAME-RELAY, loopback not set
  Keepalive set (10 sec)
  CRC checking enabled
  LMI enq sent  206, LMI stat recvd 206, LMI upd recvd 0, DTE LMI up
  LMI enq recvd 0, LMI stat sent  0, LMI upd sent  0
  LMI DLCI 1023  LMI type is CISCO  frame relay DTE
  FR SVC disabled, LAPF state down
  Broadcast queue 0/64, broadcasts sent/dropped 4/0, interface broadcasts 1
  Last input 00:00:00, output 00:00:00, output hang never
  Last clearing of "show interface" counters 00:34:20
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
  Queueing strategy: priority-list 1
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  Output queue (queue priority: size/max/drops):
     high: 0/20/0, medium: 0/40/0, normal: 0/60/0, low: 0/80/0
  5 minute input rate 0 bits/sec, 0 packets/sec
  5 minute output rate 0 bits/sec, 0 packets/sec
     206 packets input, 5846 bytes, 0 no buffer
     Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
     210 packets output, 3736 bytes, 0 underruns
     0 output errors, 0 collisions, 8 interface resets
     0 unknown protocol drops
     0 output buffer failures, 0 output buffers swapped out
     0 carrier transitions
     DCD=up  DSR=up  DTR=up  RTS=up  CTS=up

So I remove the priority queue and you can see that the interface is back to default, weighted fair queuing.

R4(config-if)#no priority-group 1     
R4(config-if)#do sh int s0/0/0   
Serial0/0/0 is up, line protocol is up 
  Hardware is GT96K Serial

Output removed for brevity.

  Queueing strategy: weighted fair
  Output queue: 0/1000/64/0 (size/max total/threshold/drops)

To make the interface FIFO queuing I have to remove WFQ.

R4(config-if)#no fair-queue 
R4(config-if)#do sh int s0/0/0
Serial0/0/0 is up, line protocol is up 
  Hardware is GT96K Serial

Output removed for brevity.

  Queueing strategy: fifo
  Output queue: 0/40 (size/max)

You can change the default queue length with the command hold-queue x out.

R4(config-if)#hold-queue 20 out
R4(config-if)#do sh int s0/0/0 
Serial0/0/0 is up, line protocol is up 
  Hardware is GT96K Serial

Output removed for brevity.

  Queueing strategy: fifo
  Output queue: 0/20 (size/max)

Priority Queuing (PQ)
With priority queuing the highest priority queues are always serviced first. There are four queues: High, Medium, Normal and Low. If the High queue has a packet it is serviced; if not, the Medium queue is serviced, and so on down to the Low priority queue. The process always starts back at the High queue. As a result the lower priority queues get starved. This fact makes it an unpopular queuing choice.
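
The scheduling rule fits in a few lines. This is an illustrative sketch only (pq_dequeue and the packet labels are made up): every dequeue starts again at the High queue, so a Low packet leaves only when the three queues above it are empty.

```python
from collections import deque

PRIORITIES = ("high", "medium", "normal", "low")


def pq_dequeue(queues):
    """Return (queue_name, packet) from the highest non-empty queue, else None."""
    for name in PRIORITIES:
        if queues[name]:
            return name, queues[name].popleft()
    return None


queues = {name: deque() for name in PRIORITIES}
queues["low"].append("l1")              # arrived first, but lowest priority
queues["high"].extend(["h1", "h2"])
queues["medium"].append("m1")

order = []
while True:
    item = pq_dequeue(queues)
    if item is None:
        break
    order.append(item)
print(order)
```

The Low packet goes out last even though it arrived first. If High packets kept arriving it would never go out at all.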

Custom Queuing (CQ)
Custom queuing addresses the largest drawback of PQ by servicing all queues, even during congestion. It has 16 queues available, implying 16 classification categories. It does not have the option to service one queue first; instead it does round-robin service on each queue, beginning with the first queue. CQ takes packets from a queue until the total byte count specified for that queue has been met or exceeded. After that queue has been serviced or has no more packets, CQ moves on to the next queue and repeats the process.

The CQ scheduler essentially guarantees the minimum bandwidth for each queue, while allowing queues to have more bandwidth under the right conditions. If 5 queues have been configured with byte counts of 5,000, 5,000, 10,000, 10,000 and 20,000 for queues 1 through 5, the percentage of bandwidth given to each queue is 10, 10, 20, 20 and 40 percent. But if queue 4 has no traffic over a short period of time, the CQ scheduler moves on to the next queue. Only queues 1-3 and 5 have packets waiting, so the distribution changes: the queues receive 12.5, 12.5, 25, 0 and 50 percent of the bandwidth.
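
The percentages are just each queue's byte count divided by the sum of the byte counts of the queues that have traffic. A small sketch of that arithmetic (cq_shares is a hypothetical name; real CQ overshoots the byte count by up to a packet, which this ignores):

```python
def cq_shares(byte_counts, active=None):
    """Percentage of bandwidth per queue, given the CQ byte counts.

    active is an optional list of booleans; idle queues get 0 percent and the
    remaining queues share the link in proportion to their byte counts.
    """
    if active is None:
        active = [True] * len(byte_counts)
    total = sum(b for b, a in zip(byte_counts, active) if a)
    if total == 0:
        return [0.0] * len(byte_counts)
    return [100.0 * b / total if a else 0.0 for b, a in zip(byte_counts, active)]


counts = [5000, 5000, 10000, 10000, 20000]
print(cq_shares(counts))                                          # all queues busy
print(cq_shares(counts, active=[True, True, True, False, True]))  # queue 4 idle
```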

The queues are numbered not named and no queue gets better service than another.
QoS p.270

Modified Deficit Round-Robin (MDRR)
MDRR is similar to the CQ scheduler in that it reserves a percentage of link bandwidth for a particular queue. MDRR removes packets from a queue until the quantum value (QV) for that queue has been removed. MDRR repeats the process for every queue in order from 0 through 7. Any extra bytes sent during this process are treated as a deficit and subtracted from the QV for the next pass. As a result MDRR provides an exact bandwidth reservation.
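
The deficit bookkeeping is what makes the reservation exact over time. Below is a simplified sketch of one round-robin pass as I understand it from the description above. It leaves out MDRR's priority queue, and mdrr_pass plus the packet sizes and quantum values are made up for the example.

```python
from collections import deque


def mdrr_pass(queues, quantum, deficit):
    """Run one round-robin pass; return a list of (queue_index, bytes) sent.

    queues  : list of deques holding packet sizes in bytes
    quantum : list of quantum values (QV) in bytes, one per queue
    deficit : list of per-queue counters, updated in place
    """
    sent = []
    for i, queue in enumerate(queues):
        if not queue:
            deficit[i] = 0              # an empty queue carries nothing over
            continue
        deficit[i] += quantum[i]        # credit this pass's quantum
        while queue and deficit[i] > 0:
            size = queue.popleft()
            deficit[i] -= size          # may go negative: that is the deficit
            sent.append((i, size))
    return sent


queues = [deque([1500] * 4), deque([500] * 6)]
quantum = [2000, 1000]
deficit = [0, 0]
bytes_sent = [0, 0]
for _ in range(3):
    for index, size in mdrr_pass(queues, quantum, deficit):
        bytes_sent[index] += size
print(bytes_sent)                       # [6000, 3000]
```

On the first pass queue 0 sends 3,000 bytes against a quantum of 2,000, so it starts the second pass 1,000 bytes in debt. After three passes the totals are 6,000 and 3,000 bytes, exactly the 2:1 ratio of the quantum values.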

Concepts and Configuration WFQ, CBWFQ and LLQ

Weighted Fair Queuing (WFQ)
WFQ does not allow classification options to be configured. It classifies packets based on flow. A flow consists of all packets that have the same source and destination IP address, and the same source and destination port numbers. WFQ also favors low-volume, higher-precedence flows over large-volume, lower-precedence flows. Each flow uses a different queue, up to a maximum of 4096 queues per interface. It also uses modified tail-drop. WFQ may be the most deployed QoS tool on Cisco routers so take your time on this section.

WFQ can be seen as being too fair, with many flows WFQ will give some bandwidth to every flow. WFQ is also a poor choice for voice and interactive video traffic because both need low delay and low jitter. By being too fair it can starve voice and video.

Flows
Flows are identified by six items in a packet.

  • Source IP address
  • Destination IP address
  • Transport layer (TCP or UDP)
  • Source port
  • Destination port
  • IP Precedence

Flows are considered to exist only as long as packets from the flow are enqueued. If there is a break in traffic and no packets are in the queue, the queue is removed. The show queue command shows WFQ’s view of the flows.

WFQ Scheduler
WFQ has two goals:
1. To provide fairness among the existing flows, giving each flow an equal amount of bandwidth. With each flow receiving the same bandwidth lower volume flows prosper while higher volume flows suffer.
2. Provide more bandwidth to flows with higher IP precedence values. The “weight” of WFQ is based on precedence. WFQ provides a fair share of the link bandwidth based on each flow’s precedence, plus one. Precedence 7 flows get 8 times more bandwidth than precedence 0 flows because (7+1)/(0+1) = 8.
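
The "precedence plus one" rule can be written out directly. A small sketch (wfq_shares is a hypothetical name, and it assumes every flow always has packets waiting):

```python
from fractions import Fraction


def wfq_shares(precedences):
    """Fraction of link bandwidth per flow, proportional to precedence + 1."""
    parts = [p + 1 for p in precedences]
    total = sum(parts)
    return [Fraction(part, total) for part in parts]


shares = wfq_shares([7, 0])
print(shares)                   # 8/9 and 1/9 of the link
print(shares[0] / shares[1])    # 8
```

With equal precedence every flow gets the same share, which is the first goal. A precedence 7 flow next to a precedence 0 flow gets 8/9 of the link, which is the second.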

When adding packets to the hardware queue, WFQ takes the packet with the lowest sequence number (SN) among all of the queues or flows.

SN calculation:
SN = Previous_SN + (weight * new_packet_length)
Weight = 32,384 / (IP_Precedence + 1)

SN = Previous_SN + ((32,384 / (IP_Precedence + 1)) * new_packet_length)

WFQ calculates the SN before adding a packet to its queue and even before the decision is made to drop the packet, because the drop decision is based on modified tail drop. The formula considers the length of the packet, the weight of the flow, and the previous SN. By considering the packet length, the formula calculates a higher SN for larger packets and a lower SN for smaller packets. By including the SN of the previous packet, the formula assigns a larger SN to queues that already have a number of packets enqueued.
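
Here is the formula as a short Python sketch. The function names are mine, and two things are assumptions: integer division is used because it reproduces the weights listed in this section, and the example flows start with a Previous_SN of 0 to keep the numbers simple.

```python
def wfq_weight(precedence):
    """Weight = 32384 / (IP_Precedence + 1), truncated to an integer."""
    return 32384 // (precedence + 1)


def next_sn(previous_sn, precedence, packet_length):
    """SN = Previous_SN + (weight * new_packet_length)."""
    return previous_sn + wfq_weight(precedence) * packet_length


# Two flows, each receiving one 100-byte packet.
sn_prec0 = next_sn(0, precedence=0, packet_length=100)
sn_prec5 = next_sn(0, precedence=5, packet_length=100)
print(sn_prec0, sn_prec5)       # 3238400 539700

# A second packet in the precedence 5 flow builds on the previous SN.
print(next_sn(sn_prec5, precedence=5, packet_length=100))   # 1079400
```

The precedence 5 packet gets the lower SN, so it is sent first. Even the second packet in that flow still beats the single precedence 0 packet.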

WFQ always weights the packets based on the first 3 bits of the ToS byte, the Precedence field.

The larger the precedence value, the lower the weight, making the SN smaller and therefore favoring that flow over another with a lower precedence value.

Precedence Weight
0 32384
1 16192
2 10794
3 8096
4 6476
5 5397
6 4626
7 4048

WFQ Drop Policy, Number of Queues and Queue Length
WFQ places an absolute limit on the number of packets in all queues called the hold-queue limit. If a new packet arrives for any queue and the hold-queue limit has been reached, the packet is discarded.

WFQ also places a limit on individual queues called the congestive discard threshold (CDT). If an individual queue’s CDT has been reached, WFQ looks for a packet with a higher calculated SN in all of the queues. If a packet with a higher SN is found, it is discarded and the new packet is enqueued; otherwise the new packet is discarded.
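
The two limits can be sketched as one decision function. This is my reading of the rules above, not IOS code: wfq_enqueue and its return strings are made up, and each queue is modeled as a plain list of the SNs already enqueued for that flow.

```python
def wfq_enqueue(queues, flow, new_sn, hold_queue_limit, cdt):
    """Apply WFQ's modified tail drop; return what happened to the new packet.

    queues maps a flow id to the list of SNs already enqueued for that flow.
    """
    total = sum(len(sns) for sns in queues.values())
    if total >= hold_queue_limit:
        return "dropped-hold-queue"     # absolute limit across all queues

    own = queues.setdefault(flow, [])
    if len(own) >= cdt:
        # CDT reached: look for the highest SN enqueued in any queue.
        candidates = [(f, sn) for f, sns in queues.items() for sn in sns]
        victim = max(candidates, key=lambda pair: pair[1], default=None)
        if victim is None or victim[1] <= new_sn:
            return "dropped-new"        # nothing with a higher SN to evict
        queues[victim[0]].remove(victim[1])
        own.append(new_sn)
        return "replaced"               # evicted the higher SN, kept the new one

    own.append(new_sn)
    return "enqueued"


queues = {"a": [100, 200], "b": [900]}
print(wfq_enqueue(queues, "a", 300, hold_queue_limit=10, cdt=2))   # replaced
print(wfq_enqueue(queues, "a", 400, hold_queue_limit=10, cdt=2))   # dropped-new
print(wfq_enqueue(queues, "c", 50, hold_queue_limit=3, cdt=2))     # dropped-hold-queue
print(wfq_enqueue(queues, "c", 50, hold_queue_limit=10, cdt=2))    # enqueued
```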

WFQ also keeps eight hidden queues for overhead traffic generated by the router. WFQ uses a very low weight for these hidden queues in order to give precedence to overhead traffic.

WFQ Configuration
IOS uses WFQ by default on all serial interfaces with bandwidths set at T1 or E1 speeds and below. To turn on WFQ use the command fair-queue.
To change the hold queue of an interface use the command hold-queue x out.

Class-Based WFQ (CBWFQ)
CBWFQ uses MQC to classify traffic so anything you can match with MQC you can match with CBWFQ. It can reserve a minimum amount of bandwidth for each queue and can give an actual percentage of the bandwidth.

CBWFQ supports both tail drop and WRED. There are 64 queues available in CBWFQ and WRED can be enabled on any of them. WRED works well for less drop-sensitive traffic such as data but is not a good choice for voice and video.

If a packet is not classified in CBWFQ it goes to the class-default queue. Inside the class-default queue CBWFQ can use either FIFO or WFQ. With WFQ it uses the SN calculation within that queue just like WFQ normally does. Using WFQ in the default class is an advantage for CBWFQ because WFQ treats low-volume flows well and they are likely to be interactive traffic. So with CBWFQ, classify the traffic you know and reserve the proper bandwidth for it. Traffic you cannot characterize defaults into the class-default queue, where WFQ dynamically applies fairness.

The CBWFQ scheduler gives a percentage of bandwidth to each class based on the configured values, although the algorithm is not published.

Delay and jitter sensitive traffic still suffer with CBWFQ because other queues can still be serviced while those packets wait.

CBWFQ Configuration
This is an example taken from QoS p. 298.

    The criteria:

  • All VOIP payload traffic has been marked with DSCP EF and is placed in a queue.
  • All other traffic has been marked with DSCP BE and is placed in a different queue.
  • Give the VOIP traffic 58 kbps of bandwidth on the link.
  • Use WRED and WFQ on the non-VOIP traffic.

class-map match-all voip-rtp
	match ip rtp 16384 16383

class-map match-all dscp-ef
	match ip dscp ef

! This is the input policy-map.
policy-map voip-be
	class voip-rtp
		set ip dscp ef
	class class-default
		set ip dscp 0

! This is the output policy-map.
policy-map queue-on-dscp
	class dscp-ef
		bandwidth 58
		queue-limit 30
	class class-default
		! WRED
		random-detect dscp-based
		! WFQ
		fair-queue

interface Ethernet 0/0
	service-policy input voip-be

interface serial 0/0
	service-policy output queue-on-dscp

Low Latency Queueing
LLQ is an option of CBWFQ applied to one or more classes. CBWFQ treats these classes as strict priority and always services them first whenever a packet is waiting. Therefore, if you use CBWFQ with the priority command, you have enabled LLQ. This overcomes the biggest drawback of CBWFQ: a packet with a lower SN but less latency sensitivity being sent ahead of delay-sensitive traffic. With LLQ, priority queues are serviced first while bandwidth is still guaranteed for traffic in other queues.

LLQ actually polices the priority queue based on the configured bandwidth. The packets in the PQ still have low latency, but LLQ prevents that queue from consuming more than its configured amount. The policing function of LLQ protects the other queues from the priority queue, discarding packets when needed.
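
One way to picture that policing is a token bucket. The Python sketch below is only an illustration of the idea, not the IOS implementation; the class name TokenBucketPolicer, the 1500-byte burst and the packet sizes are assumptions I picked for the example.

```python
class TokenBucketPolicer:
    """Single-rate token bucket; illustrative only, not the IOS algorithm."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate_bytes_per_sec = rate_bps / 8.0
        self.burst_bytes = burst_bytes
        self.tokens = float(burst_bytes)  # the bucket starts full
        self.last_time = 0.0

    def conforms(self, packet_bytes, now):
        """Return True if the packet may be sent, False if it is dropped."""
        elapsed = now - self.last_time
        self.last_time = now
        # Refill for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_sec)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False


# Hypothetical numbers: a 58 kbps priority class with a 1500-byte burst.
policer = TokenBucketPolicer(rate_bps=58000, burst_bytes=1500)

# Five 500-byte packets arrive at the same instant; only the burst fits.
results = [policer.conforms(500, now=0.0) for _ in range(5)]
print(results)
```

Packets that exceed the bucket are discarded, which is what keeps the priority class from starving the other queues.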

LLQ Configuration
Configuration of LLQ is similar to that of CBWFQ, but instead of using the bandwidth command, use the priority command. The priority command sets the guaranteed minimum bandwidth as well as the maximum bandwidth.

Please note: the example below is based on QoS p. 308, but I have made considerable changes in my answer. If it is not correct, I am to blame, not the authors.

    The criteria:

  • R3’s S0/0 is clocked at 128 kbps and is the output interface.
  • R3’s input interface is Ethernet 1/0.
  • VOIP payload is marked with DSCP EF, and placed in its own queue, using tail drop. This class gets 58 kbps and is the LLQ.
  • NetMeeting voice and video from Server1 to Client1 is marked with DSCP AF41, and placed in its own queue, using tail drop. It gets 22 kbps.
  • Any HTTP traffic with “important” in the URL is marked with AF21 and placed in its own queue. The class gets 29 kbps.
  • Any HTTP traffic with “not-so” in the URL is marked with AF23 and placed in its own queue. The class gets 8 kbps.
  • All other traffic is marked with DSCP BE and placed in its own queue with WRED and WFQ. This class gets the remaining 20 kbps.
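
As a quick sanity check of the numbers as I wrote them down, the Python sketch below just adds up the per-class bandwidths and compares them to the 128 kbps clock rate. The dictionary keys are labels I chose for the example; this is plain arithmetic, not anything IOS runs.

```python
# Class bandwidths in kbps, copied from the criteria above.
class_kbps = {
    "voip (LLQ)": 58,
    "netmeeting": 22,
    "important": 29,
    "not-so": 8,
    "everything else": 20,
}
link_kbps = 128  # clock rate of R3's S0/0

total_kbps = sum(class_kbps.values())
print(f"configured total: {total_kbps} kbps, link: {link_kbps} kbps")
print(f"difference: {total_kbps - link_kbps} kbps")
```

The listed values add up to more than the link rate, so one of the figures in my criteria may be a transcription slip.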

You can have multiple low-latency queues in a single policy map, and with multiple LLQs each class is policed at its configured rate. This gives you more granularity in what you police.

! All of this is to classify incoming traffic.
! ip cef is for NBAR
ip cef 
class-map match-all dscp-ef
	match ip dscp ef

class-map match-any dscp-af41
	match ip rtp 16384 16383
	match access-group 101

class-map match-all important
	match protocol http url "*important*"

class-map match-all not-so
	match protocol http url "*not-so*"

policy-map incoming-traffic
	class dscp-ef
		set dscp ef
	class dscp-af41
		set dscp af41
	class important
		set dscp af21
	class not-so
		set dscp af23
	class class-default
		! Mark everything else best effort; queueing belongs in the output policy.
		set dscp default

policy-map outgoing-traffic
	class dscp-ef
		priority 58
	class dscp-af41
		bandwidth 22
	class important
		bandwidth 29
	class not-so
		bandwidth 8
	class class-default
		! This bandwidth command not needed.  
		bandwidth 20
		random-detect dscp-based
		fair-queue

interface ethernet 1/0
	! Output omitted for brevity.
	ip nbar protocol-discovery
	service-policy input incoming-traffic

interface serial 0/0
	! Output omitted for brevity.
	bandwidth 128
	service-policy output outgoing-traffic

Classification and Marking


  • Classify and mark as close to the ingress edge as possible.
  • Mark or re-mark traffic when it reaches a trusted device in the network.
  • Only two IP QoS marking fields are carried end to end: Precedence and DSCP. Mark one of these fields so that downstream devices can classify on the marking, which reduces classification overhead.

Class-Based Marking
Service classes are different types of traffic that have been marked to receive better or worse service. Class-based marking (CB-Marking) examines the packet and classifies it into a service class.
Fields that can be examined for classification:

  • IP ACLs
  • Any markable fields
  • Input interface
  • MAC Addresses
  • All NBAR-enabled fields

Fields that can be marked:

  • IP Precedence
  • DSCP
  • 802.1P CoS
  • ISL Priority
  • ATM CLP
  • Frame Relay DE
  • MPLS Experimental
  • QoS Group

You can mark the Precedence and DSCP fields with any valid binary value of either 3 or 6 bits respectively. Precedence fields should grow in importance and QoS behavior as the number gets higher. DSCP differs in that the larger number does not always get better QoS treatment.

Marking
Marking happens primarily in the CoS, ToS, Precedence and DSCP fields.

IP ToS Byte — The 1 byte field in the IP header that was originally defined for QoS in RFC 791 released in 1981. It includes a 3 bit Precedence field and 4 ToS bits. p. 117 QoS

IP Precedence — Bits 0,1,2 of the IP ToS byte as defined by RFC 791.

ToS Field — Bits 3, 4, 5, 6 of the IP ToS byte. RFC 791 defined bits 3 through 5; RFC 1349 later added bit 6.

LAN CoS — Layer 2 marking. Refers to one of two fields: one inside the 802.1Q trunking header, the other inside the ISL header. 802.1Q or ISL trunking must be enabled for the CoS field to exist. As soon as the packet is Layer 3 forwarded, either by a router or a Layer 3 switch, the old LAN header is discarded and the CoS field with it. p. 201 QoS
ISL — Uses the 3 least significant bits.
802.1Q — Uses the 3 most significant bits.

IP DSCP — Contained in the first 6 bits of the DS field in the IP header, which replaced the ToS byte. DiffServ defines 8 class selector DSCP values for backward compatibility with IP precedence.
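
The backward compatibility is easy to see in code: a class selector value is just the 3 precedence bits followed by three zero bits. A small Python sketch (the function name class_selector_dscp is mine, made up for the example):

```python
def class_selector_dscp(precedence):
    """Return the DSCP value of class selector CS<precedence>."""
    if not 0 <= precedence <= 7:
        raise ValueError("IP precedence is a 3-bit value (0-7)")
    # Keep the 3 precedence bits and zero the 3 low-order DSCP bits.
    return precedence << 3


cs_values = {f"CS{p}": class_selector_dscp(p) for p in range(8)}
for name, dscp in cs_values.items():
    print(f"{name}: decimal {dscp:2d}, binary {dscp:06b}")
```

A device that only looks at the first 3 bits still sees the original precedence value.
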

Cisco Recommended Values for Marking

Traffic Type           CoS  Precedence  DSCP
Voice Payload          5    5           EF
Video Payload          4    4           AF41
Voice/Video Signaling  3    3           CS3
Mission Critical Data  3    3           AF31, AF32, AF33
Transactional Data     2    2           AF21, AF22, AF23
Bulk Data              1    1           AF11, AF12, AF13
Best Effort            0    0           BE
Scavenger              0    0           2, 4, 6

The order of the class commands inside the policy-map is important. Each packet is compared to each class’s matching criteria in order, and once the first match is made the packet is considered to be in that class. So the order of the classes impacts the logic of the policy-map. Order also matters with regard to CPU cycles: if the last class matches most of the traffic, most packets are compared against every class before they match, which costs more CPU.
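
The first-match behavior can be sketched in a few lines of Python. This is only a model of the logic described above; the class names, the packet dictionary and the port test are hypothetical and chosen for the example.

```python
def classify(packet, classes):
    """Return the name of the first class whose test matches the packet."""
    for name, matches in classes:
        if matches(packet):
            return name
    return "class-default"


# Each entry is (class name, match test); the list order is the policy order.
classes = [
    ("voip-rtp", lambda p: p["proto"] == "udp" and 16384 <= p["dport"] <= 32767),
    ("all-udp", lambda p: p["proto"] == "udp"),
]

voice = {"proto": "udp", "dport": 16500}
print(classify(voice, classes))                  # narrow class listed first
print(classify(voice, list(reversed(classes))))  # broad class listed first
```

With the broad class listed first, the voice packet never reaches the more specific class.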

Class-map names are case sensitive. It is recommended to stick with the same style or naming convention, such as all lower case, ALL UPPER CASE or HumpBack.

The class-map command has two optional keywords after the name of the class-map, match-all and match-any, because you can use multiple match statements. The match-all keyword is the default when nothing is specified.
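
In other words, match-all is a logical AND of the match statements and match-any is a logical OR. A minimal Python sketch of that idea, where the function name class_matches and the two conditions are assumptions made up for the example:

```python
def class_matches(packet, conditions, mode="match-all"):
    """Combine match tests the way the class-map keywords do."""
    results = [condition(packet) for condition in conditions]
    if mode == "match-all":
        return all(results)  # every match statement must be true
    if mode == "match-any":
        return any(results)  # one true match statement is enough
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical stand-ins for two match statements.
conditions = [
    lambda p: p["dscp"] == 46,        # think: match ip dscp ef
    lambda p: p["in_if"] == "fa0/0",  # think: match input-interface
]

packet = {"dscp": 46, "in_if": "fa0/1"}
print(class_matches(packet, conditions, "match-all"))
print(class_matches(packet, conditions, "match-any"))
```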

Match statements
The number of values IOS will match in a single command depends on the field being matched.

  • The match subcommand under class-map can be used to match up to four IP Precedence values in one command, for instance, match ip precedence 0 1 2.
  • Up to eight DSCP values can be matched with the match ip dscp subcommand.
  • Four CoS values can be matched.

NBAR
NBAR can give statistical information about traffic mix as well as recognition of traffic that uses dynamic ports. When the match protocol command is given, the traffic is being matched by NBAR.

CEF forwarding must be enabled if using NBAR matching inside a policy map.

Configuration
1. Classify packets into service classes using the match command inside an MQC class map.
2. Mark the packets in each service class using the set command inside an MQC policy map.
3. Enable the CB marking logic using the service-policy command under the interface.

A simple example:
All voice traffic should be marked with DSCP EF.
All other traffic should be marked with DSCP default.

ip cef

class-map match-all voip-rtp
	match ip rtp 16384 16383

policy-map voip-and-be
	class voip-rtp
		set dscp ef
	class class-default
		set dscp default

int fa0/0
	service-policy input voip-and-be

The show commands for confirmation and troubleshooting of the configuration:

show policy-map
show policy-map interface

Classification and Marking

The difference between classification and marking is action. Classification tools categorize packets while marking changes packet headers. These tools lay the foundation upon which the rest of QoS is built.

Classification — Performing classification as close to the source as possible is the most efficient use of network resources.

Marking — Marking is performed after classification. How a packet is marked depends upon the layer.

Layer 2 Marking:

  • CoS — Used on ISL or 802.1Q header
  • EXP — MPLS header
  • DE — Frame relay header
  • CLP — ATM cell header

Layer 3 Marking:

  • IP Precedence — RFC 791, first 3 bits of the ToS byte.
  • DSCP IP Header — RFC 2474 and 2475, first 6 bits of the ToS byte.

Layer 2 Class of Service (CoS):

Ethernet frame 802.1Q/P uses the 3 bits from the PRI field, which make up 8 possible values.

CoS  Name                  Application
000  Routine               Best-Effort Data
001  Priority              Medium Priority Data
010  Immediate             High Priority Data
011  Flash                 Call Signaling
100  Flash Override        Video Conferencing
101  Critic/ECP (Critical) Voice Bearer
110  Internetwork Control  Internetwork Control
111  Network Control       Network Control
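
To show where those 3 bits live, here is a small Python sketch that splits the 16-bit 802.1Q Tag Control Information field into its parts. The function name parse_tci and the sample frame values are assumptions for the example.

```python
def parse_tci(tci):
    """Split the 16-bit 802.1Q Tag Control Information field."""
    if not 0 <= tci <= 0xFFFF:
        raise ValueError("TCI is a 16-bit value")
    return {
        "cos": tci >> 13,          # 3 most significant bits (PRI)
        "dei": (tci >> 12) & 0x1,  # 1 bit, formerly called CFI
        "vlan": tci & 0x0FFF,      # 12-bit VLAN ID
    }


# Hypothetical frame: CoS 5 (voice bearer) on VLAN 100.
tci = (5 << 13) | 100
fields = parse_tci(tci)
print(fields)
```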

Frame Relay uses the discard eligible (DE) bit to tell a router whether the frame can be dropped: 1 == discard eligible, 0 == should not be dropped.

ATM cells have the cell loss priority (CLP) field: 1 == discard eligible, 0 == should not be discarded.

Layer 2 1/2:
MPLS packets have the EXP field within the MPLS header, which is compatible with the 3 bit PRI/CoS field of the 802.1Q header. The CoS field can be copied into the MPLS EXP field, or a service provider can set its own EXP value, leaving the customer’s marking intact in the IP header.

Layer 3:
RFC 791 called the 3 most significant bits of the ToS byte the IP Precedence bits. IP Precedence was the predecessor to the Differentiated Services Code Point (DSCP), which uses the 6 most significant bits of the same byte to classify traffic. The remaining two bits of the byte are used for Explicit Congestion Notification (ECN).
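
A short Python sketch makes the bit layout concrete. It reads the same byte three ways: as IP Precedence, as DSCP and as ECN. The function name decode_tos is mine; the sample byte 0xB8 is DSCP EF with ECN cleared.

```python
def decode_tos(tos):
    """Interpret the ToS / DS byte of an IPv4 header."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("the ToS byte is an 8-bit value")
    return {
        "precedence": tos >> 5,  # 3 most significant bits
        "dscp": tos >> 2,        # 6 most significant bits
        "ecn": tos & 0x3,        # 2 least significant bits
    }


ef = decode_tos(0xB8)    # binary 10111000
af21 = decode_tos(0x48)  # binary 01001000
print(ef)
print(af21)
```

The EF byte reads as precedence 5 to a device that only understands IP Precedence, which is the backward compatibility at work.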

DSCP is backward compatible with IP Precedence, however, it has more options for classification.

Because DiffServ does not signal along the path like IntServ, each hop has its own behavior based upon the DSCP. These behaviors are called Per-Hop Behaviors (PHBs).

DSCP defines four PHBs:

  • Class selector PHB — The 3 least significant DSCP bits are set to 000, which provides backward compatibility with ToS-based IP Precedence.
  • Default PHB — All six DSCP bits are set to 000000. This is best effort, and it is also what an unmarked packet receives.
  • Assured Forwarding (AF) PHB — Defines four classes, each with its own queue and reserved bandwidth. When a queue becomes congested, packets are dropped according to their drop precedence rather than by tail drop. A lower AF drop precedence provides better QoS within each AF class. Each entry below is Name / Decimal / Binary.

      Class    Low Drop Probability  Medium Drop Probability  High Drop Probability
      Class 1  AF11 / 10 / 001010    AF12 / 12 / 001100       AF13 / 14 / 001110
      Class 2  AF21 / 18 / 010010    AF22 / 20 / 010100       AF23 / 22 / 010110
      Class 3  AF31 / 26 / 011010    AF32 / 28 / 011100       AF33 / 30 / 011110
      Class 4  AF41 / 34 / 100010    AF42 / 36 / 100100       AF43 / 38 / 100110

  • Expedited Forwarding (EF) PHB — Provides low delay service to packets with the DSCP field set to 101110 or a decimal value of 46.
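
The AF values in the table follow a pattern: for AFxy the decimal DSCP is 8x + 2y, where x is the class and y is the drop precedence. A small Python sketch that regenerates the table (the function name af_dscp is made up for the example):

```python
def af_dscp(af_class, drop_precedence):
    """Return the decimal DSCP for AF<af_class><drop_precedence>."""
    if af_class not in (1, 2, 3, 4):
        raise ValueError("AF class must be 1-4")
    if drop_precedence not in (1, 2, 3):
        raise ValueError("drop precedence must be 1-3")
    # Class occupies the top 3 bits, drop precedence the next 2 bits.
    return 8 * af_class + 2 * drop_precedence


for af_class in range(1, 5):
    row = []
    for drop in range(1, 4):
        dscp = af_dscp(af_class, drop)
        row.append(f"AF{af_class}{drop} / {dscp} / {dscp:06b}")
    print(f"Class {af_class}: " + "   ".join(row))
```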

QoS Service Class

  1. Identify network traffic and its requirements.
  2. Divide traffic into classes.
  3. Define QoS policies for each class.

Cisco recommended mappings between CoS, IP Precedence and DSCP markings:

AutoQoS Class              Layer 2 CoS or IP Precedence  DSCP Decimal  DSCP Binary  Code Name
Best Effort                0                             0             000000       BE (Best Effort)
Scavenger                  1                             8             001000       CS1 (Class Selector 1)
Bulk Data                  1                             10            001010       AF11
                                                         12            001100       AF12
                                                         14            001110       AF13
Network Management         2                             16            010000       CS2 (Class Selector 2)
Telephony Signaling        3                             26            011010       AF31
Local Mission Critical     3                             28            011100       AF32
                                                         30            011110       AF33
Streaming Media Traffic    4                             32            100000       CS4 (Class Selector 4)
Interactive Video Traffic  4                             34            100010       AF41
                                                         36            100100       AF42
                                                         38            100110       AF43
Interactive Voice Traffic  5                             46            101110       EF

Trust Boundaries — The trust boundary is the perimeter at which you classify and mark traffic; QoS markings are not reclassified after that point. The trust boundary should be as close to the source as possible, taking into account the capability of the device.

Network Based Application Recognition (NBAR):
NBAR has some built in traffic recognition and can expand the number of packets it recognizes by using Packet Description Language Models (PDLMs) published by Cisco.
Can be used for:

  • Protocol discovery — Used to learn and report on the types of traffic passing through an interface. NBAR uses subport classification: it looks into the payload of the packet and classifies based on content.
  • Traffic classification — NBAR can use deep packet inspection to classify traffic based on URL, MIME type or hostname.
  • Traffic statistics collection — NBAR reports traffic statistics by protocol as shown below:
    circus-rtr#sh ip nbar protocol-discovery 
    
     GigabitEthernet0/1/0
                                Input                    Output
                                -----                    ------
       Protocol                 Packet Count             Packet Count
                                Byte Count               Byte Count
                                5min Bit Rate (bps)      5min Bit Rate (bps)
                                5min Max Bit Rate (bps)  5min Max Bit Rate (bps)
       ------------------------ ------------------------ ------------------------------
       secure-http              45804031                 51160464
                                14439692115              45672201126
                                2000                     1000
                                5249000                  2207000
       http                     426396714                578778999
                                54201282821              812650380836
                                2000                     372000
                                4309000                  3087000
       ftp                      689880                   771488
                                467904677                812190544
                                0                        0
                                802000                   1798000
       ssh                      71666                    95757
                                11923882                 103359890
                                0                        0
    

NBAR Limitations:

  • Cannot function on Fast Etherchannel logical interface.
  • Can only handle 24 concurrent URLs, hosts or MIME types.
  • Only analyzes the first 400 bytes of a packet.
  • Only supports CEF.

Commands to implement NBAR:

! Turn on CEF
ip cef
!
! Load the bittorrent.pdlm from flash:
ip nbar pdlm flash:bittorrent.pdlm
!
! Match any protocol listed below.
class-map match-any cmap-nbar-drop
 match protocol edonkey
 match protocol gnutella
 match protocol fasttrack
 match protocol kazaa2
 match protocol http url "*cmd.exe*"
 match protocol novadigm
 match protocol bittorrent
!
! Make a policy map.
policy-map pmap-nbar-drop
 class cmap-nbar-drop
   drop
!
! Apply it to an interface.
interface GigabitEthernet0/1/0
 description LAN Subnet
 ip address 192.168.1.1 255.255.255.0
! This command may not be necessary but for ONT testing purposes use it.
 ip nbar protocol-discovery
! Apply the policy map to incoming traffic.
 service-policy input pmap-nbar-drop

Hacking the 7926G Getting Started

For the past month I have been writing MIDlets for the 7926G Cisco IP phone. The interesting aspect of this phone for the hospital is that it has a barcode scanner built into the end of the phone. All of our nurses carry IP phones with them today and we are planning to replace all of them in the next six months, so a phone that doubles as a barcode scanner eliminates carrying an extra device. The whole process of getting a phone and a test CUCM took three weeks, but our Cisco account team came through whenever we needed them.

In this post I would like to walk you through the setup of my development environment. First, a list of links that will help get you started. Some will be referenced throughout this post; others are just for education.

7926 Development Getting Started site

7926 Emulator Installation Guide

7926 Development Firmware Downloads

7926 SDK Download

Java MIDlet Developers Guide for Cisco Unified IP Phones

NetBeans IDE

If you are using Mac OS X, the installation guide above is not correct for the current NetBeans download. This is the appropriate directory.

superfly:Java_ME_platform_SDK_3.0 jud$ pwd
/Applications/NetBeans/NetBeans 7.1.1.app/Contents/Resources/NetBeans/mobility/Java_ME_platform_SDK_3.0

Below you can see that I put the Cisco 7926 skin right in that directory and unzipped. Everything will be put in the correct place if you do it from there.

superfly:Java_ME_platform_SDK_3.0 jud$ ls -1
Cisco7926Skin.zip
CiscoWirelessPhone.jar
DeviceSpecifics
ImageDemo
KeyCodeAPI
MockScannerAPI
ScannerSample
Updater.app
apps
bin
device-manager.app
docs
legal
lib
on-device
runtimes
toolkit-lib

Below I am testing the skin installation, notice how everything falls into place nicely.

superfly:Java_ME_platform_SDK_3.0 jud$ unzip -t Cisco7926Skin.zip
Archive:  Cisco7926Skin.zip
    testing: toolkit-lib/devices/     OK
    testing: toolkit-lib/devices/Cisco7926/   OK
    testing: toolkit-lib/devices/Cisco7926/conf/   OK
    testing: toolkit-lib/devices/Cisco7926/conf/Cisco7926.properties   OK
    testing: toolkit-lib/devices/Cisco7926/conf/device.properties   OK
    testing: toolkit-lib/devices/Cisco7926/conf/modules   OK
    testing: toolkit-lib/devices/Cisco7926/conf/networkIndicatorOn.png   OK
    testing: toolkit-lib/devices/Cisco7926/conf/skin-7926-highlighted.png   OK
    testing: toolkit-lib/devices/Cisco7926/conf/skin-7926-normal.png   OK
    testing: toolkit-lib/devices/Cisco7926/conf/skin-7926-pressed.png   OK
    testing: toolkit-lib/devices/Cisco7926/default-state.xml   OK
    testing: toolkit-lib/devices/Cisco7926/NetworkIndicator.bean   OK
    testing: toolkit-lib/devices/Cisco7926/Screen.bean   OK
    testing: toolkit-lib/devices/Cisco7926/ScreenGraphics.bean   OK
    testing: toolkit-lib/process/     OK
    testing: toolkit-lib/process/device-manager/   OK
    testing: toolkit-lib/process/device-manager/conf/   OK
    testing: toolkit-lib/process/device-manager/conf/Cisco7926.properties   OK
    testing: toolkit-lib/process/device-manager/device-adapter/   OK
    testing: toolkit-lib/process/device-manager/device-adapter/Cisco7926.bean   OK
    testing: toolkit-lib/process/device-manager/device-adapter/Cisco7926.deviceProperties.bean   OK
    testing: toolkit-lib/process/device-manager/device-adapter/Cisco7926/   OK
    testing: toolkit-lib/process/device-manager/device-adapter/Cisco7926/1.bean   OK
No errors detected in compressed data of Cisco7926Skin.zip.

When you get everything installed correctly, the option for a Cisco7926 appears in the configuration menu.

[Screenshot: 7926G configuration menu]
