Finished 2010

I just finished adding the posts from 2010, along with a smattering of other random posts. I am tired of the repetitive nature of adding posts, so I am going to get in some studying today.

Posted in Thoughts

iTerm2 with VIRL

Yesterday my boss purchased VIRL for me. I run it on one of my ESXi servers and wanted to be able to maintain the environment I have gotten used to running, namely tabbed terminals in iTerm2. Easier said than done.

I used a couple of blogs to help get me going, but their scripts did not work for me, leading to this post.  I hope it will help others.

These are my settings under the Preferences pane of VM Maestro.

[Screenshot: screen-shot-2016-12-20-at-7-16-48-pm]

Below is the script that finally worked for me. Please note that the "delay 2" line is one of the biggest changes I had to make. Without it, the script would put both telnet commands into the same tab.

-- 2016-12-20
-- Jud Bishop

on run argv
	
	-- last argument should be the window title
	set windowtitle to item (the count of argv) of argv as text
	
	-- all but last argument go into CLI parameters
	set cliargs to ""
	repeat with arg in items 1 thru -2 of argv
		set cliargs to cliargs & " " & arg as text
	end repeat
	
	delay 2
	
	tell application "iTerm"
		activate
		if (count of windows) = 0 then
			set t to (create window with default profile)
		else
			set t to current window
		end if
		-- resize only after we know at least one window exists
		set the bounds of the first window to {1000, 500, 1900, 1200}
		tell t
			create tab with default profile
			set s to current session
			tell s
				write text cliargs
				set name to windowtitle
			end tell
		end tell
	end tell
end run

Posted in CCIE, Routing

C360 or Cisco Expert-Level Training Script

 

I’ve been using the C360 labs for training and have gotten tired of fighting with window management. At first I figured it would be good practice for the lab, but now I’m just tired of messing with windows, so I wrote a script to help alleviate my annoyance.

The server and port range change every time you run a different lab, so the script had to take that into account.  If you pay attention, the ports you connect to are sequential, so just check the port and server of R1 and you have everything you need.

At the end, the script puts the iTerm window in the lower right-hand corner of my left screen so I don’t have to move it by hand. 😉

Here are the pop-ups from the script:

[Screenshot: screen-shot-2016-12-04-at-11-41-02-am]

[Screenshot: screen-shot-2016-12-04-at-11-40-51-am]

 

-- 2016-12-04 
-- Jud Bishop 

set SERVER to the text returned of (display dialog ¬
	"Enter the server IP:" default answer "10.10.1.100")

set FIRSTPORT to the text returned of (display dialog ¬
	"Enter the first port:" default answer "11501")


tell application "iTerm"
	set netWindow to (create window with default profile)
	
	select first window
	
	set DEVICES to {"R1", "R2", "R3", "R4", "R5", "R6", "SW1", "SW2", "SW3", "SW4", "R7", "R8", "R9", "BB"}
	repeat with I from 1 to 14
		tell current window
			set newTab to (create tab with default profile)
			tell current session
				write text "telnet " & SERVER & " " & FIRSTPORT
				set name to item I of DEVICES
				set FIRSTPORT to FIRSTPORT + 1
			end tell
		end tell
	end repeat
	set the bounds of the first window to {1000, 500, 1900, 1200}
end tell
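
Because the ports are sequential, the device-to-port mapping can be previewed from a shell before any tabs are opened. This is only a sketch and not part of the AppleScript above; the function name lab_ports is made up for illustration, and the server and first port are the dialog defaults from the script.

```shell
#!/bin/sh
# Hypothetical helper: print one telnet command per lab device,
# assuming ports are assigned sequentially starting at the first port.
lab_ports() {
    server="$1"
    port="$2"
    for dev in R1 R2 R3 R4 R5 R6 SW1 SW2 SW3 SW4 R7 R8 R9 BB; do
        printf '%s: telnet %s %s\n' "$dev" "$server" "$port"
        port=$((port + 1))
    done
}

# Example call using the defaults from the dialogs.
lab_ports 10.10.1.100 11501
```

If the output does not match what the lab portal shows for the last device, the ports for that lab are probably not sequential and the script needs adjusting.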

 

Posted in CCIE, Code, Routing

Finished 2009

I finished adding the posts from 2009, will start on 2010 soon.

Posted in Uncategorized

Resurrecting the Site

Last year my blog got infected with a virus and rather than pay my hosting provider to clean it, I decided to turn it off.  Unfortunately I used to use it for reference because I knew I had some random process documented.  Some of the guys at work also had it bookmarked to reference historical articles about some custom piece of code or full stack site that I created.  I also wanted to start writing again to document my studies.

I took a backup of the site from my old hosting provider, but it would not import into WordPress.  I also had a DB dump of the site, so I wrote this program to extract my old posts.  I will slowly re-post my old articles.  Some posts may look good, others will not.  While I have most of my old graphics I do not have all of them, so some may be missing bits and pieces.  I apologize for that.

The process to get my old posts back was to first understand the DB format, then decide how to go about it.  I actually wrote a couple of different scripts. The first dumped all of the posts with date, title and content into one text file.  The second put each article into its own file, which made it easier to sort out each post.

One thing that I am still struggling with is the formatting for code. Formatting is so important for code, yet I am still learning how this new interface formats and am fighting to make the code look good. I am losing the battle, but as I continue to work with this editor I hope to eventually win the war.


#!/usr/bin/perl

# 2016-11-12
# Jud Bishop

#use chainrin_wrd01;
#describe wp_posts;
#select ID from wp_posts;
#select post_date from wp_posts where id=128;
#select post_date, post_title, post_content from wp_posts where id=128;

#Database changed
#MariaDB [chainrin_wrd01]> describe wp_posts;
#+-----------------------+---------------------+------+-----+---------------------+----------------+
#| Field | Type | Null | Key | Default | Extra |
#+-----------------------+---------------------+------+-----+---------------------+----------------+
#| ID | bigint(20) unsigned | NO | PRI | NULL | auto_increment |
#| post_author | bigint(20) unsigned | NO | MUL | 0 | |
#| post_date | datetime | NO | | 0000-00-00 00:00:00 | |
#| post_date_gmt | datetime | NO | | 0000-00-00 00:00:00 | |
#| post_content | longtext | NO | | NULL | |
#| post_title | text | NO | | NULL | |
#| post_excerpt | text | NO | | NULL | |
#| post_status | varchar(20) | NO | | publish | |
#| comment_status | varchar(20) | NO | | open | |
#| ping_status | varchar(20) | NO | | open | |
#| post_password | varchar(20) | NO | | | |
#| post_name | varchar(200) | NO | MUL | | |
#| to_ping | text | NO | | NULL | |
#| pinged | text | NO | | NULL | |
#| post_modified | datetime | NO | | 0000-00-00 00:00:00 | |
#| post_modified_gmt | datetime | NO | | 0000-00-00 00:00:00 | |
#| post_content_filtered | longtext | NO | | NULL | |
#| post_parent | bigint(20) unsigned | NO | MUL | 0 | |
#| guid | varchar(255) | NO | | | |
#| menu_order | int(11) | NO | | 0 | |
#| post_type | varchar(20) | NO | MUL | post | |
#| post_mime_type | varchar(100) | NO | | | |
#| comment_count | bigint(20) | NO | | 0 | |
#+-----------------------+---------------------+------+-----+---------------------+----------------+
#

use strict;
use warnings;
use DBI;

my $dbh;
my $sql;
my $sth;
my $fh; #file handle

sub dbi_connect {
 $dbh = DBI->connect('dbi:mysql:dbname=chainrin_wrd01;host=127.0.0.1','chainring','',{AutoCommit=>1,RaiseError=>1,PrintError=>1}) || die "Error connecting: '$DBI::errstr'";
}

sub dbi_disconnect{
      $sth->finish;
      $dbh->disconnect;
}

sub sql_prepare {
     print "$sql\n";
     $sth = $dbh->prepare($sql) || die "Error preparing: $DBI::errstr";
}

sub sql_table_print {

my $result = $sth->execute || die "Error executing: $DBI::errstr";

# HEADER
 print "Field names: @{ $sth->{NAME} }\n";

# DATA
 while (my @data = $sth->fetchrow_array()) {
 my $date = $data[0];
 $date =~ s/\r//g;
 my $title = $data[1];
 $title =~ s/\r//g;
 my $content = $data[2];
 $content =~ s/\r//g;

# It's not pretty, but it's legible.
 my $filename = $title;
 $filename =~ s/\ /-/g;
 $filename =~ s/:/-/g;
 $filename =~ s/\>/-/g;
 $filename =~ s/\</-/g;
 $filename =~ s/\//-/g;

 open_file($filename);
   print $fh "$date\n";
   print $fh "$title\n";
   print $fh "$content\n";
   print $fh "\n";
 close_file();
 }

}

sub open_file {
 print "open_file\n";
 my $filename = shift;
 if ($filename eq ''){ $filename = "filename"; }
 print "$filename\n";
 $filename = "/tmp/Posts/" . $filename;
 open($fh, '>', $filename) || die "Unable to open file: $!";
}

sub close_file {
 close ($fh) || die "Unable to close file: $!";
}

# Main
dbi_connect();
$sql = "select post_date, post_title, post_content from wp_posts";
sql_prepare();
sql_table_print();
dbi_disconnect();
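
The filename cleanup in sql_table_print is easy to try on its own before running the full export. The sketch below is a shell approximation of those Perl substitutions, not the code I ran; sanitize_title is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical shell equivalent of the substitutions in sql_table_print:
# spaces, colons, angle brackets and slashes all become hyphens.
sanitize_title() {
    name=$(printf '%s' "$1" | sed 's|[ :<>/]|-|g')
    # Fall back to a fixed name for empty titles, as open_file does.
    if [ -z "$name" ]; then
        name="filename"
    fi
    printf '%s\n' "$name"
}

sanitize_title "iTerm2 with VIRL"
```

Note that two posts with the same title map to the same filename, so the later one overwrites the earlier one in /tmp/Posts.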

 

Posted in Uncategorized

Hello World

Hello world!

Posted in Uncategorized

ELK Stack

My basic configuration.

rsyslog –> Logstash –> Elasticsearch –> Kibana with GeoIP

At this time I do not plan to proxy Kibana with Nginx. The goal here is to keep the setup and configuration as simple as possible.

Versions:
CentOS 7
Logstash 1.4.2
Kibana 4.0.2
ElasticSearch 1.5.0

This is the fourth time I have set this up and each time I have fought through different errors. My goal is to more thoroughly document the process so that I can recreate it easily the next time.

1. Turn off SELinux. I fought AppArmor on Ubuntu doing one of these installs and I am not going to fight SELinux. Reboot.

shutdown -r now

Check if selinux is on or off.

sestatus
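
I did not record the exact edit I made to turn SELinux off. One common way on CentOS 7 is to change the SELINUX= line in /etc/selinux/config and reboot. The two functions below are a sketch of that approach; the function names are made up, and the default path is an assumption about a stock install.

```shell
#!/bin/sh
# Print the configured SELinux mode from a config file.
# The default path is assumed to be the stock CentOS 7 location.
selinux_configured_mode() {
    conf="${1:-/etc/selinux/config}"
    sed -n 's/^SELINUX=//p' "$conf"
}

# Rewrite the SELINUX= line to "disabled", leaving SELINUXTYPE= alone.
# The change only takes effect after a reboot.
selinux_disable_in_config() {
    conf="${1:-/etc/selinux/config}"
    tmp=$(mktemp)
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$conf" > "$tmp" && cat "$tmp" > "$conf"
    rm -f "$tmp"
}
```

Run selinux_disable_in_config as root with no argument to edit the real file, then reboot and confirm with sestatus.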

2. Turn off the firewall.

systemctl stop firewalld
systemctl disable firewalld
systemctl status firewalld

3. Update.

yum update
yum upgrade

4. Configure rsyslog to accept remote logging and name the file appropriately. This is the way we have named our log files and we have a number of scripts that work with them, so they are not changing.

cat /etc/rsyslog.conf | grep -v ^# | egrep -v ^$
$ModLoad imuxsock # provides support for local system logging (e.g. via logger command)
$ModLoad imjournal # provides access to the systemd journal
$ModLoad imklog # reads kernel messages (the same are read from journald)
$ModLoad immark # provides --MARK-- message capability
$ModLoad imudp
$UDPServerRun 514
$ModLoad imtcp
$InputTCPServerRun 514
$WorkDirectory /var/lib/rsyslog
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
$IncludeConfig /etc/rsyslog.d/*.conf
$OmitLocalLogging on
$IMJournalStateFile imjournal.state
*.info;mail.none;authpriv.none;cron.none /var/log/messages
authpriv.* /var/log/secure
mail.* -/var/log/maillog
cron.* /var/log/cron
*.emerg :omusrmsg:*
uucp,news.crit /var/log/spooler
local7.* /var/log/boot.log
$template DynaFile,"/var/log/remote-%fromhost-ip%.log"
*.* -?DynaFile

The last two lines are the ones that count.

$template DynaFile,"/var/log/remote-%fromhost-ip%.log"
*.* -?DynaFile

Restart rsyslog.

systemctl restart rsyslog
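
The DynaFile template expands %fromhost-ip% to the address of the sending device, so every sender gets its own file. A tiny sketch of the resulting naming, with remote_log_path as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: show which file the DynaFile template
# would write to for a given sender IP.
remote_log_path() {
    printf '/var/log/remote-%s.log\n' "$1"
}

remote_log_path 192.168.2.250
```

This is the same path the Logstash file input needs to point at later on.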

5. Install Java and some other tools you might need later.

yum install -y java-1.7.0-openjdk lsof wget rubygems

6. Install the GPG key for the Elasticsearch repository, add the repository, and install Elasticsearch, but don’t start it yet.

Add the Elasticsearch repository.

cat /etc/yum.repos.d/elasticsearch.repo

[elasticsearch-1.5]
name=Elasticsearch repository for 1.5.x packages
baseurl=http://packages.elasticsearch.org/elasticsearch/1.5/centos
gpgcheck=1
gpgkey=http://packages.elasticsearch.org/GPG-KEY-elasticsearch
enabled=1

And install.

yum update
yum -y install elasticsearch
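
The key import itself is not shown above. The sketch below prints the same repository stanza for a given version so it can be redirected into the repo file, and it notes the usual rpm --import command in a comment. The function name es_repo_stanza is made up for illustration.

```shell
#!/bin/sh
# Import the signing key first (run as root). The URL is the gpgkey
# value from the repo file above:
#   rpm --import http://packages.elasticsearch.org/GPG-KEY-elasticsearch

# Hypothetical helper: print the repo stanza for a major.minor version.
es_repo_stanza() {
    ver="$1"
    cat <<EOF
[elasticsearch-$ver]
name=Elasticsearch repository for $ver.x packages
baseurl=http://packages.elasticsearch.org/elasticsearch/$ver/centos
gpgcheck=1
gpgkey=http://packages.elasticsearch.org/GPG-KEY-elasticsearch
enabled=1
EOF
}

es_repo_stanza 1.5
```

For example, es_repo_stanza 1.5 > /etc/yum.repos.d/elasticsearch.repo would recreate the file shown above.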

7. Change your Elasticsearch configuration to suit your needs. I had trouble with Elasticsearch and Logstash only listening on IPv6. As a result I had to explicitly set the IP address as seen below. We also have another Elasticsearch cluster and I don’t want the two talking, so I turned off multicast discovery. This is not a problem if you edit your configs first, but I had a terrible time because both clusters had the same name.

cat /etc/elasticsearch/elasticsearch.yml | egrep -v "^#|^$"
cluster.name: logserver
node.name: "elk"
node.master: true
node.data: true
node.max_local_storage_nodes: 1
network.bind_host: 172.22.225.76
network.host: 172.22.225.76
discovery.zen.ping.multicast.enabled: false

If you have problems with Elasticsearch not listening on IPv4, look in the file /usr/share/elasticsearch/bin/elasticsearch.in.sh.

You will see the lines:

# Force the JVM to use IPv4 stack
if [ "x$ES_USE_IPV4" != "x" ]; then
     JAVA_OPTS="$JAVA_OPTS -Djava.net.preferIPv4Stack=true"
fi

So in the /etc/sysconfig/elasticsearch file add:

# Tell ES to use IPv4
ES_USE_IPV4=true
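
Since I have rebuilt this server several times, it helps to make that edit repeatable. The function below is a sketch that appends a line to a file only if the exact line is not already there; ensure_line is a hypothetical name, not something that ships with Elasticsearch.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file unless that exact
# line is already present, so re-running it is harmless.
ensure_line() {
    line="$1"
    file="$2"
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

Usage as root would be: ensure_line 'ES_USE_IPV4=true' /etc/sysconfig/elasticsearch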

Here is my entire sysconfig file for Elasticsearch.

cat /etc/sysconfig/elasticsearch | egrep -v "^#|^$"
ES_HOME=/usr/share/elasticsearch
MAX_OPEN_FILES=65535
MAX_MAP_COUNT=262144
LOG_DIR=/var/log/elasticsearch
DATA_DIR=/var/lib/elasticsearch
WORK_DIR=/tmp/elasticsearch
CONF_DIR=/etc/elasticsearch
CONF_FILE=/etc/elasticsearch/elasticsearch.yml
ES_USER=elasticsearch
ES_USE_IPV4=true

These are good examples from the Elasticsearch configuration file:

# Use the Cluster Health API [http://localhost:9200/_cluster/health], the
# Node Info API [http://localhost:9200/_nodes] or GUI tools
# such as http://www.elasticsearch.org/overview/marvel/
# http://github.com/karmi/elasticsearch-paramedic
# http://github.com/lukas-vlcek/bigdesk and
# http://mobz.github.com/elasticsearch-head to inspect the cluster state.

8. Start Elasticsearch.

systemctl enable elasticsearch
systemctl start elasticsearch

9. Test it.

curl http://172.22.225.76:9200
{
  "status" : 200,
  "name" : "elk",
  "cluster_name" : "logserver",
  "version" : {
    "number" : "1.5.1",
    "build_hash" : "5e38401bc4e4388537a615569ac60925788e1cf4",
    "build_timestamp" : "2015-04-09T13:41:35Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  },
  "tagline" : "You Know, for Search"
}

Here are some other testing examples. Some will not work yet because we have not finished the Logstash configuration.

curl -XGET http://172.22.225.76:9200/_cluster/health
curl http://172.22.225.76:9200
curl http://172.22.225.76:9200/_status?pretty=true
curl "http://172.22.225.76:9200/_search?q=type:syslog&pretty=true"
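
When scripting these checks, it is handy to pull just the status colour out of the cluster health response. This is a rough sketch using sed rather than a real JSON parser, and cluster_status is a made-up name. It assumes the health response carries a quoted "status" value such as green, yellow or red.

```shell
#!/bin/sh
# Hypothetical helper: read a cluster health response on stdin and
# print the quoted "status" value. Not a real JSON parser.
cluster_status() {
    sed -n 's/.*"status"[ ]*:[ ]*"\([a-z]*\)".*/\1/p'
}

# Intended use against the server from this post:
#   curl -s "http://172.22.225.76:9200/_cluster/health" | cluster_status
```

The root URL returns a numeric "status" : 200 instead, which this function deliberately ignores.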

If you need to debug Elasticsearch here is the strace command I used.

strace java -Xms256m -Xmx1g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC -Dfile.encoding=UTF-8 -Delasticsearch -Des.pidfile=/var/run/elasticsearch/elasticsearch.pid -Des.path.home=/usr/share/elasticsearch -cp :/usr/share/elasticsearch/lib/elasticsearch-1.5.0.jar:/usr/share/elasticsearch/lib/*:/usr/share/elasticsearch/lib/sigar/* -Des.default.config=/etc/elasticsearch/elasticsearch.yml -Des.default.path.home=/usr/share/elasticsearch -Des.default.path.logs=/var/log/elasticsearch -Des.default.path.data=/var/lib/elasticsearch -Des.default.path.work=/tmp/elasticsearch -Des.default.path.conf=/etc/elasticsearch org.elasticsearch.bootstrap.Elasticsearch -Djava.net.preferIPv6Addresses=false

10. Install Logstash. First enable the repository.

cat /etc/yum.repos.d/logstash.repo
[logstash-1.5]
name=logstash repository for 1.5.x packages
baseurl=http://packages.elasticsearch.org/logstash/1.5/centos
gpgcheck=1
gpgkey=http://packages.elasticsearch.org/GPG-KEY-elasticsearch
enabled=1

Then install Logstash.

yum install -y logstash

I also had some trouble getting Logstash to run properly. It was missing a directory and only listening on IPv6. The first thing I had to do was make a specific directory.

mkdir /var/run/nscd

The second thing I had to do was add a Java option in the sysconfig directory.

cat /etc/sysconfig/logstash | egrep -v "^#|^$"
LS_JAVA_OPTS="-Djava.net.preferIPv4Stack=true"

If you have trouble with logstash, here is the strace command I ran to debug it.

strace -o /tmp/strace.log -fe trace=network /opt/logstash/bin/logstash -f /etc/logstash/conf.d/logstash.conf

Below is the error that I fixed by creating the directory.

21790 socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
21790 connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
21790 socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
21790 connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)

Here is my very simple logstash.conf file.

input {
	syslog {
		debug => true
		type => syslog
		port => 5140
	}
}

filter {
}

output {
	stdout { }
	elasticsearch {
		cluster => "logserver"
		host => "172.22.225.76"
	}
}

Turn on logstash and start it.

chkconfig logstash on
/etc/init.d/logstash start

I pointed a firewall at this ELK server, and now you should see messages coming into Elasticsearch from Logstash.

curl "http://172.22.225.76:9200/_search?q=type:syslog&pretty=true"

11. Install Kibana

Download the gzipped tar file.

wget https://download.elasticsearch.org/kibana/kibana/kibana-4.0.2-linux-x64.tar.gz

Extract it.

tar -xvzf kibana-4.0.2-linux-x64.tar.gz

Make a directory for Kibana and copy the unzipped contents into it.

mkdir /opt/kibana
cp -R kibana-4.0.2-linux-x64/* /opt/kibana/

Here is my basic Kibana configuration file.

cat /opt/kibana/config/kibana.yml | egrep -v "^#|^$"
port: 5601
host: "172.22.225.76"
elasticsearch_url: "http://172.22.225.76:9200"
elasticsearch_preserve_host: true
kibana_index: ".kibana"
default_app_id: "discover"
request_timeout: 300000
shard_timeout: 0
verify_ssl: true
bundled_plugin_ids:
- plugins/dashboard/index
- plugins/discover/index
- plugins/doc/index
- plugins/kibana/index
- plugins/markdown_vis/index
- plugins/metric_vis/index
- plugins/settings/index
- plugins/table_vis/index
- plugins/vis_types/index
- plugins/visualize/index

Set it up to run as a systemd service.

gem install pleaserun
/usr/local/bin/pleaserun --platform systemd --install /opt/kibana/bin/kibana

Enable Kibana and start it.

systemctl enable kibana
systemctl start kibana

Go and check it out by pointing your browser to:

http://elk.chainringcircus.org:5601

12. Install GeoLite

Here is the website for more information:

http://dev.maxmind.com/geoip/legacy/geolite/

Download the city database.

wget http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz

I put the database in my /opt/logstash directory but there may be other directories that make more sense.

gunzip GeoLiteCity.dat.gz
mv GeoLiteCity.dat /opt/logstash

13. Here is my /etc/logstash/conf.d/asa.conf configuration file. You will notice that I tag each interface as inside_out or outside_in. That is so that we can track who we are blocking from the inside as well as who we are blocking from the outside. That also lets us create a map for each type of traffic for visualization purposes.

cat /etc/logstash/conf.d/asa.conf
input {
	file {
		path => ["/var/log/remote-192.168.2.250.log"]
		sincedb_path => "/var/log/logstash/since.db"
		start_position => "beginning"
		type => "asa"
	}
	file {
		path => ["/var/log/remote-192.168.2.254.log"]
		sincedb_path => "/var/log/logstash/since.db"
		start_position => "beginning"
		type => "asa"
	}
}

# begin filter block
filter {
	if [type] == "asa" { # begin ASA block
		grok {
			match => ["message", "%{CISCO_TAGGED_SYSLOG} %{GREEDYDATA:cisco_message}"]
		}

		# Parse the syslog severity and facility
    		syslog_pri { }

    		# Parse the date from the "timestamp" field to the "@timestamp" field
    		date {
      			match => ["timestamp",
        			"MMM dd HH:mm:ss",
        			"MMM  d HH:mm:ss",
        			"MMM dd yyyy HH:mm:ss",
        			"MMM  d yyyy HH:mm:ss"
      				]
      			timezone => "America/New_York"
    		}

		# Clean up redundant fields if parsing was successful
    		if "_grokparsefailure" not in [tags] {
      			mutate {
        			rename => ["cisco_message", "message"]
        			remove_field => ["timestamp"]
      			}
    		}

		# Extract fields from each of the detailed message types
    		# The patterns provided below are included in Logstash since 1.2.0
    		grok {
      			match => [
        			"message", "%{CISCOFW106001}",
       		 		"message", "%{CISCOFW106006_106007_106010}",
        			"message", "%{CISCOFW106014}",
        			"message", "%{CISCOFW106015}",
	        		"message", "%{CISCOFW106021}",
        			"message", "%{CISCOFW106023}",
        			"message", "%{CISCOFW106100}",
        			"message", "%{CISCOFW110002}",
	        		"message", "%{CISCOFW302010}",
        			"message", "%{CISCOFW302013_302014_302015_302016}",
        			"message", "%{CISCOFW302020_302021}",
        			"message", "%{CISCOFW305011}",
	        		"message", "%{CISCOFW313001_313004_313008}",
        			"message", "%{CISCOFW313005}",
        			"message", "%{CISCOFW402117}",
        			"message", "%{CISCOFW402119}",
	        		"message", "%{CISCOFW419001}",
        			"message", "%{CISCOFW419002}",
        			"message", "%{CISCOFW500004}",
        			"message", "%{CISCOFW602303_602304}",
	        		"message", "%{CISCOFW710001_710002_710003_710005_710006}",
        			"message", "%{CISCOFW713172}",
        			"message", "%{CISCOFW733100}"
      			]
    		}

		# GeoIP for the ASA
		# Source
		if [src_interface] == "Outside" {
 			geoip {
    				source => "src_ip"
	    			target => "geoip"
    				database =>"/opt/logstash/GeoLiteCity.dat"
    				add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
    				add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}"  ]
  			}

	  		mutate {
    				convert => [ "[geoip][coordinates]", "float" ]
				add_tag => [ 'outside_in' ]
  			}
		} # end source block

		# Destination
		##if [src_interface] !~ "(Inside)|(Bypass)" {
		if [src_interface] == "Inside" {
 			geoip {
    				source => "dst_ip"
	    			target => "geoip"
    				database =>"/opt/logstash/GeoLiteCity.dat"
    				add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
    				add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}"  ]
  			}

	  		mutate {
    				convert => [ "[geoip][coordinates]", "float" ]
				add_tag => [ 'inside_out' ]
  			}
		}
	} # end ASA
} # end filter block

output {
	stdout { }
	elasticsearch {
		cluster => "logserver"
		host => "172.22.225.76"
	}
}
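
The tagging logic in the filter block boils down to a small lookup on src_interface. The sketch below restates it in shell as a quick reference; asa_tag_for is a hypothetical name, and the interface names are the ones from my ASA.

```shell
#!/bin/sh
# Hypothetical helper: given the ASA source interface, print the tag
# the filter adds and the field that gets the GeoIP lookup.
asa_tag_for() {
    case "$1" in
        Outside) echo "outside_in src_ip" ;;
        Inside)  echo "inside_out dst_ip" ;;
        *)       : ;;  # any other interface gets no tag
    esac
}

asa_tag_for Outside
asa_tag_for Inside

# The tags can then be searched the same way as type:syslog earlier:
#   curl "http://172.22.225.76:9200/_search?q=tags:outside_in&pretty=true"
```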

14. Some final thoughts and hints.

Here are some commands I used for debugging purposes.

lsof -i4
lsof -i6
netstat -l
curl -XGET http://172.22.225.76:9200/_cluster/health

Last is a helper script I created.

cat /usr/local/bin/elkstack-restart
#!/bin/bash
systemctl restart elasticsearch
systemctl restart logstash
systemctl restart kibana

Sources in no particular order:
http://www.networkassassin.com/elk-stack-for-network-operations-reloaded/
https://community.ulyaoth.net/threads/how-to-create-a-logstash-geoip-based-dashboard-in-kibana-3.29/
http://kartar.net/2014/09/when-logstash-and-syslog-go-wrong/
http://blog.domb.net/?p=367
http://grokdebug.herokuapp.com/patterns#
http://blog.stevenmeyer.co.uk/2014/06/add-configuration-test-to-logstash-service-configtest.html
http://www.itzgeek.com/how-tos/linux/centos-how-tos/how-do-i-disable-ipv6-on-centos-7-rhel-7.html#axzz3WjuY2oLh
http://slacklabs.be/2012/04/02/force-Elastic-search-on-ipv4-debian/

Posted in Linux

Linux iSCSI Multipath on NetApp

Install the iSCSI and multipath rpms.

yum install iscsi-initiator-utils
yum install device-mapper-multipath

A little note about initiator naming. I tried to change mine to be similar to how we set up targets on Linux and fumbled around for hours. When I finally moved the original initiatorname.iscsi file back, everything started working. Just leave it alone.

cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.1994-05.com.redhat:e178b44934eb

This NetApp link has settings for the iSCSI setup on Debian. I used most of their settings, but once I got it working I did not make further changes.

cat /etc/iscsi/iscsid.conf  | grep -v \# | sed /^$/d
iscsid.startup = /etc/rc.d/init.d/iscsid force-start
node.startup = automatic
node.leading_login = No
node.session.timeo.replacement_timeout = 15
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.noop_out_interval = 5
node.conn[0].timeo.noop_out_timeout = 5
node.session.err_timeo.abort_timeout = 15
node.session.err_timeo.lu_reset_timeout = 30
node.session.err_timeo.tgt_reset_timeout = 30
node.session.initial_login_retry_max = 8
node.session.cmds_max = 128
node.session.queue_depth = 32
node.session.xmit_thread_priority = -20
node.session.iscsi.InitialR2T = No
node.session.iscsi.ImmediateData = Yes
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
node.conn[0].iscsi.MaxRecvDataSegmentLength = 262144
node.conn[0].iscsi.MaxXmitDataSegmentLength = 0
discovery.sendtargets.iscsi.MaxRecvDataSegmentLength = 32768
node.conn[0].iscsi.HeaderDigest = None
node.session.nr_sessions = 1
node.session.iscsi.FastAbort = Yes

Turn on iSCSI daemon.

service iscsid start
chkconfig iscsid on

Check to make sure iSCSI is set to start at boot.

chkconfig iscsid --list
iscsid         	0:off	1:off	2:off	3:on	4:on	5:on	6:off

What iSCSI targets are advertised?

iscsiadm -m discovery -t st -p 172.22.251.11
172.22.251.11:3260,2000 iqn.1992-08.com.netapp:sn.1575046996
172.22.200.96:3260,2001 iqn.1992-08.com.netapp:sn.1575046996
172.22.225.32:3260,2002 iqn.1992-08.com.netapp:sn.1575046996

Here is my network setup. On 172.22.251.0/24 I have two interfaces that are not routed for the iSCSI connections.

ifconfig
eth0      Link encap:Ethernet  HWaddr 00:10:18:78:66:4C
          inet addr:172.22.251.10  Bcast:172.22.251.255  Mask:255.255.255.0
          inet6 addr: fe80::210:18ff:fe78:664c/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:56716 errors:0 dropped:0 overruns:0 frame:0
          TX packets:50155 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:30259444 (28.8 MiB)  TX bytes:4528192 (4.3 MiB)

eth2      Link encap:Ethernet  HWaddr 00:1E:C9:EA:C0:55
          inet addr:172.22.100.53  Bcast:172.22.100.255  Mask:255.255.255.0
          inet6 addr: fe80::21e:c9ff:feea:c055/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1338563 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3185 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:199985983 (190.7 MiB)  TX bytes:637730 (622.7 KiB)
          Interrupt:16

eth3      Link encap:Ethernet  HWaddr 00:1E:C9:EA:C0:56
          inet addr:172.22.251.76  Bcast:172.22.251.255  Mask:255.255.255.0
          inet6 addr: fe80::21e:c9ff:feea:c056/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:18673 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2084406 (1.9 MiB)  TX bytes:532 (532.0 b)
          Interrupt:17

I have three choices, and I want to use the portal on the 172.22.251.0/24 subnet. So I log in to that portal and make the login persistent.

iscsiadm -m node -T iqn.1992-08.com.netapp:sn.1575046996 -p 172.22.251.11 -l
iscsiadm -m node -T iqn.1992-08.com.netapp:sn.1575046996 -p 172.22.251.11 --op update -n node.startup -v automatic
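
Picking the portal on the storage subnet can also be done from the discovery output instead of by eye. This is a sketch that only parses text; portals_in_subnet is a made-up name, and the prefix match is a plain string comparison rather than real subnet arithmetic.

```shell
#!/bin/sh
# Hypothetical helper: read "iscsiadm -m discovery" output on stdin and
# print the portal IPs that start with the given prefix.
portals_in_subnet() {
    prefix="$1"
    awk -v p="$prefix" '{ split($1, a, ":"); if (index(a[1], p) == 1) print a[1] }'
}

# Intended use:
#   iscsiadm -m discovery -t st -p 172.22.251.11 | portals_in_subnet 172.22.251.
```

Include the trailing dot in the prefix so that 172.22.25 does not also match 172.22.251.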

See that I actually have three sessions.

iscsiadm -m session
tcp: [1] 172.22.200.96:3260,2001 iqn.1992-08.com.netapp:sn.1575046996
tcp: [2] 172.22.225.32:3260,2002 iqn.1992-08.com.netapp:sn.1575046996
tcp: [3] 172.22.251.11:3260,2000 iqn.1992-08.com.netapp:sn.1575046996

Now I have three sessions attached, because of the different IP addresses. I want to log out of two of the sessions.

iscsiadm -m node -T iqn.1992-08.com.netapp:sn.1575046996 -p 172.22.200.96 -u
Logging out of session [sid: 1, target: iqn.1992-08.com.netapp:sn.1575046996, portal: 172.22.200.96,3260]
Logout of [sid: 1, target: iqn.1992-08.com.netapp:sn.1575046996, portal: 172.22.200.96,3260] successful.

And log out of the other session.

iscsiadm -m node -T iqn.1992-08.com.netapp:sn.1575046996 -p 172.22.225.32 -u
Logging out of session [sid: 2, target: iqn.1992-08.com.netapp:sn.1575046996, portal: 172.22.225.32,3260]
Logout of [sid: 2, target: iqn.1992-08.com.netapp:sn.1575046996, portal: 172.22.225.32,3260] successful.

Ensure that these two sessions do not come back.

iscsiadm -m node -T iqn.1992-08.com.netapp:sn.1575046996 -p 172.22.200.96 --op update -n node.startup -v manual
iscsiadm -m node -T iqn.1992-08.com.netapp:sn.1575046996 -p 172.22.225.32 --op update -n node.startup -v manual
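
With more portals this gets tedious, so the commands can be generated. The function below is a dry-run sketch: it only prints the iscsiadm commands for every portal except the one to keep, so they can be reviewed before being run. The name manual_startup_cmds is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the command that sets
# node.startup to manual for every portal except the one to keep.
# Arguments: target IQN, portal to keep, then all portals.
manual_startup_cmds() {
    target="$1"
    keep="$2"
    shift 2
    for portal in "$@"; do
        if [ "$portal" != "$keep" ]; then
            printf 'iscsiadm -m node -T %s -p %s --op update -n node.startup -v manual\n' "$target" "$portal"
        fi
    done
}

manual_startup_cmds iqn.1992-08.com.netapp:sn.1575046996 172.22.251.11 \
    172.22.251.11 172.22.200.96 172.22.225.32
```

Piping the output to sh would execute the commands, but I prefer to read them first.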

Test to make sure everything is correct.

shutdown -r now

This is for debug output. The -P flag takes a print level from 0 to 3, with 3 being the most verbose.

iscsiadm -m session -P 3

What drives do we see now?

fdisk -l

Disk /dev/sda: 79.5 GB, 79456894976 bytes
255 heads, 63 sectors/track, 9660 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000141ea

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          64      512000   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2              64        9661    77081600   8e  Linux LVM

Disk /dev/mapper/vg_sannfsr11-lv_root: 53.7 GB, 53687091200 bytes
255 heads, 63 sectors/track, 6527 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/mapper/vg_sannfsr11-lv_swap: 4160 MB, 4160749568 bytes
255 heads, 63 sectors/track, 505 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/mapper/vg_sannfsr11-lv_home: 21.1 GB, 21080571904 bytes
255 heads, 63 sectors/track, 2562 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/sdc: 104 MB, 104857600 bytes
4 heads, 50 sectors/track, 1024 cylinders
Units = cylinders of 200 * 512 = 102400 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 65536 bytes
Disk identifier: 0x00000000


Disk /dev/sdd: 104 MB, 104857600 bytes
4 heads, 50 sectors/track, 1024 cylinders
Units = cylinders of 200 * 512 = 102400 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 65536 bytes
Disk identifier: 0x00000000

Troubleshoot iSCSI
I found this site that explains the iscsiadm commands nicely. I do not want to take credit for this; I just want to make sure that it is easily accessible to me.

Discover available targets from a discovery portal
iscsiadm -m discovery -t sendtargets -p ipaddress

iscsiadm -m discovery -t st -p 172.22.251.11
172.22.251.11:3260,2000 iqn.1992-08.com.netapp:sn.1575046996
172.22.200.96:3260,2001 iqn.1992-08.com.netapp:sn.1575046996
172.22.225.32:3260,2002 iqn.1992-08.com.netapp:sn.1575046996

Log into a specific target.
iscsiadm -m node -T targetname -p ipaddress -l

iscsiadm -m node -T iqn.1992-08.com.netapp:sn.1575046996 -p 172.22.251.11 -l

Log out of a specific target.
iscsiadm -m node -T targetname -p ipaddress -u

iscsiadm -m node -T iqn.1992-08.com.netapp:sn.1575046996 -p 172.22.225.32 -u
Logging out of session [sid: 2, target: iqn.1992-08.com.netapp:sn.1575046996, portal: 172.22.225.32,3260]
Logout of [sid: 2, target: iqn.1992-08.com.netapp:sn.1575046996, portal: 172.22.225.32,3260] successful.

Display information about a target.
iscsiadm -m node -T targetname -p ipaddress

iscsiadm -m node -T iqn.1992-08.com.netapp:sn.1575046996 -p 172.22.251.11
# BEGIN RECORD 6.2.0-873.2.el6
node.name = iqn.1992-08.com.netapp:sn.1575046996
node.tpgt = 2000
node.startup = automatic
node.leading_login = No
iface.hwaddress = <empty>
... output removed for brevity ...

Display statistics about a target.
iscsiadm -m node -s -T targetname -p ipaddress

iscsiadm -m node -s -T iqn.1992-08.com.netapp:sn.1575046996 -p 172.22.251.11
Stats for session [sid: 1, target: iqn.1992-08.com.netapp:sn.1575046996, portal: 172.22.251.11,3260]
iSCSI SNMP:
	txdata_octets: 1348576
	rxdata_octets: 35674144
	noptx_pdus: 0
	scsicmd_pdus: 15527
	tmfcmd_pdus: 0
	login_pdus: 0
	text_pdus: 0
	dataout_pdus: 0
	logout_pdus: 0
	snack_pdus: 0
	noprx_pdus: 0
	scsirsp_pdus: 15525
	tmfrsp_pdus: 0
	textrsp_pdus: 0
	datain_pdus: 15455
	logoutrsp_pdus: 0
	r2t_pdus: 0
	async_pdus: 0
	rjt_pdus: 0
	digest_err: 0
	timeout_err: 0
iSCSI Extended:
	tx_sendpage_failures: 0
	rx_discontiguous_hdr: 0
	eh_abort_cnt: 0
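
Most of these counters are noise when the session is healthy. The filter below is a rough sketch that prints only the error-style counters that are non-zero. The function name iscsi_error_counters is made up for this example, and the list of counters it watches is simply my pick from the output above, so adjust it to taste.

```shell
# Print the error-style counters from `iscsiadm -m node -s` output that are non-zero.
# The watched counter names come from the sample output; the selection is an assumption.
iscsi_error_counters() {
    awk -F': *' '
        {
            name = $1
            sub(/^[ \t]+/, "", name)    # counters are indented in the output
        }
        name ~ /^(digest_err|timeout_err|eh_abort_cnt|tx_sendpage_failures)$/ && $2 + 0 > 0 {
            print name "=" $2
        }'
}
```

With the statistics shown above it prints nothing, because every error counter is 0.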

Display list of all current sessions logged in.
iscsiadm -m session

iscsiadm -m session
tcp: [1] 172.22.251.11:3260,2000 iqn.1992-08.com.netapp:sn.1575046996

View iSCSI database regarding discovery
iscsiadm -m discovery -o show

iscsiadm -m discovery -o show
172.22.251.11:3260 via sendtargets

View iSCSI database regarding targets to log into
iscsiadm -m node -o show

iscsiadm -m node -o show
# BEGIN RECORD 6.2.0-873.2.el6
node.name = iqn.1992-08.com.netapp:sn.1575046996
node.tpgt = 2000
node.startup = automatic
node.leading_login = No
iface.hwaddress = <empty>
iface.ipaddress = <empty>
... output removed for brevity ...

View iSCSI database regarding sessions logged into
iscsiadm -m session -o show

iscsiadm -m session -o show
tcp: [1] 172.22.251.11:3260,2000 iqn.1992-08.com.netapp:sn.1575046996

Multipath Configuration
From /etc/multipath.conf

blacklist {
devnode "^hd[a-z]"
devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
devnode "^cciss.*"
}

What are my WWIDs for each LUN? I know from above that sda is my internal drive while sdc and sdd are my iSCSI targets.

scsi_id -g -u /dev/sda
3600508e0000000006f508d7a67af8d03

scsi_id -g -u /dev/sdc
360a98000375435454b24426264574552

scsi_id -g -u /dev/sdd
360a98000375435454b24426264574554

So I want to exclude /dev/sda[*]. I did it using both the WWID and a regular expression. Please note that the getuid_callout and the prio_callout/prio are different than what came in the file from Red Hat.
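
The devnode entries are regular expressions matched against the device name, so it is worth checking what a pattern really catches before trusting it. This is a small sketch that uses grep -E as a stand-in for multipath's own matching, which is an assumption on my part. The function name devnode_matches is hypothetical.

```shell
# Show which device names a blacklist devnode pattern would catch.
# Usage: devnode_matches 'PATTERN' name...
# grep -E is used as an approximation of how multipath applies the pattern.
devnode_matches() {
    pattern=$1
    shift
    for dev in "$@"; do
        if printf '%s\n' "$dev" | grep -Eq -- "$pattern"; then
            echo "$dev blacklisted"
        else
            echo "$dev allowed"
        fi
    done
}
```

For example, devnode_matches '^sd[a]$' sda sdaa sdc reports only sda as blacklisted. Without the trailing $ the same pattern would also catch sdaa.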

Because user_friendly_names is set to yes I also added aliases in the multipaths section below.

cat /etc/multipath.conf
defaults {
        max_fds                 4096
        user_friendly_names     yes
}
#}
# All data under blacklist must be specific to your system.
blacklist {
	wwid 3600508e0000000006f508d7a67af8d03
	devnode "^hd[a-z]"
	devnode "^sd[a]$"
	devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
	devnode "^cciss.*"
}
#
devices {
        device {
                vendor                  "NETAPP"
                product                 "LUN"
                #getuid_callout          "/sbin/scsi_id -g -u /block/%n"
                #prio_callout            "/sbin/mpath_prio_alua /dev/%n"
                getuid_callout		"/lib/udev/scsi_id --whitelisted --device=/dev/%n"
                prio			ontap
                features                "1 queue_if_no_path"
                hardware_handler        "0"
                path_selector           "round-robin 0"
                path_grouping_policy    multibus
                failback                immediate
                rr_weight               uniform
                rr_min_io               128
                path_checker            directio
		flush_on_last_del       yes
        }
}

multipaths {
	multipath {
		wwid 360a98000375435454b24426264574552
		alias test1
	}
	multipath {
		wwid 360a98000375435454b24426264574554
		alias test2
	}
}
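
Typing the multipaths section by hand gets old once there are more than a couple of LUNs. Here is a minimal sketch that builds the section from "alias wwid" pairs, the same layout used by /etc/multipath/bindings. The function name make_multipaths is my own and the output is only meant to be pasted into multipath.conf after review.

```shell
# Build a multipaths { } section from "alias wwid" pairs read on stdin.
# make_multipaths is a hypothetical helper, not part of multipath-tools.
make_multipaths() {
    echo "multipaths {"
    while read -r name wwid; do
        # Skip blank lines and comment lines.
        case $name in ''|'#'*) continue ;; esac
        printf '\tmultipath {\n\t\twwid %s\n\t\talias %s\n\t}\n' "$wwid" "$name"
    done
    echo "}"
}
```

Feeding it a line such as "test1 360a98000375435454b24426264574552" produces the first multipath block shown above.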

Fire up multipath.

service multipathd restart
chkconfig multipathd on
chkconfig --list multipathd
multipathd     	0:off	1:off	2:on	3:on	4:on	5:on	6:off

Now to check for multipathing.

multipath -ll
test2 (360a98000375435454b24426264574554) dm-4 NETAPP,LUN
size=100M features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=2 status=active
  `- 8:0:0:1 sdd 8:48 active ready  running
test1 (360a98000375435454b24426264574552) dm-3 NETAPP,LUN
size=100M features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=2 status=active
  `- 8:0:0:0 sdc 8:32 active ready  running
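
A quick way to read this output is to count the paths under each map. In the listing above each map has a single path, which matches the single session shown by iscsiadm -m session earlier. The sketch below does the counting; paths_per_map is a name I made up, and the patterns assume the header and path line layouts seen in this post.

```shell
# Count the paths listed under each map in `multipath -ll` output.
# Assumes a header line containing " dm-N " and one "h:c:i:l sdX" line per path.
paths_per_map() {
    awk '
        / dm-[0-9]+ / {                                   # map header line
            if (map != "") print map, n
            map = $1; n = 0
            next
        }
        /[0-9]+:[0-9]+:[0-9]+:[0-9]+ +sd[a-z]+ / { n++ }  # one line per path
        END { if (map != "") print map, n }'
}
```

Run as multipath -ll | paths_per_map. A map that reports 1 has no redundancy, however healthy it looks.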

Troubleshooting Multipath
You can run multipathd from the command line to get an interactive shell. Use ? or help to get a listing of available commands.

multipathd -k
multipathd> show config
...output removed for brevity...

multipathd> show paths
hcil    dev dev_t pri dm_st  chk_st dev_st  next_check
8:0:0:0 sdc 8:32  2   active ready  running XXXXX..... 10/20
8:0:0:1 sdd 8:48  2   active ready  running XXX....... 7/20

multipathd> show status
path checker states:
down                1
up                  2

multipathd> paths count
Paths: 2
Busy: False

multipathd> show maps
name  sysfs uuid
test1 dm-3  360a98000375435454b24426264574552
test2 dm-4  360a98000375435454b2442626457455
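
The show paths table is also easy to filter when all you care about is what is broken. This is a sketch under a few assumptions: the column order matches the output above, "ready" and "ghost" are the only checker states treated as healthy, and the function name unready_paths is my own.

```shell
# List paths from "show paths" output whose checker state is neither ready nor ghost.
# Column 2 is the device and column 6 is chk_st, as in the table above.
unready_paths() {
    awk '$1 != "hcil" && NF >= 6 && $6 != "ready" && $6 != "ghost" { print $2, $6 }'
}
```

multipathd also accepts the command directly after -k, so multipathd -k"show paths" | unready_paths should work without entering the interactive shell.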

Here is a list of very helpful articles that cover similar commands.
https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html-single/DM_Multipath/#mpio_configfile
http://kbase.redhat.com/faq/docs/DOC-6388
https://library.netapp.com/ecmdocs/ECMP1217221/html/GUID-195E70DE-4E57-411D-BDE2-36AE635AFDBC.html
http://linux.netapp.com/docs/debian/iscsi-multipath-configuration-guide
https://library.netapp.com/ecm/ecm_get_file/ECMP1217221

rhel5-iscsi-HOWTO.pdf

Posted in Linux, Uncategorized | Leave a comment

Fighting Multipath

2013-11-19 08:44:59

[root@chevelle ~]# cat /etc/multipath/bindings
# Multipath bindings, Version : 1.0
# NOTE: this file is automatically maintained by the multipath program.
# You should not need to edit this file in normal circumstances.
#
# Format:
# alias wwid
#
mpath0 36a4badb021d20600133389a784a85226
mpath1 36a4badb000291e140000064f3aa78999
mpath2 36a4badb0002b75c6000006334bea77a3
mpath3 36a4badb000291e14000006983aa7b02e
mpath4 36a4badb0002b75c6000006364bea78a5
mpath5 36a4badb021d32c00132d9598938212dc
mpath6 36a4badb000291e1400001704411a6737
mpath7 36a4badb000291e1400001700411a636d
mpath8 36a4badb000291e1400001706411a67a7
mpath9 36a4badb000291e1400001702411a6421
[root@chevelle ~]# multipath -ll
mpath9 (36a4badb000291e1400001702411a6421) dm-4 DELL,MD3000
[size=136G][features=3 queue_if_no_path pg_init_retries 50][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=100][active]
\_ 2:0:0:3 sdj 8:144 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:0:3 sde 8:64 [active][ghost]
mpath8 (36a4badb000291e1400001706411a67a7) dm-3 DELL,MD3000
[size=136G][features=3 queue_if_no_path pg_init_retries 50][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=100][active]
\_ 2:0:0:2 sdi 8:128 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:0:2 sdd 8:48 [active][ghost]
mpath7 (36a4badb000291e1400001700411a636d) dm-2 DELL,MD3000
[size=10M][features=3 queue_if_no_path pg_init_retries 50][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=100][active]
\_ 2:0:0:1 sdh 8:112 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:0:1 sdc 8:32 [active][ghost]
mpath6 (36a4badb000291e1400001704411a6737) dm-1 DELL,MD3000
[size=10M][features=3 queue_if_no_path pg_init_retries 50][hwhandler=1 rdac][rw]
\_ round-robin 0 [prio=100][active]
\_ 2:0:0:0 sdg 8:96 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:0:0 sdb 8:16 [active][ghost]
[root@chevelle ~]# man mkqdisk
[root@chevelle ~]# mkqdisk -L
mkqdisk v0.6.0
/dev/dm-5:
/dev/mapper/mpath6p1:
/dev/mpath/mpath6p1:
Magic: eb7a62c2
Label: qdisk
Created: Thu Jun 3 18:40:33 2010
Host: chevelle
Kernel Sector Size: 512
Recorded Sector Size: 512

[root@chevelle ~]# fdisk -l

Disk /dev/sda: 146.1 GB, 146163105792 bytes
255 heads, 63 sectors/track, 17769 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sda1 * 1 128 1020127+ 83 Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2 128 651 4200997+ 82 Linux swap / Solaris
Partition 2 does not end on cylinder boundary.
/dev/sda3 651 17769 137500335 8e Linux LVM

Disk /dev/sdf: 20 MB, 20971520 bytes
64 heads, 32 sectors/track, 20 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes

Disk /dev/sdf doesn't contain a valid partition table

Disk /dev/sdg: 10 MB, 10485760 bytes
255 heads, 63 sectors/track, 1 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdg1 1 1 8001 83 Linux

Disk /dev/sdh: 10 MB, 10485760 bytes
64 heads, 32 sectors/track, 10 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes

Disk /dev/sdh doesn't contain a valid partition table

Disk /dev/sdi: 146.2 GB, 146267963392 bytes
255 heads, 63 sectors/track, 17782 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdi doesn't contain a valid partition table

Disk /dev/sdj: 146.2 GB, 146267963392 bytes
255 heads, 63 sectors/track, 17782 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdj doesn't contain a valid partition table

Disk /dev/sdk: 20 MB, 20971520 bytes
64 heads, 32 sectors/track, 20 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes

Disk /dev/sdk doesn't contain a valid partition table

Disk /dev/dm-1: 10 MB, 10485760 bytes
255 heads, 63 sectors/track, 1 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/dm-1p1 1 1 8001 83 Linux

Disk /dev/dm-2: 10 MB, 10485760 bytes
255 heads, 63 sectors/track, 1 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/dm-2 doesn't contain a valid partition table

Disk /dev/dm-3: 146.2 GB, 146267963392 bytes
255 heads, 63 sectors/track, 17782 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/dm-3 doesn't contain a valid partition table

Disk /dev/dm-4: 146.2 GB, 146267963392 bytes
255 heads, 63 sectors/track, 17782 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/dm-4 doesn't contain a valid partition table

Disk /dev/dm-5: 8 MB, 8193024 bytes
255 heads, 63 sectors/track, 0 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/dm-5 doesn't contain a valid partition table
[root@chevelle ~]# mount
/dev/mapper/vg00-root on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/mapper/vg00-var on /var type ext3 (rw)
/dev/mapper/vg00-usr on /usr type ext3 (rw)
/dev/mapper/vg00-usrlocal on /usr/local type ext3 (rw)
/dev/mapper/vg00-home on /home type ext3 (rw)
/dev/mapper/vg00-opt on /opt type ext3 (rw)
/dev/mapper/vg00-tmp on /tmp type ext3 (rw)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
/dev/mapper/vg00-lvpatrol on /patrol type ext3 (rw)
/dev/mapper/vg00-clusterlv on /Cluster_Scripts type ext3 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
//172.22.225.73/nwapex/data on /opt/nwapex type cifs (rw,mand)
//172.22.100.127/mirth on /opt/mirth type cifs (rw,mand)
//172.22.111.87/data on /opt/bn1 type cifs (rw,mand)
//172.22.225.130/kronos/InterfaceDesigner/Interface Source Files on /opt/kronos type cifs (rw,mand)
//172.22.41.201/Company on /opt/proscript type cifs (rw,mand)
//172.22.100.244/ASD on /opt/murphy type cifs (rw,mand)
//172.22.100.252/StarData on /opt/epsi type cifs (rw,mand)
nfsd on /proc/fs/nfsd type nfsd (rw)
none on /sys/kernel/config type configfs (rw)
/dev/mapper/hbovg-hbo on /hbo type ext3 (rw)
/dev/mapper/hbovg-hboc on /hboc type ext3 (rw)
/dev/mapper/hbovg-mis on /mis type ext3 (rw)
/dev/mapper/hbovg-temphbo on /temphbo type ext3 (rw)
[root@chevelle ~]# ls -l /dev/disk/by-uuid
total 0
lrwxrwxrwx 1 root root 10 Nov 13 03:47 8ebe73bc-0939-401d-b3e4-1d193e433abe -> ../../sda1
[root@chevelle ~]# ls -l /dev/disk/by-
by-id/ by-label/ by-path/ by-uuid/
[root@chevelle ~]# ls -l /dev/disk/by-path/
total 0
lrwxrwxrwx 1 root root 9 Nov 13 03:47 pci-0000:00:1f.2-scsi-0:0:0:0 -> ../../sr0
lrwxrwxrwx 1 root root 9 Nov 13 03:47 pci-0000:03:00.0-scsi-0:2:0:0 -> ../../sda
lrwxrwxrwx 1 root root 10 Nov 13 03:47 pci-0000:03:00.0-scsi-0:2:0:0-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Nov 13 03:47 pci-0000:03:00.0-scsi-0:2:0:0-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 Nov 13 03:47 pci-0000:03:00.0-scsi-0:2:0:0-part3 -> ../../sda3
lrwxrwxrwx 1 root root 9 Nov 13 03:47 pci-0000:08:08.0-sas-0x50026b9139522b00:4:0-0x5a4badb42b75c60c:0 -> ../../sdc
lrwxrwxrwx 1 root root 9 Nov 13 03:47 pci-0000:0a:08.0-sas-0x50026b9139525800:4:0-0x5a4badb4291e140c:0 -> ../../sdg
lrwxrwxrwx 1 root root 10 Nov 13 03:47 pci-0000:0a:08.0-sas-0x50026b9139525800:4:0-0x5a4badb4291e140c:0-part1 -> ../../sdg1
[root@chevelle ~]# ls -l /dev/disk/by-
by-id/ by-label/ by-path/ by-uuid/
[root@chevelle ~]# ls -l /dev/disk/by-label/
total 0
lrwxrwxrwx 1 root root 10 Nov 13 03:47 boot -> ../../sda1
[root@chevelle ~]# ls -l /dev/disk/by-id/
total 0
lrwxrwxrwx 1 root root 9 Nov 13 03:47 scsi-36a4badb000291e1400001700411a636d -> ../../sdc
lrwxrwxrwx 1 root root 9 Nov 13 03:47 scsi-36a4badb000291e1400001702411a6421 -> ../../sde
lrwxrwxrwx 1 root root 9 Nov 13 03:47 scsi-36a4badb000291e1400001704411a6737 -> ../../sdb
lrwxrwxrwx 1 root root 10 Nov 13 03:47 scsi-36a4badb000291e1400001704411a6737-part1 -> ../../sdg1
lrwxrwxrwx 1 root root 9 Nov 13 03:47 scsi-36a4badb000291e1400001706411a67a7 -> ../../sdd
lrwxrwxrwx 1 root root 9 Nov 13 03:47 scsi-36a4badb0002b75c6000015fa525d18aa -> ../../sdf
lrwxrwxrwx 1 root root 9 Nov 13 03:47 scsi-36a4badb021d32c00132d9598938212dc -> ../../sda
lrwxrwxrwx 1 root root 10 Nov 13 03:47 scsi-36a4badb021d32c00132d9598938212dc-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Nov 13 03:47 scsi-36a4badb021d32c00132d9598938212dc-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 Nov 13 03:47 scsi-36a4badb021d32c00132d9598938212dc-part3 -> ../../sda3
[root@chevelle ~]# ls -l /dev/disk/by-label/
total 0
lrwxrwxrwx 1 root root 10 Nov 13 03:47 boot -> ../../sda1
[root@chevelle ~]# ls -l /dev/disk/by-path/
total 0
lrwxrwxrwx 1 root root 9 Nov 13 03:47 pci-0000:00:1f.2-scsi-0:0:0:0 -> ../../sr0
lrwxrwxrwx 1 root root 9 Nov 13 03:47 pci-0000:03:00.0-scsi-0:2:0:0 -> ../../sda
lrwxrwxrwx 1 root root 10 Nov 13 03:47 pci-0000:03:00.0-scsi-0:2:0:0-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Nov 13 03:47 pci-0000:03:00.0-scsi-0:2:0:0-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 Nov 13 03:47 pci-0000:03:00.0-scsi-0:2:0:0-part3 -> ../../sda3
lrwxrwxrwx 1 root root 9 Nov 13 03:47 pci-0000:08:08.0-sas-0x50026b9139522b00:4:0-0x5a4badb42b75c60c:0 -> ../../sdc
lrwxrwxrwx 1 root root 9 Nov 13 03:47 pci-0000:0a:08.0-sas-0x50026b9139525800:4:0-0x5a4badb4291e140c:0 -> ../../sdg
lrwxrwxrwx 1 root root 10 Nov 13 03:47 pci-0000:0a:08.0-sas-0x50026b9139525800:4:0-0x5a4badb4291e140c:0-part1 -> ../../sdg1
[root@chevelle ~]# ls -l /dev/disk/by-uuid/
total 0
lrwxrwxrwx 1 root root 10 Nov 13 03:47 8ebe73bc-0939-401d-b3e4-1d193e433abe -> ../../sda1
[root@chevelle ~]# ls /dev/mapper/
control hbovg-h1686n1.v01a hbovg-h1686n1.v06a hbovg-h1686n1.v11a hbovg-hbo mpath6p1 vg00-home vg00-usr
hbovg-h1686n1.bila hbovg-h1686n1.v02a hbovg-h1686n1.v07a hbovg-h1686n1.v12a hbovg-hboc mpath7 vg00-lvpatrol vg00-usrlocal
hbovg-h1686n1.jn1a hbovg-h1686n1.v03a hbovg-h1686n1.v08a hbovg-h1686n1.v13a hbovg-mis mpath8 vg00-opt vg00-var
hbovg-h1686n1.jn2a hbovg-h1686n1.v04a hbovg-h1686n1.v09a hbovg-h1686n1.v14a hbovg-temphbo mpath9 vg00-root
hbovg-h1686n1.v00a hbovg-h1686n1.v05a hbovg-h1686n1.v10a hbovg-h1686n1.v15a mpath6 vg00-clusterlv vg00-tmp
[root@chevelle ~]# blkid
/dev/mapper/vg00-tmp: LABEL="/tmp" UUID="7f389f25-cd20-4b24-ac68-04e9af0ebd04" TYPE="ext3"
/dev/mapper/vg00-opt: LABEL="/opt" UUID="97e5e00c-0ac4-4821-8248-1cba50920e9b" TYPE="ext3"
/dev/mapper/vg00-home: LABEL="/home" UUID="849f2a65-05a6-42b1-af8c-7ead1b33fb9f" TYPE="ext3"
/dev/mapper/vg00-usrlocal: LABEL="/usr/local" UUID="d0c46c4c-2a2c-4c4b-a1b8-d8a83498b5d9" TYPE="ext3"
/dev/mapper/vg00-usr: LABEL="/usr" UUID="64887754-c0bc-442b-9b48-f785aa5a0c5c" TYPE="ext3"
/dev/mapper/vg00-var: LABEL="/var" UUID="fed3d412-6b77-4014-b8a9-17471c922399" TYPE="ext3"
/dev/mapper/vg00-root: LABEL="/" UUID="19e1edd4-ac66-4c2e-8c26-2a8555539b65" TYPE="ext3"
/dev/sda2: TYPE="swap"
/dev/sda1: LABEL="/boot" UUID="8ebe73bc-0939-401d-b3e4-1d193e433abe" TYPE="ext3"
/dev/vg00/root: UUID="19e1edd4-ac66-4c2e-8c26-2a8555539b65" TYPE="ext3" LABEL="/"
/dev/scd0: LABEL="MD3000_2.2.0.17" TYPE="iso9660"
/dev/mapper/vg00-clusterlv: UUID="485aea89-b699-49d5-8d87-bae50f80c9e7" TYPE="ext3"
/dev/mapper/vg00-lvpatrol: LABEL="/patrol" UUID="86ff22dd-d895-4a2f-beca-f3cc8b5e7bd0" TYPE="ext3"
/dev/dvd: LABEL="MD3000_2.2.0.17" TYPE="iso9660"
/dev/sr0: LABEL="MD3000_2.2.0.17" TYPE="iso9660"
/dev/mapper/hbovg-hbo: UUID="cf90e0da-30b7-41b6-a73d-5e39bdddd013" TYPE="ext3"
/dev/mapper/hbovg-hboc: UUID="43cade5f-ab81-4a7d-b134-ad0917c999e3" TYPE="ext3"
/dev/mapper/hbovg-mis: UUID="14265200-9022-496b-bc6f-8ae3e00c3f13" TYPE="ext3"
/dev/mapper/hbovg-temphbo: UUID="8f1af44c-2207-49ed-91f9-c44283794713" TYPE="ext3"
[root@chevelle ~]#

Posted in Linux | Leave a comment

Upgrading ISC Bind and DHCP

Our secondary DNS and DHCP server died last Sunday. Apart from a few people noticing that some services on the network were slower, it was a non-event, and that is a good thing. Rather than just restoring the old server, we decided to upgrade the OS to the latest version of Red Hat, and DHCP and DNS to whatever versions were supported on that Red Hat release. I realize that is the easy way out, but we used to run a hand-compiled version and I just did not see the advantage. I am going to take the time to document the upgrade process for those planning their own upgrade.

After NS2 died, the primary DHCP server started to run out of leases because the peer held all of the free leases, so we told the primary that its peer was down. Make sure you have an omapi port defined in your dhcpd.conf file:

# This is for omshell
omapi-port 7911

From this site we got the basics for the following script:

omshell << EOF
connect
new failover-state
set name = "dhcp-failover"
open
set local-state = 2
update
EOF

Here are the options for setting the failover state in omshell:

 
/* A failover peer's running state. */
enum failover_state {
unknown_state			=  0, /* XXX: Not a standard state. */
startup				=  1,
normal				=  2,
communications_interrupted	=  3,
partner_down			=  4,
potential_conflict		=  5,
recover				=  6,
paused				=  7,
shut_down			=  8,
recover_done			=  9,
resolution_interrupted		= 10,
conflict_done			= 11,
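
Remembering the numbers is error prone, so a lookup by name helps. The sketch below maps the state names from the enum above to their values and prints an omshell script like the one earlier in this post. Both function names are hypothetical, the values are copied from the listing above, and they should be checked against the dhcpd version in use before anything is fed to omshell.

```shell
# Map a failover state name (as spelled in the enum above) to its number.
failover_state_value() {
    case $1 in
        startup)                    echo 1 ;;
        normal)                     echo 2 ;;
        communications_interrupted) echo 3 ;;
        partner_down)               echo 4 ;;
        potential_conflict)         echo 5 ;;
        recover)                    echo 6 ;;
        paused)                     echo 7 ;;
        shut_down)                  echo 8 ;;
        recover_done)               echo 9 ;;
        resolution_interrupted)     echo 10 ;;
        conflict_done)              echo 11 ;;
        *) echo "unknown state: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) an omshell script that sets the local failover state.
# $1 = failover peer name from dhcpd.conf, $2 = state name
omshell_failover_script() {
    value=$(failover_state_value "$2") || return 1
    cat <<EOF
connect
new failover-state
set name = "$1"
open
set local-state = $value
update
EOF
}
```

Something like omshell_failover_script dhcp-failover partner_down prints the commands for review; piping the result into omshell is left as a deliberate second step.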

Here are all of the DNS/DHCP servers we built for the upgrade:
NS1 — Primary server that needed to be upgraded, physical machine.
NS2 — Secondary server, DOA physical machine.
NS3 — Temporary secondary server, virtual machine.
NS4 — New primary DNS/DHCP server, physical machine.
NS5 — New test primary DNS/DHCP server, virtual machine.
NS6 — New test secondary DNS/DHCP server, virtual machine.

The plan was to test the upgrade process on NS5 and NS6 while one of the other team members built NS4. This may look like overkill, but let me explain the rationale behind each server. After the failure of NS2, the first thing we did was stand up a third DNS server, NS3, as a new secondary so that we had a live copy of all of our zones should something happen to our primary DNS. Initially we turned on DHCP for this server as well, but because the versions of the failover protocol differed between the servers, we just left DNS running. The failover protocols in versions 3.0 and 3.1 are different enough that they are not compatible. This server was not actually being queried by end users but was there as a failsafe should we need one. It has been left running as an immediate option for the future.

Once we had a secondary server that would maintain current state, we started building servers for the upgrade process. NS4 is a physical machine and would eventually become the new primary DNS/DHCP server. When it was brought online it was first a secondary server to NS1 so that it had a complete DNS database; it was then promoted to be the new NS1. Because a physical machine takes so much longer to build, we quickly spun up NS5 and NS6 as test servers. The plan was to test on NS5 and NS6, promote NS6 to be the new NS2 and convert NS4 to the new NS1. The reason we didn’t just build new servers and move to them was that we did not want to have to change the IP helper addresses throughout our network.

Here is a step-by-step outline of the actual go live.

1. Build NS4 as secondary DNS server to NS1 so that it has a copy of the DNS database and we don’t have to copy files from NS1.

2. Secure shell into each of the servers to be worked on during this time.
ssh into ns1 on the backup NIC.
ssh into ns2 on the backup NIC.
ssh into ns4 on the backup NIC.
We have a dedicated network for backup traffic. I connected through the backup NIC so that I could manipulate the primary addresses without losing connectivity to the servers.

3. Stop DHCP on NS1 and copy the lease database to the other servers.
service dhcpd stop
scp /var/state/dhcp/dhcpd.ad.leases root@ns2.chainringcircus.org:/var/state/dhcp/dhcpd.leases
scp /var/state/dhcp/dhcpd.ad.leases root@ns4.chainringcircus.org:/var/state/dhcp/dhcpd.leases

4. Shut the interfaces on NS1 before taking down DNS.
ifconfig eth0 down
ifconfig eth1 down

5. Start DHCP on NS2 so that we don’t have too many problems.
service dhcpd start

6. Shut down DNS on NS1.
rndc freeze    # make sure there are no .jnl files left
service named stop

7. Convert NS4 to NS1. We left the old NS1 up for now in case we needed to copy files or bring that server back online.
Change /etc/sysconfig/network to be ns1.chainringcircus.org

Change the addresses from NS4 to NS1
cp ~/DNS-Primary/ifcfg-eth0 /etc/sysconfig/network-scripts/
cp ~/DNS-Primary/ifcfg-eth1 /etc/sysconfig/network-scripts/
cp ~/DNS-Primary/named.conf.primary /etc/named.conf

8. We tested to make sure everything was running correctly and then rebooted the new NS1 to make sure it came up correctly.
shutdown -r now

9. Shut down the old NS1 for the last time.
shutdown -h now
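
The check for leftover .jnl files after rndc freeze is easy to forget in the middle of a cutover, so it can be scripted. This is a minimal sketch; the function name leftover_journals is made up, and the zone directory passed to it (for example /var/named) is an assumption that depends on how named is configured.

```shell
# List BIND journal files (*.jnl) under a directory.
# Returns 0 when none are found, 1 (and prints them) when some remain.
leftover_journals() {
    dir=$1
    found=$(find "$dir" -type f -name '*.jnl' 2>/dev/null)
    if [ -n "$found" ]; then
        printf '%s\n' "$found"
        return 1
    fi
    return 0
}
```

Used as leftover_journals /var/named && service named stop, named is only stopped once no journal files remain.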

 

Posted in Linux | Leave a comment