Virtual Switching System (VSS)

This post was tedious to format, which is one of the reasons I put off posting it. I took the notes months ago but was wary of posting them because of the time the formatting would take. As a result this post could have been better; I just dreaded working on it. Either I find a new editor or this will be the last post of this variety, with unordered lists and line items used to make bulleted points.

A few months ago we replaced our cores with a pair of 6509Es. The night of go live we had trouble because of some decisions we made at the last minute, and these notes saved us. I hope you find them as useful as we did. These are my notes from the design guide.

Because this post turned out so long, I put what most people will want to see, the configuration, at the top. My notes from the configuration guide follow.

Configuration

1. Define the domain ID.

VSS(config)# switch virtual domain 100

2. Set the switch number:

VSS(config-vs-domain)# switch 1

2a. Have the switches use virtual MAC addresses:

VSS(config-vs-domain)# mac-address use-virtual

2b. Check that MAC out-of-band (OOB) sync is active and set to 480 seconds:

VSS# sh mac-address-table synchronize statistics ! show stats for OOB

3. Configure VSL port-channel

VSS(config-vs-domain)# exit

3a.
Standalone SW1:

no hw-module 1 oversubscription
no hw-module 2 oversubscription
int po1
 switch virtual link 1
int range t1/1, t2/1
 channel-gr 1 mode on

Standalone SW2:

no hw-module 1 oversubscription
no hw-module 2 oversubscription
int po2
 switch virtual link 2
int range t1/1, t2/1
 channel-gr 2 mode on
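
The two standalone blocks above differ only in the switch number, so they are easy to generate rather than type twice. The following is a minimal sketch in Python; the function name and its parameters are hypothetical, and it only emits commands that already appear in the notes above (with channel-group spelled out in full).

```python
def render_vsl_config(switch_number: int,
                      members=("t1/1", "t2/1"),
                      modules=(1, 2)) -> str:
    """Build the standalone-mode VSL commands for one chassis.

    Hypothetical helper: it only templates the commands listed in the
    notes, it does not talk to a switch.
    """
    # Disable oversubscription on each listed module, as in the notes.
    lines = [f"no hw-module {m} oversubscription" for m in modules]
    lines += [
        f"int po{switch_number}",
        f" switch virtual link {switch_number}",
        f"int range {', '.join(members)}",
        f" channel-group {switch_number} mode on",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    for n in (1, 2):
        print(f"! Standalone SW{n}")
        print(render_vsl_config(n))
```

Generating both blocks from one template keeps the port-channel number, the VSL number and the channel-group number in step on each chassis.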

4. Convert to VSS mode:

VSS# switch convert mode virtual
(Switch will ask to reload)
(Reload)

5. This step is needed only for the first-time conversion. It merges only the VSL-related configuration, and they say you MUST execute this command:

VSS# switch accept mode virtual

6. Configure fast-hello for dual-active detection. (p.4-29)

! Enable fast-hello under VSS global config.
VSS(config)# switch virtual domain 100
VSS(config-vs-domain)# dual-active detection fast-hello

! Enable fast-hello at the interface level
VSS(config)# int gi1/5/3
VSS(config-if)# dual-active fast-hello

VSS(config)# int gi2/5/3
VSS(config-if)# dual-active fast-hello

! Confirm fast-hello
VSS# sh switch virtual dual-active fast-hello
VSS# remote command standby-rp show switch virtual dual-active fast-hello

Commands
These are some commands that I kept for handy reference.

sh vslp lmp neighbor

VSS#sh vsl lmp nei

Instance #2:


  LMP neighbors

    Peer Group info:        # Groups: 1         (* => Preferred PG)

PG #    MAC             Switch  Ctrl Interface  Interfaces
---------------------------------------------------------------
*1      9999.aaaa.0000  1       Te2/5/4         Te2/5/4, Te2/5/5

sh switch virtual role

VSS#sh switch virtual role

Switch  Switch Status  Priority     Role    Session   ID
        Number         Oper(Conf)           Local    Remote
------------------------------------------------------------------
LOCAL    2     UP       100(100)    ACTIVE   0        0   
REMOTE   1     UP       100(100)    STANDBY  9111     9273
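
If you collect this output in a script, the two data rows are easy to check automatically. This is a rough sketch in Python that assumes the column layout shown above; the function names and the notion of "healthy" (both members UP, one ACTIVE and one STANDBY) are my own assumptions for illustration, not anything defined by IOS.

```python
import re

# Matches the LOCAL/REMOTE rows of "show switch virtual role".
ROW = re.compile(
    r"^(LOCAL|REMOTE)\s+(\d+)\s+(\S+)\s+(\d+)\((\d+)\)\s+(\S+)",
    re.MULTILINE,
)


def parse_roles(output: str) -> dict:
    """Return {'LOCAL': {...}, 'REMOTE': {...}} parsed from the output."""
    roles = {}
    for where, number, status, oper, conf, role in ROW.findall(output):
        roles[where] = {
            "switch": int(number),
            "status": status,
            "priority": int(oper),
            "configured_priority": int(conf),
            "role": role,
        }
    return roles


def looks_healthy(roles: dict) -> bool:
    """True when both members are UP with one ACTIVE and one STANDBY."""
    if set(roles) != {"LOCAL", "REMOTE"}:
        return False
    all_up = all(r["status"] == "UP" for r in roles.values())
    return all_up and {r["role"] for r in roles.values()} == {"ACTIVE", "STANDBY"}


SAMPLE = """\
LOCAL    2     UP       100(100)    ACTIVE   0        0
REMOTE   1     UP       100(100)    STANDBY  9111     9273
"""

if __name__ == "__main__":
    print(looks_healthy(parse_roles(SAMPLE)))
```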

sh int vsl

VSS#sh int vsl

VSL Port-channel: Po1  
 Port: Te1/5/4
 Port: Te1/5/5

VSL Port-channel: Po2  
 Port: Te2/5/4
 Port: Te2/5/5

sh switch virtual

VSS#sh switch virtual            
Switch mode                  : Virtual Switch
Virtual switch domain number : 100
Local switch number          : 2
Local switch operational role: Virtual Switch Active
Peer switch number           : 1
Peer switch operational role : Virtual Switch Standby

sh switch virtual redundancy
VSS#sh switch virtual redundancy 
                  My Switch Id = 2
                Peer Switch Id = 1
        Last switchover reason = none
    Configured Redundancy Mode = sso
     Operating Redundancy Mode = sso

Switch 2 Slot 5 Processor Information :
-----------------------------------------------
        Current Software state = ACTIVE
       Uptime in current state = 14 weeks, 4 days, 14 hours, 34 minutes
                 Image Version = Cisco IOS Software, s72033_rp Software (s72033_rp-IPSERVICESK9_WAN-M), Version 12.2(33)SXJ1, RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2011 by Cisco Systems, Inc.
Compiled Wed 22-Jun-11 18:03 by prod_rel_team
                          BOOT = sup-bootdisk:s72033-ipservicesk9_wan-mz.122-33.SXJ1.bin,12;
        Configuration register = 0x2102
                  Fabric State = ACTIVE
           Control Plane State = ACTIVE

Switch 1 Slot 5 Processor Information :
-----------------------------------------------
        Current Software state = STANDBY HOT (switchover target)
       Uptime in current state = 14 weeks, 4 days, 14 hours, 31 minutes
                 Image Version = Cisco IOS Software, s72033_rp Software (s72033_rp-IPSERVICESK9_WAN-M), Version 12.2(33)SXJ1, RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2011 by Cisco Systems, Inc.
Compiled Wed 22-Jun-11 18:03 by prod_rel_team
                          BOOT = sup-bootdisk:s72033-ipservicesk9_wan-mz.122-33.SXJ1.bin,12;
        Configuration register = 0x2102
                  Fabric State = ACTIVE
           Control Plane State = STANDBY

sh vsl rrp summ

VSS#sh vsl rrp summ
 RRP Summary:
------------------------------------------------------------------------
RRP information for Instance 2

--------------------------------------------------------------------
Valid  Flags   Peer      Preferred  Reserved
               Count     Peer       Peer

--------------------------------------------------------------------
TRUE    V        1           1          1

        Peer  Valid  Switch Status  Priority   Role    Local   Remote
Switch  Group        Number         Oper(Conf)         SID     SID
---------------------------------------------------------------------
Local    0     TRUE    2      UP     100(100)  ACTIVE   0       0   
Remote   1     TRUE    1      UP     100(100)  STANDBY  9111    9273

Peer 0 represents the local switch

Flags : V - Valid 

sh mls cef

 
VSS#sh mls cef
Codes: decap - Decapsulation, + - Push Label
Index  Prefix              Adjacency             
64     0.0.0.0/32          receive
! Removed for brevity

sh mac-address-table synchronize statistics

VSS#sh mac-address-table synchronize statistics 

MAC Entry Out-of-band Synchronization Feature Statistics:
---------------------------------------------------------

    Switch [1] Module [1]
    -----------------------

    Module Status:
Statistics collected from Switch/Module             :  1/1
Number of L2 asics in this module                   :  1

! Removed for brevity.

sh switch virtual redundancy mismatch

VSS#sh switch virtual redundancy mismatch 

No Config Mismatch between Active and Standby switches 

redundancy reload peer

! Reload a switch from RPR mode to hot-standby
VSS#redundancy reload peer
! Did not get output.

Configuration Guide Notes Below

Virtual Switch Member Boot-up Behavior

  • Diagnostics
  • VSL Link Initialization
  • LMP Establishment
  • Role negotiation through RRP

Link Management Protocol (LMP)

  • Establishes and verifies bidirectional communication during startup and normal operation
  • Exchanges switch IDs
  • Sends hello packets to monitor health of VSL and peer

Role Resolution Protocol (RRP)

  • Determines the operational status of each switch member.

Virtual Switch Link (VSL)
Each member link must be configured in unconditional EtherChannel mode:
channel-group 12 mode on

Stateful Switch Over (SSO)

  • Enables supervisor redundancy in a standalone 6000, keeping the backup
    supervisor up to date.
  • State 13-Active: in the active state the supervisor is responsible for
    forwarding and managing the control plane. It manages control plane
    functions and synchronizes the configuration and the protocols.
  • State 8-Standby: the supervisor is synchronized with the active. This is the
    final state of the hot-standby supervisor.
  • SSO is the core of VSS; however, VSS is a dual-forwarding solution while the
    control plane is managed by one supervisor.

Virtual Switch Priority

  • The first to boot will become active.
  • If simultaneous boot, lowest switch ID becomes active.
  • Highest priority wins, except that the highest-priority switch will not take over from a switch that is already active unless preemption is enabled.
  • Default priority is 100.
  • Switch preemption should not be taken lightly.
    • It forces multiple reboots of the VSS member.
    • Cisco recommends _not_ configuring preemption.
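
The rules above can be sketched as a small decision function. This is only an illustration in Python of how the notes read, not the RRP implementation: the Member class, the boot_time field, and the ordering of priority before switch ID on a simultaneous boot are assumptions.

```python
from dataclasses import dataclass

DEFAULT_PRIORITY = 100  # default priority from the notes


@dataclass
class Member:
    switch_id: int
    priority: int = DEFAULT_PRIORITY
    boot_time: float = 0.0  # hypothetical: seconds on a shared clock
    preempt: bool = False


def elect_active(a: Member, b: Member) -> Member:
    """Return the member that ends up active under the rules in the notes."""
    if a.boot_time != b.boot_time:
        first, later = (a, b) if a.boot_time < b.boot_time else (b, a)
        # The first switch to boot is active. A later switch with a higher
        # priority only takes over if preemption is enabled on it.
        if later.preempt and later.priority > first.priority:
            return later
        return first
    # Simultaneous boot (assumed order): highest priority wins,
    # lowest switch ID breaks the tie.
    if a.priority != b.priority:
        return a if a.priority > b.priority else b
    return a if a.switch_id < b.switch_id else b
```

With both members left at the default priority and booted together, switch 1 comes out active. A switch configured with priority 200 that boots second stays standby unless preempt is set, which is the case the notes warn forces extra reboots.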

Multi-chassis Etherchannel (MEC)

  • Preferred connectivity method using VSS.
  • Extends EtherChannel from multiple ports on one switch to multiple ports across two chassis.
  • Access-layer switches are configured with traditional etherchannel.
  • VSS with MEC is loop-free.

MEC Configuration

  • Do not explicitly create layer-2 MEC from the CLI, allow IOS to generate the interface.
  • Create a layer-3 MEC explicitly and associate the port-channel group under each member interface.
  • This syslog configuration is recommended on VSS MEC interfaces:
    	int po20
    	 logging event link-status
    	 logging event spanning-tree status
  • These hidden commands are now available in 12.2(33)SXH1:
    	remote command switch test EtherChannel load-balance interface po 1 ip 1.1.1.1 2.2.2.2
    	show EtherChannel load-balance hash-result interface port-channel 2 205 ip 10.120.7.65 vlan 5 10.121.100.49

MAC Addresses

  • MAC address allocation is derived from the backplane EEPROM on each chassis, so a VSS instance has two pools. The VSS MAC address pool is determined by RRP. MAC address allocation does not change during a switchover event; however, without the mac-address use-virtual command, the MAC addresses will change if both switches reboot. The virtual MAC avoids gratuitous ARPs.
  • When upgrading, the change of MAC address for the default gateway can cause problems for hosts that are not capable of updating their default gateway ARP entry. It is typically cached for four hours.
  • MAC Out-of-Band Sync (OOB)
  • MAC addresses normally age out in a single-chassis environment.
  • Depending upon the EtherChannel hash, MAC addresses have the chance to age out because they are not updated.
  • MAC OOB is designed to synchronize MAC addresses in all line cards of the VSS over the VSL.
  • In VSS, the trunk mode of a port-channel interface (desirable or undesirable) does not act the same as in standalone mode. When a member link is brought online it is not a separate negotiation; it is an addition to the MEC. (p.3-25)

PAgP

  • The active switch is responsible for origination and termination of PAgP control plane traffic.
  • The same device ID is sent by both VSS switches so the end device assumes a single logical device.
  • Cisco recommends that PAgP neighbors be in desirable-desirable mode with the silent sub-option.

LACP (p. 2-37)

  • In VSS it works for both layer-2 and layer-3 interfaces.
  • The recommended mode for LACP neighbors is Active-Active
  • During the EtherChannel bundling process LACP performs a configuration consistency check on each link trying to become a port-channel member.
  • If a port does not pass it is placed in a “lettered” system bundle.
  • The first etherchannel bundle contains the ports that passed the configuration check.
  • The second “lettered” bundle includes the ports that did not pass the configuration check.
  • Avoid using the min-links LACP command.
  • Avoid LACP fast-hello in VSS.
    • During failover and recovery the VSS might not be able to recover before the remote end declares the VSS down, which is a false positive.
    • Fast-hello is sent per link, which can overrun a switch CPU in large deployments.

6500-VSS# show etherchannel 20 summary | inc Gi
Po20(SU)	LACP	Gi2/1(P)
Po20B(SU)	LACP	Gi2/2(P) ! Bundled in separate system-generated port-channel interface

Implementation Notes

It is recommended to have one port from the supervisor and one from a line card; however, they have different queue structures and the EtherChannel bundle would fail. To fix this, configure:

no mls qos channel-consistency

The Sup720-10G uplink port can be configured in one of two modes:

  • Default, Non-10g-only mode:
    • All supervisor ports have the same CoS queuing mode if any 10G port is used for VSL. VSL only allows CoS-based queuing.
  • Non-blocking, 10g-only mode:
    • All 1G ports are disabled, the entire module operates in non-blocking mode. 12.2(33)SXI allows non-VSL 10G ports to be DSCP based.

Resilient VSL Design Options (p2-18 thru 2-20)

  • Use the two 10G ports on the Sup720-10G supervisor.
    • Most common, does not provide optimal hardware diversity.
  • Use one 10G port on the Sup720-10G and another from a VSL capable line card.
    • Best for balancing cost and redundancy.
  • Use 10G ports on two separate VSL capable line cards.
    • Best option for flexibility but not as cost effective.

EtherChannel
Etherchannel is the fundamental building block of VSS. Traditionally load
sharing and failure are governed by STP, FHRP and topology (looped and
non-looped). In VSS Etherchannel replaces all three.

  • The etherchannel hash algorithm becomes more important to get right in VSS.
  • Layer-4 hashing is more random than layer-3 hashing.
  • Layer-2 hashing is not as efficient when all hosts are sending to a default
    gateway.
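
To see why the choice of hash input matters, here is a toy model in Python. It is only a sketch: the real hardware hash is not CRC32, and the gateway MAC, addresses, ports and two-link bundle are all made up for illustration. The point is only that a hash fed a single constant field (the gateway MAC) can never use more than one link, while a hash fed per-flow fields has the chance to spread.

```python
import zlib
from collections import Counter

LINKS = 2  # hypothetical two-link MEC


def pick_link(*fields: str) -> int:
    """Toy stand-in for the hardware hash: CRC32 of the fields, modulo links."""
    return zlib.crc32("|".join(fields).encode()) % LINKS


GATEWAY_MAC = "0000.0c07.ac01"  # made-up default gateway MAC

# 100 hosts, each with one flow to the same server through the gateway.
flows = [
    (f"10.1.1.{h}", "10.9.9.9", str(40000 + h), "443")
    for h in range(1, 101)
]

# Layer-2 destination hashing: every flow presents the same gateway MAC.
by_dst_mac = Counter(pick_link(GATEWAY_MAC) for _ in flows)

# Layer-3/4 hashing: source, destination and ports vary per flow.
by_l4 = Counter(pick_link(src, dst, sport, dport)
                for src, dst, sport, dport in flows)

if __name__ == "__main__":
    print("dst-mac only:", dict(by_dst_mac))
    print("ip + port   :", dict(by_l4))
```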

There are a variety of etherchannel options in VSS.

VSS(config-if)# port-channel port hash-distribution X

By default, the load-sharing hash method on all non-VSL EtherChannels is fixed.

VLAN ID

Traffic optimized when:

  • With VSS it is possible to have more VLANs per closet.
  • Traffic might not be fairly hashed due to similarities such as default gateway or multicast traffic.
    VSS# sh platform hardware pfc mode
    VSS# sh etherchannel load-balance
  • Layer 3 and 4 Hash Tuning
    • dst-mixed-ip-port
    • src-dst-mixed-ip-port
    • src-mixed-ip-port
  • For lower end switches:
    • Cisco Catalyst 4500
      • src-dst-ip
    • Cisco Catalyst 36xx, 37xx Stack, 29xx
      • src-dst-ip

Failures

Convergence

  • FHRP recovery default is 10 seconds; with tuning, 900 msec.
  • VSS convergence is 200 msec.

VSS Member Failures

  • Recovery is based on EtherChannel: it detects the failure, then rehashes the flow.

Core to VSS Failure

  • If all links from one VSS member to the core fail, traffic will traverse the VSL.

Access Layer to VSS Failure

  • Traffic will flow over the VSL.

STP Loops and VSS

  • These issues can introduce a loop that STP might not block:
    • Faulty hardware causes a missed BPDU.
    • Faulty software causes high CPU load, preventing BPDU processing.
    • Configuration mistake
    • Non-standard switch implementation
  • VSS overcomes these issues:
    • Creates a loop-free topology using MEC.
    • No FHRP needed, replaced by one logical node.

Unidirectional Link Detection (UDLD)

  • Aggressive UDLD should _not_ be used as a link-integrity check. VSS is by definition a loop-free topology.
  • STP protocols (RPVST+ and MST) converge faster than UDLD detects.

Spanning Tree Configuration with VSS

  • The root of the STP should always be the VSS.
  • Loop guard is not needed.
  • The active switch is responsible for generating the BPDU.

Routing with VSS
Layer-3 MEC is the recommended design rather than ECMP.

Routing Protocols, Topology and Interaction

  • Two ways to connect VSS to the core:
    • Equal Cost Multipath (ECMP)
    • Layer-3 MEC

Link Failure Convergence

  • The higher the number of routes, the longer ECMP takes to recover.
  • Because MEC failure detection is hardware based, the number of routes
    does not matter: the hardware detects the failure and moves
    traffic to the healthy link.
  • Advantage MEC.

Path availability during link failure

  • A single link failure in ECMP will result in path reprogramming.

Routing Protocol Interaction During Active Failure

Dual Active Detection (p. 4-29)

  • PAgP
  • Fast-Hello
  • BFD

Fast-Hello

  • Requires a dedicated physical port between the VSS nodes.
  • The dedicated link is not capable of carrying control-plane or user-data traffic.
  • During dual-active, the link that is configured to carry fast-hello is operational and continues
    to exchange hellos. If the old active continues to see hellos during
    what it believes to be a failover state, then it knows dual-active has occurred.
  • The Sup720-10G 1Gb uplink ports can be used if the supervisor is not
    configured in 10Gb-only mode.
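
The fast-hello reasoning above boils down to two observations made by a chassis: is the VSL up, and are hellos still arriving on the dedicated link? A minimal sketch of that decision in Python follows. The names are hypothetical and the code says nothing about what the switch does after detection; it only encodes the logic described in the bullets.

```python
from enum import Enum


class Verdict(Enum):
    NORMAL = "VSL up, nothing to decide"
    PEER_FAILED = "VSL down and no hellos: peer is really gone"
    DUAL_ACTIVE = "VSL down but hellos still arriving: dual-active"


def assess(vsl_up: bool, hello_seen: bool) -> Verdict:
    """Classify the situation from one chassis's point of view."""
    if vsl_up:
        return Verdict.NORMAL
    # The VSL is down. If the peer is still answering on the dedicated
    # fast-hello link, both chassis are alive and may both be active.
    return Verdict.DUAL_ACTIVE if hello_seen else Verdict.PEER_FAILED
```

This is also why the fast-hello link must be separate from the VSL: if it shared fate with the VSL, the hellos would stop in both cases and the two outcomes could not be told apart.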

Configure fast-hello for dual-active detection. (p.4-29)

! Enable fast-hello under VSS global config.
VSS(config)# switch virtual domain 100
VSS(config-vs-domain)# dual-active detection fast-hello

! Enable fast-hello at the interface level
VSS(config)# int gi1/5/1
VSS(config-if)# no shut
VSS(config-if)# dual-active fast-hello

VSS(config)# int gi2/5/1
VSS(config-if)# no shut
VSS(config-if)# dual-active fast-hello

! Confirm fast-hello
VSS# sh switch virtual dual-active fast-hello
VSS# remote command standby-rp show switch virtual dual-active fast-hello
        

Using Bidirectional Forwarding Detection

  • BFD session establishment is the indication of a dual-active condition.
  • Normally VSS would not be able to establish BFD with itself because it is one logical node.
  • BFD takes 20-25 seconds for detection.
    • Requires IP connectivity.
    • Needs IP processes and static route.

Configure BFD for dual-active Detection

VSS(config)# switch virtual domain 100
VSS(config-vs-domain)# dual-active pair interface gi1/5/1 int gi2/5/1 bfd
!
! Enable unique IP subnet and BFD interval on interfaces.
VSS(config)# int gi1/5/1
VSS(config-if)# ip add 192.168.1.1 255.255.255.0
VSS(config-if)# bfd interval 50 min_rx 50 multiplier 3
!
VSS(config)# int gi2/5/1
VSS(config-if)# ip add 192.168.2.1 255.255.255.0
VSS(config-if)# bfd interval 50 min_rx 50 multiplier 3
!
! The static route is automatically added.
! Confirm and monitor BFD.
VSS# sh switch virtual dual-active bfd
VSS# sh switch virtual dual-active summary
        

Dual-Active Recovery

  • Once the VSL connectivity is established, RRP handles the negotiation.

OSPF Tuning

VSS(config)# router ospf 100
VSS(config-router)# nsf
VSS(config-router)# auto-cost reference-bandwidth 20000
! Confirm OSPF
VSS# sh ip ospf neighbor detail
VSS# sh ip protocol
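
The reason for raising the reference bandwidth is easier to see with the arithmetic written out. OSPF cost is the reference bandwidth divided by the interface bandwidth, rounded down and never less than 1, and the default reference is 100 Mbps. A small sketch in Python (the function name is my own):

```python
def ospf_cost(interface_mbps: int, reference_mbps: int = 100) -> int:
    """Reference bandwidth / interface bandwidth, floored, minimum 1."""
    return max(1, reference_mbps // interface_mbps)


if __name__ == "__main__":
    for ref in (100, 20000):
        print(f"reference {ref} Mbps:",
              {f"{bw} Mbps": ospf_cost(bw, ref) for bw in (1000, 10000, 20000)})
```

With the default reference, 1G and 10G links both get a cost of 1, so OSPF cannot prefer the faster path. With a reference of 20000, a 1G link costs 20, a 10G link costs 2 and a 20G bundle costs 1.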
        