Wednesday, January 29, 2014

Troubleshooting Cisco Nexus 5500 IGMP and Non-Routed Multicast

A while ago I came across a unique issue on the Nexus 5500/2248 platforms that I thought would make a great blog topic: a server cluster attempting to sync/peer through the use of IP multicast. Strangely, the cluster would constantly drop adjacencies, and the cause was a bit of a mystery. As an IT consultant working with customers to design and implement data center infrastructure, I find that most of the time we (the consultants) don't have any background info on lesser-known or custom applications. To compound that even further, many sysadmins are not very network savvy and do not understand how the application operates from a low-level network perspective.

This particular issue all started late one night during a cutover to move server connections from an old Brocade switch to a new Nexus infrastructure (this cluster was just a fraction of the servers migrating). Initially all connections were migrated to Nexus 2248TPs hanging off of Nexus 5596UPs [FWIW, running NX-OS version 5.1(3)N1(1)], and all servers appeared to be working just as they had been. Once the sysadmin started looking deeper into this particular server cluster, it was found that cluster adjacencies would form, then fail for no apparent reason.

Based on just that description, I immediately started checking for physical link errors, speed/duplex settings, and logs on the Nexus for any indication of problems. Of course that would be too easy! The links were error free, the logs were clean, the links negotiated at 1000/full, and to top it off, the interface packet counters for the servers were incrementing as if they were operating just fine. And they were, somewhat: the sysadmin had no issue logging into the servers, it was just this application cluster operation that was failing. The cluster had been operating just fine on the Brocade switch - which was L2-only, and essentially a dumb switch.

Me: "Ok Sysadmin, what does the application cluster software need from the network in order to operate?"

Sysadmin: "Well I believe it's multicast."...after further digging in documentation... "Yes it is multicast and it's using multicast group 224.1.1.1."

Me: "Are the adjacencies just never forming, or are some partially up?"

Sysadmin: "It appears some of the servers form an adjacency, but then a few minutes later it drops. It appears to keep cycling through randomly."

Now I had more info on where to further isolate the problem and extra details about the failure occurring. What did the server switchport configs look like?

interface Ethernet141/1/22
  switchport access vlan 200
  spanning-tree port type edge

Ok, that's a pretty plain-jane server config. Interesting to note that VLAN 200 in this case is a non-routed VLAN, meaning there is no SVI, router, or any other L3 gateway in that VLAN. The Nexus 5596UPs in this instance did not have the L3 module either. No L3 device on the VLAN - that could be a problem - let's investigate IGMP, which is what hosts use to communicate multicast group membership.

N5k-A# sh ip igmp snooping 
Global IGMP Snooping Information:
  IGMP Snooping enabled
  Optimised Multicast Flood (OMF) disabled
  IGMPv1/v2 Report Suppression enabled
  IGMPv3 Report Suppression disabled
  Link Local Groups Suppression enabled
  VPC Multicast optimization disabled

[...other VLAN output omitted...]

IGMP Snooping information for vlan 200
  IGMP snooping enabled
  Optimised Multicast Flood (OMF) disabled
  IGMP querier none
  Switch-querier disabled
  IGMPv3 Explicit tracking enabled
  IGMPv2 Fast leave disabled
  IGMPv1/v2 Report suppression enabled
  IGMPv3 Report suppression disabled
  Link Local Groups suppression enabled
  Router port detection using PIM Hellos, IGMP Queries
  Number of router-ports: 1
  Number of groups: 0
  VLAN vPC function enabled
  Active ports:
    Po2 Po999   Eth144/1/1      Eth144/1/42
    Eth144/1/43 Eth141/1/18     Eth141/1/20     Eth141/1/22
    Eth141/1/24 Eth142/1/39     Po416   Po434
    Po436       Po437   Po438   Po444

N5k-A# sh mac address-table int e141/1/22
Legend: 
        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
        age - seconds since last seen,+ - primary entry using vPC Peer-Link
   VLAN     MAC Address      Type      age     Secure NTFY   Ports/SWID.SSID.LID
---------+-----------------+--------+---------+------+----+------------------
* 200      0011.1111.a2ec    dynamic   0          F    F  Eth141/1/22

Ok, so all the server ports in VL200 show as active in the IGMP Snooping output, and Snooping is enabled by default. There's no L3 device on the VLAN to send IGMP Queries (hosts answer them with IGMP Reports, which is also how they request a multicast group), and that query/report exchange is what keeps an intermediary switch with IGMP Snooping 'in-the-know' about the multicast needs on the VLAN. Also note there's no MAC address learned for the multicast group that the server is trying to join.

What I really like about NX-OS is the very detailed logs (debug level) that it stores for just about every process running, at all times.

N5k-A# sh ip igmp snooping event-history vlan

 vlan Events for IGMP Snoop process
2012 Aug 21 02:07:26.634368 igmp [3344]: [3370]: Noquerier timer expired, remove all the groups in this vlan.
2012 Aug 21 02:07:26.634356 igmp [3344]: [3370]: IGMPv3 proxy report: no records to send
 
2012 Aug 21 02:04:27.421333 igmp [3344]: [3557]: Forwarding the packet to router-ports  
2012 Aug 21 02:04:27.421299 igmp [3344]: [3557]: IGMPv3 proxy report: no records to send
2012 Aug 21 02:04:27.421262 igmp [3344]: [3557]: Updated oif Eth141/1/22 for (*, 224.1.1.1) entry
2012 Aug 21 02:04:27.421242 igmp [3344]: [3557]: Received v3 report: group 224.1.1.1 from 10.11.200.11 on Eth141/1/22
2012 Aug 21 02:04:27.421233 igmp [3344]: [3557]: Record type: "change-to-exclude-mode" for group 224.1.1.1, sources count: 0
2012 Aug 21 02:04:27.421228 igmp [3344]: [3557]: Processing v3 report with 1 group records, packet-size 16 from 10.11.200.11 on Eth141/1/22

2012 Aug 21 02:04:27.421167 igmp [3344]: [3557]: Process a valid IGMP packet type:34 iod:390

In bottom-to-top order of events:
  • IGMP packet is received
    • packet type = 34 = IGMP Version 3 Membership Report
  • Processing packet with 1 multicast group listed
  • IGMPv3 Membership Report Message = 'change-to-exclude-mode' for our group 224.1.1.1
    • There are 0 multicast sources listed to exclude - meaning any source will do
  • IGMPv3 report with 1 multicast group seen from host 10.11.200.11 on Eth141/1/22
  • OIF (Outgoing Interface) Eth141/1/22 for (*,224.1.1.1)
  • No IGMP Proxy info the N5k has stored
    • From Cisco N5k documentation - "The [IGMP] proxy feature builds the group state from membership reports from the downstream hosts and generates membership reports in response to queries from upstream queriers."
  • Forward the IGMP packet to 'router-ports'; the only one in this system is the vPC Peer-Link
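
To make the fields in those log lines a bit more concrete, here is a minimal Python sketch (my own illustration, not something the switch or the servers run) that builds and parses the same kind of message using the IGMPv3 layout from RFC 3376: an 8-byte report header plus one 8-byte group record with no sources. The function names are made up for this example, and the checksum is left at zero for simplicity.

```python
import socket
import struct

IGMP_V3_MEMBERSHIP_REPORT = 0x22   # decimal 34, the "type:34" in the log
CHANGE_TO_EXCLUDE_MODE = 4         # RFC 3376 group record type


def build_v3_report(group):
    """Build an IGMPv3 report with one group record and zero sources.

    The checksum field is left at 0; this is only meant to show the layout.
    """
    # Header: type, reserved, checksum, reserved, number of group records
    header = struct.pack("!BBHHH", IGMP_V3_MEMBERSHIP_REPORT, 0, 0, 0, 1)
    # Group record: record type, aux data length, number of sources, group
    record = struct.pack("!BBH4s", CHANGE_TO_EXCLUDE_MODE, 0, 0,
                         socket.inet_aton(group))
    return header + record


def parse_v3_report(packet):
    """Parse the header and the first group record of an IGMPv3 report."""
    msg_type, _, _, _, num_records = struct.unpack("!BBHHH", packet[:8])
    rec_type, _, num_sources, raw_group = struct.unpack("!BBH4s", packet[8:16])
    return {
        "type": msg_type,
        "group_records": num_records,
        "record_type": rec_type,
        "sources": num_sources,
        "group": socket.inet_ntoa(raw_group),
        "size": len(packet),
    }


report = parse_v3_report(build_v3_report("224.1.1.1"))
print(report)
```

The parsed result lines up with the log: type 34, 1 group record, record type 4 (change-to-exclude-mode), 0 sources, group 224.1.1.1, and a packet size of 16 bytes.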
Exactly 3 minutes later (this was a consistent timer, but I can't find any documentation on why - the IGMP group timeout default is 260sec and the Querier timeout default is 255sec, so maybe a bug?):
  • Still no IGMP proxy records, and since the Nexus never 'saw' an IGMP Query: 'Noquerier timer expired, remove all the groups in this vlan.'
Boom! This correlates to the constant up/down of multicast adjacencies the sysadmin was seeing. We were then also able to watch a particular server with a successful peering drop it, and the timestamps matched.
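
For anyone who wants to check the timer math, here is a small Python sketch. It derives the two defaults quoted above from the standard IGMPv3 values in RFC 3376 (robustness 2, query interval 125 seconds, query response interval 10 seconds) and compares them with the gap between two of the event-history timestamps. How NX-OS actually arms its 'Noquerier' timer is not something I have documentation for, so treat the comparison as an observation only.

```python
from datetime import datetime

# RFC 3376 default values (seconds)
ROBUSTNESS = 2
QUERY_INTERVAL = 125
QUERY_RESPONSE_INTERVAL = 10

# Group Membership Interval = robustness * query interval + response interval
group_membership_interval = ROBUSTNESS * QUERY_INTERVAL + QUERY_RESPONSE_INTERVAL
# Other Querier Present Interval = robustness * query interval + response / 2
other_querier_present_interval = (
    ROBUSTNESS * QUERY_INTERVAL + QUERY_RESPONSE_INTERVAL / 2
)

# Timestamps copied from the event-history output above
FMT = "%Y %b %d %H:%M:%S.%f"
report_seen = datetime.strptime("2012 Aug 21 02:04:27.421242", FMT)
groups_removed = datetime.strptime("2012 Aug 21 02:07:26.634368", FMT)
observed_seconds = (groups_removed - report_seen).total_seconds()

print("Group membership interval:", group_membership_interval)
print("Other querier present interval:", other_querier_present_interval)
print("Observed report-to-removal gap:", round(observed_seconds, 1))
```

The observed gap comes out at roughly 179 seconds, well short of either standard timeout, which is why the 3-minute removal looked odd to me.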

Using the debug command will give you slightly more information than the 'event-history' command (note this output is on the peer N5k, receiving the IGMP packet from the Peer-Link Po999):

N5k-B# debug ip igmp snooping vlan
2012 Aug 21 03:31:07.115153 igmp: SNOOP: [vlan 200] Process a valid IGMP packet type:34 iod:15
2012 Aug 21 03:31:07.115195 igmp: SNOOP: [vlan 200] Processing v3 report with 1 group records, packet-size 16 from 10.11.200.11 on Po999 
2012 Aug 21 03:31:07.115215 igmp: SNOOP: [vlan 200] Record type: "change-to-exclude-mode" for group 224.1.1.1, sources count: 0 
2012 Aug 21 03:31:07.115290 igmp: SNOOP: [vlan 200] Received v3 report: group 224.1.1.1 from 10.11.200.11 on Po999 
2012 Aug 21 03:31:07.115318 igmp: SNOOP: [vlan 200] Created ET port Po999 for group 224.1.1.1 
2012 Aug 21 03:31:07.115342 igmp: SNOOP: [vlan 200] Created ET host-entry 10.11.200.11 on port Po999 for group 224.1.1.1
2012 Aug 21 03:31:07.115371 igmp: SNOOP: [vlan 200] Created igmpv3 oif Po999 for (*, 224.1.1.1)
2012 Aug 21 03:31:07.115548 igmp: SNOOP: In function igmp_snoop_copy_del_ifindex_list: 
2012 Aug 21 03:31:07.115719 igmp: SNOOP: [vlan 200] Updated oif Po999 for (*, 224.1.1.1) entry 
2012 Aug 21 03:31:07.115809 igmp: SNOOP:  Processing reportfrom_cfs: 1, on_internal_mcec: 0, im_is_iod_valid: 1im_is_if_up: 1 im_id_ifindex_veth = 0, rc = 0, ifindex = Po999
2012 Aug 21 03:31:07.115826 igmp: SNOOP: [vlan 200] IGMPv3 proxy report: no records to send 
2012 Aug 21 03:31:07.115854 igmp: SNOOP: [vlan 200] Forwarding the packet to router-ports , came from cfs  
2012 Aug 21 03:31:07.115875 igmp: SNOOP: [vlan 200] not sending the CFS packet back to MCT 
[...]
2012 Aug 21 03:34:10.102439 igmp: SNOOP: [vlan 200] IGMPv3 proxy report: no records to send 
2012 Aug 21 03:34:10.212437 igmp: SNOOP: [vlan 200] IGMPv3 proxy report: no records to send 
2012 Aug 21 03:34:23.432430 igmp: SNOOP: [vlan 200] Noquerier timer expired, remove all the groups in this vlan. 

And even more ways to see this 'flapping' problem in action through the use of the debug command above and regular 'show' commands:

N5k-A# sh ip igmp snooping groups vlan 200
Type: S - Static, D - Dynamic, R - Router port, F - Fabricpath core port

Vlan  Group Address      Ver  Type  Port list
200   */*                -    R     Po999
200   224.1.1.1          v3   D     Po416

2012 Aug 21 03:41:28.765007 igmp: SNOOP: [vlan 200] Noquerier timer expired, remove all the groups in this vlan. 

N5k-A# sh ip igmp snooping groups vlan 200
Type: S - Static, D - Dynamic, R - Router port, F - Fabricpath core port

Vlan  Group Address      Ver  Type  Port list
200   */*                -    R     Po999

N5k-A# sh ip igmp snooping mrouter vlan 200
Type: S - Static, D - Dynamic, V - vPC Peer Link,       I - Internal, F - Fabricpath core port
      C - Co-learned, U - User Configured
Vlan  Router-port   Type      Uptime      Expires
200   Po999         SV        33w5d       never

So how do we fix this issue? Well there are a few ways, but in this instance I added Static IGMP Snooping mappings for the 224.1.1.1 multicast group to each server switchport (there were only a handful of ports).

Other methods to fix this would be:
  • Adding an L3 gateway into the VLAN to send the IGMP Queries so the snooping would work correctly
  • Configuring a manual IGMP Snooping Querier (for situations like this where there is no PIM running because the traffic isn't routed)
  • Disabling snooping for that VLAN altogether
I definitely didn't want to disable snooping since we don't want that traffic flooded throughout the VLAN, and due to the existing customer layout of things (and security) I also did not want to create a routable entry point into that VLAN. Going the route of a manual IGMP Snooping Querier would have been an option, but that would have required locating a free IP address late at night (IPAM is an afterthought for many). Using static mappings for specific interfaces allows the most granular control, but at the cost of a little extra complexity (document!).

N5k-A(config)# vlan configuration 200
N5k-A(config-vlan-config)# ip igmp snooping static-group 224.1.1.1 interface po416
Warning: This command should be executed on peer VPC switch [vlan 200] as well.
N5k-A(config-vlan-config)# 
2012 Aug 21 03:49:17.225983 igmp: SNOOP: [vlan 200] Interface Po416 (mode trunk) check for vlan 200: access 0, native 0, trunk-allowed 1 


N5k-A# sh ip igmp snooping groups vlan 200
Type: S - Static, D - Dynamic, R - Router port, F - Fabricpath core port

Vlan  Group Address      Ver  Type  Port list
200   */*                -    R     Po999
200   224.1.1.1          v3   S     Po416

N5k-A(config)# vlan configuration 200
N5k-A(config-vlan-config)# ip igmp snooping static-group 224.1.1.1 interface Ethernet141/1/22
N5k-A(config-vlan-config)# ip igmp snooping static-group 224.1.1.1 interface port-channel438
N5k-A(config-vlan-config)# ip igmp snooping static-group 224.1.1.1 interface port-channel444


N5k-2A# sh ip igmp snooping event-history vlan 

 vlan Events for IGMP Snoop process
2012 Aug 21 04:12:37.019267 igmp [3344]: [3451]: Interface Eth141/1/22 (mode access) check for vlan 200: access 1, native 0, trunk-allowed 1
2012 Aug 21 04:06:31.977383 igmp [3344]: [3451]: Interface Po444 (mode trunk) check for vlan 200: access 0, native 0, trunk-allowed 1
2012 Aug 21 03:59:55.853893 igmp [3344]: [3451]: Interface Po416 (mode trunk) check for vlan 200: access 0, native 0, trunk-allowed 1
2012 Aug 21 03:48:37.843537 igmp [3344]: [3451]: Interface Po438 (mode trunk) check for vlan 200: access 0, native 0, trunk-allowed 1

  
N5k-2A# sh ip igmp snooping groups 
Type: S - Static, D - Dynamic, R - Router port, F - Fabricpath core port

Vlan  Group Address      Ver  Type  Port list
200   */*                -    R     Po999
200   224.1.1.1          v3   S     Eth141/1/22 Po416 Po438 Po444
          
          
N5k-2A# sh mac address-table int e141/1/22
Legend: 
        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
        age - seconds since last seen,+ - primary entry using vPC Peer-Link
   VLAN     MAC Address      Type      age     Secure NTFY   Ports/SWID.SSID.LID
---------+-----------------+--------+---------+------+----+------------------
* 200      0011.1111.a2ec    dynamic   10         F    F  Eth141/1/22
  200      0100.5e01.0101    igmp      0          F    F  Po999 
                                                          Eth141/1/22 Po416 Po438 
                                                          Po444           
So as you can see from the configuration and output above, the static IGMP Snooping mappings were added, the 'sh ip igmp snooping groups' command showed the server ports joined to the proper group, and the MAC address table showed a multicast MAC for the server ports. Once the statics were added, the sysadmin immediately saw the application cluster form all its adjacencies, and they remained stable.
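
If you're wondering where the 0100.5e01.0101 entry comes from, it's the standard IPv4 multicast-to-Ethernet mapping: the fixed prefix 01:00:5e followed by the low 23 bits of the group address. A minimal Python sketch of that mapping (the function name is just for illustration):

```python
import ipaddress


def multicast_mac(group):
    """Return the Ethernet MAC for an IPv4 multicast group, Cisco dotted style."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF          # keep only the low 23 bits
    mac = 0x01005E000000 | low23          # prepend the 01:00:5e prefix
    hex12 = f"{mac:012x}"
    return ".".join(hex12[i:i + 4] for i in (0, 4, 8))


print(multicast_mac("224.1.1.1"))    # 0100.5e01.0101
print(multicast_mac("225.129.1.1"))  # 0100.5e01.0101 (same MAC, different group)
```

Because only 23 bits are carried over, several groups share the same MAC, which is worth remembering when reading the MAC address table.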

Monday, November 18, 2013

Cisco ACI - SDN in the DC, Cisco's Way

As you may have read by now, Cisco has announced their first big 'SDN' (Software Defined Networking) solution named ACI (Application Centric Infrastructure) that tightly pairs with the Nexus 9000 line (announced along with ACI). However, as with most product announcements made far in advance of the actual product release, the technical details are few and far between. I recently had the opportunity to attend a conference that included an ACI and Nexus 9000 breakout session with presenter Joe Onisick (www.definethecloud.net), a Cisco TME for ACI/N9k.

From the discussions that followed, these were the interesting points and thoughts that stuck out to me about ACI and the N9k:

  • As Cisco has already stated, the N9k will be shipping soon, but they won't be able to run in ACI-mode until 2HCY14. The upgrade from standalone-mode (standard NX-OS) to ACI-mode will be a major upgrade, as the whole underlying OS/firmware is completely different. No ISSU upgrade.
  • The N9k and ACI are currently a Data Center only solution, in a Clos fabric design (Spines and Leafs) with the APIC controller (Application Policy Infrastructure Controller). It was not designed to replace Core, WAN-edge, or Campus network environments - it will likely expand to these other environments after the technology gains momentum in the DC space. The whole concept of SDN is still in its infancy - at least for everyone who isn't Google.
  • The N9k will be priced very competitively - partly due to the use of merchant silicon and mid-plane elimination - but I would say more importantly due to the DC-focused scope of software functionality. Technologies like OTV, LISP, etc. will still require an N7k or ASR. Design guides will become available on how to integrate the ACI DC infrastructure with other areas of the network. Since it's using VXLAN as an overlay, there will certainly be VXLAN-gateway functionality to provide that integration.
  • 40G BiDi optics - man these are great (also announced along w/ ACI and the N9k)! 40GE over a single pair of OM3 MMF (good for 100m) using essentially CWDM, but only 2 waves (20G each). And they are able to manufacture and sell them very cost effectively. This could be a major Cisco differentiator when 40G becomes more of the norm. Businesses already have a lot of sunk cost in their fiber cable plants - would they rather replace/add on to accommodate the 12-strand MTP fiber cables for MMF 40G or use their existing 10G fiber plant? A 'no-brainer' decision. Some great info on them here: http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps13386/white-paper-c11-729493_ns1261_Networking_Solutions_White_Paper.html
  • How will the APIC controller look/feel/operate? It's still somewhat of a mystery, but I expect it to be very similar to the successful UCS Manager (configuring network/application policies with various metrics/SLAs). After all, the people at Insieme were also the people who created UCS and the Nexus products.
  • NSH - Network Service Header - a Cisco vPath-like technology that has been submitted to the IETF as a draft (http://tools.ietf.org/html/draft-quinn-nsh-00). See who co-authored it? Cisco and a certain company Cisco announced they were acquiring at the launch of ACI (Insieme). This appears to be one of the major underlying technologies that the APIC (the controller) will use to chain network services (firewalls, load balancers, etc). vPath is a really cool technology that the Cisco Nexus 1000v uses to communicate with VMware ESX and virtual network appliances (VSG, vASA, etc) for logically 'chaining' network services. That makes the Cisco AVS (Application Virtual Switch, also announced along w/ ACI) seem to fit quite nicely in the mix, as it's essentially a Nexus 1000v that communicates with the ACI infrastructure. With NSH having a fixed header, it makes it easily implemented into hardware - essentially doing the same function of the N1000v and vPath, but with the ability to have hardware ASICs participate in the service chaining.
Starting to see the potential of ACI now? There are still lots of technical details that are missing, and for that matter the actual product. It'll be very interesting to see how the market reacts to ACI and VMware's NSX. VMware has already released NSX, but will customers adopt it? Will NSX be production-ready by the time ACI/APIC are released; will customers see the need for tighter integration with network and other hardware (VMware has stated that they are working with networking vendors for interop, but how well will that turn out)? All questions that come to mind in terms of the race to see who wins SDN in the DC. The next couple years in the networking field are going to be really interesting.


Saturday, October 5, 2013

Cisco Catalyst 4500-X EtherChannel Auto-QoS

Typically in any size routing and switching infrastructure environment that real-time and business critical applications rely on, QoS is an absolute must. One of the main advantages of buying Cisco equipment is the extensive services that IOS can provide - one of which is granular QoS control.

As real-time services are increasingly converging to the IP network, namely voice and video, QoS is becoming even more important to ensure a quality end-user experience. Gigabit and multi-Gigabit (EtherChannel bundle) uplinks are becoming more saturated as users and the businesses increase the need for data-intensive network connectivity. Queue the requirement for QoS!

Unfortunately, between the different Business Units within Cisco that are responsible for the various Catalyst Switch products, there is a decent amount of feature and hardware disparity (Catalyst 2960, 3560/3750, 4500, 6500 - and the Nexus lines). QoS configurations between the different products can be very different, which makes understanding QoS in switched environments very cumbersome and easy to forget. This is especially true since there is often an abundance of bandwidth available in a LAN, which makes it easy for a network engineer/admin to discount the need for QoS.

Network Management tools may suggest to an engineer that a link is not congested, however, these tools rely on SNMP to poll the device for interface stats, taking the delta of the counters to show a rate. Often these tools cannot poll any faster than 30 second intervals, and usually are set to 1-5 minute intervals. This is really an average and doesn't account for spikes in utilization or microbursts. Once these spikes and microbursts of data become too large for the device buffers, packets drop. In the case of real-time traffic, even buffering of the data will cause a degradation of service because buffering this traffic causes delay and jitter.
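
To put some rough numbers on that, here is a small Python sketch with made-up traffic: a 1 Gbps link that sits at 5% load but bursts to line rate for 200 ms every 10 seconds, polled once over a 30-second window. All of the constants are hypothetical and only there to illustrate how the averaging hides the bursts.

```python
LINK_BPS = 1_000_000_000   # 1 Gbps uplink (assumed)
POLL_WINDOW_MS = 30_000    # one 30-second SNMP polling interval
BURST_EVERY_MS = 10_000    # hypothetical: a burst starts every 10 seconds
BURST_LENGTH_MS = 200      # hypothetical: each burst lasts 200 ms
BACKGROUND_LOAD = 0.05     # hypothetical: 5% load between bursts

bits_per_ms_at_line_rate = LINK_BPS // 1000


def load_at(ms):
    """Fraction of line rate in use during the given millisecond."""
    return 1.0 if ms % BURST_EVERY_MS < BURST_LENGTH_MS else BACKGROUND_LOAD


# Bits sent in each millisecond of the polling window
per_ms_bits = [load_at(ms) * bits_per_ms_at_line_rate
               for ms in range(POLL_WINDOW_MS)]

# What the poller sees: counter delta divided by elapsed time
counter_delta_bits = sum(per_ms_bits)
avg_utilization = counter_delta_bits / (LINK_BPS * POLL_WINDOW_MS / 1000)

# What the link actually experienced at its busiest millisecond
peak_utilization = max(per_ms_bits) / bits_per_ms_at_line_rate

print(f"Polled average utilization: {avg_utilization:.1%}")
print(f"Peak 1 ms utilization:      {peak_utilization:.0%}")
```

With these numbers the graph would show a link that is about 7% utilized, even though it was completely full for 600 ms of the window.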

Fortunately, Cisco has a great design guide for QoS called the "Medianet Campus QoS Design 4.0", also known as the QoS Solution Reference Network Design (SRND) 4.0 guide.

Most often, the granular control explained in the SRND 4 guide for campus switches is not needed, thanks to the simple feature known on Cisco switches as Auto-QoS. Auto-QoS on switches is essentially a macro that contains all of the recommended configurations from the QoS SRND 4 guide. It simplifies QoS for switches to just a few commands, and covers probably close to 95% of the QoS needs an enterprise might have. Since it's just a macro, the configurations it creates are easily modifiable for any specific or custom requirements.

There can be a few hiccups when using Auto-QoS, and one of them that I've run into on numerous occasions happens when trying to apply the Auto-QoS generated policies to an EtherChannel interface or physical interfaces linked to it.

The information in this article is in reference to Auto-QoS VoIP for the Cat4500-X.

To enable Auto-QoS on an interface use the following command:
4500X#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
4500X(config)#int te1/1
4500X(config-if)#auto qos voip trust

Once the above command is applied, IOS automatically generates the following QoS policy (based on SRND 4.0):
ip access-list extended AutoQos-4.0-ACL-Bulk-Data
 permit tcp any any eq ftp
 permit tcp any any eq ftp-data
 permit tcp any any eq 22
 permit tcp any any eq smtp
 permit tcp any any eq 465
 permit tcp any any eq 143
 permit tcp any any eq 993
 permit tcp any any eq pop3
 permit tcp any any eq 995
 permit tcp any any eq 1914
ip access-list extended AutoQos-4.0-ACL-Default
 permit ip any any
ip access-list extended AutoQos-4.0-ACL-Multimedia-Conf
 permit udp any any range 16384 32767
ip access-list extended AutoQos-4.0-ACL-Scavenger
 permit tcp any any eq 1214
 permit udp any any eq 1214
 permit tcp any any range 2300 2400
 permit udp any any range 2300 2400
 permit tcp any any eq 3689
 permit udp any any eq 3689
 permit tcp any any range 6881 6999
 permit tcp any any eq 11999
 permit tcp any any range 28800 29100
ip access-list extended AutoQos-4.0-ACL-Signaling
 permit tcp any any range 2000 2002
 permit tcp any any range 5060 5061
 permit udp any any range 5060 5061
ip access-list extended AutoQos-4.0-ACL-Transactional-Data
 permit tcp any any eq 443
 permit tcp any any eq 1521
 permit udp any any eq 1521
 permit tcp any any eq 1526
 permit udp any any eq 1526
 permit tcp any any eq 1575
 permit udp any any eq 1575
 permit tcp any any eq 1630
 permit udp any any eq 1630

class-map match-all AutoQos-4.0-Scavenger-Classify
  match access-group name AutoQos-4.0-ACL-Scavenger
class-map match-all AutoQos-4.0-Signaling-Classify
  match access-group name AutoQos-4.0-ACL-Signaling
class-map match-any AutoQos-4.0-Priority-Queue
  match cos  5 
  match  dscp ef 
  match  dscp cs5 
  match  dscp cs4 
class-map match-all AutoQos-4.0-VoIP-Data-Cos
  match cos  5 
class-map match-any AutoQos-4.0-Multimedia-Stream-Queue
  match  dscp af31 
  match  dscp af32 
  match  dscp af33 
class-map match-all AutoQos-4.0-Network-Mgmt
  match  dscp cs2 
class-map match-all AutoQos-4.0-VoIP-Signal-Cos
  match cos  3 
class-map match-any AutoQos-4.0-Multimedia-Conf-Queue
  match cos  4 
  match  dscp af41 
  match  dscp af42 
  match  dscp af43 
  match access-group name AutoQos-4.0-ACL-Multimedia-Conf
class-map match-any AutoQos-4.0-Transaction-Data
  match  dscp af21 
  match  dscp af22 
  match  dscp af23 
class-map match-all AutoQos-4.0-Network-Ctrl
  match  dscp cs7 
class-map match-all AutoQos-4.0-Scavenger
  match  dscp cs1 
class-map match-all AutoQos-4.0-Default-Classify
  match access-group name AutoQos-4.0-ACL-Default
class-map match-any AutoQos-4.0-Signaling
  match  dscp cs3 
  match cos  3 
class-map match-any AutoQos-4.0-Bulk-Data-Queue
  match cos  1 
  match  dscp af11 
  match  dscp af12 
  match  dscp af13 
  match access-group name AutoQos-4.0-ACL-Bulk-Data
class-map match-all AutoQos-4.0-Transaction-Classify
  match access-group name AutoQos-4.0-ACL-Transactional-Data
class-map match-all AutoQos-4.0-Broadcast-Vid
  match  dscp cs5 
class-map match-any AutoQos-4.0-Bulk-Data
  match  dscp af11 
  match  dscp af12 
  match  dscp af13 
class-map match-any AutoQos-4.0-Scavenger-Queue
  match  dscp cs1 
  match cos  1 
  match access-group name AutoQos-4.0-ACL-Scavenger
class-map match-any AutoQos-4.0-VoIP
  match  dscp ef 
  match cos  5 
class-map match-any AutoQos-4.0-Multimedia-Conf
  match  dscp af41 
  match  dscp af42 
  match  dscp af43 
class-map match-any AutoQos-4.0-Control-Mgmt-Queue
  match cos  3 
  match  dscp cs7 
  match  dscp cs6 
  match  dscp cs3 
  match  dscp cs2 
  match access-group name AutoQos-4.0-ACL-Signaling
class-map match-all AutoQos-4.0-Bulk-Data-Classify
  match access-group name AutoQos-4.0-ACL-Bulk-Data
class-map match-any AutoQos-4.0-Trans-Data-Queue
  match cos  2 
  match  dscp af21 
  match  dscp af22 
  match  dscp af23 
  match access-group name AutoQos-4.0-ACL-Transactional-Data
class-map match-any AutoQos-4.0-Multimedia-Stream
  match  dscp af31 
  match  dscp af32 
  match  dscp af33 
class-map match-any AutoQos-4.0-VoIP-Data
  match  dscp ef 
  match cos  5 
class-map match-all AutoQos-4.0-Internetwork-Ctrl
  match  dscp cs6 
class-map match-all AutoQos-4.0-Realtime-Interact
  match  dscp cs4 
class-map match-all AutoQos-4.0-Multimedia-Conf-Classify
  match access-group name AutoQos-4.0-ACL-Multimedia-Conf
class-map match-any AutoQos-4.0-VoIP-Signal
  match  dscp cs3 
  match cos  3 
!
policy-map AutoQos-4.0-Input-Policy
 class AutoQos-4.0-VoIP
 class AutoQos-4.0-Broadcast-Vid
 class AutoQos-4.0-Realtime-Interact
 class AutoQos-4.0-Network-Ctrl
 class AutoQos-4.0-Internetwork-Ctrl
 class AutoQos-4.0-Signaling
 class AutoQos-4.0-Network-Mgmt
 class AutoQos-4.0-Multimedia-Conf
 class AutoQos-4.0-Multimedia-Stream
 class AutoQos-4.0-Transaction-Data
 class AutoQos-4.0-Bulk-Data
 class AutoQos-4.0-Scavenger
policy-map AutoQos-4.0-Output-Policy
 class AutoQos-4.0-Scavenger-Queue
    bandwidth remaining percent 1
 class AutoQos-4.0-Priority-Queue
    priority
    police cir percent 30 bc 33 ms
 class AutoQos-4.0-Control-Mgmt-Queue
    bandwidth remaining percent 10
 class AutoQos-4.0-Multimedia-Conf-Queue
    bandwidth remaining percent 10
 class AutoQos-4.0-Multimedia-Stream-Queue
    bandwidth remaining percent 10
 class AutoQos-4.0-Trans-Data-Queue
    bandwidth remaining percent 10
    dbl
 class AutoQos-4.0-Bulk-Data-Queue
    bandwidth remaining percent 4
    dbl
 class class-default
    bandwidth remaining percent 25
    dbl

The actual interface configuration now shows as:
interface TenGigabitEthernet1/1
 auto qos trust 
 service-policy input AutoQos-4.0-Input-Policy
 service-policy output AutoQos-4.0-Output-Policy

Now let's look at the problems that arise when attempting to configure Auto-QoS for an EtherChannel. The following are the configurations for the 2 physical interfaces with Auto-QoS already applied that we want to bundle into an EtherChannel:
interface TenGigabitEthernet1/15
 switchport mode trunk
 auto qos trust 
 service-policy input AutoQos-4.0-Input-Policy
 service-policy output AutoQos-4.0-Output-Policy
!
interface TenGigabitEthernet1/16
 switchport mode trunk
 auto qos trust 
 service-policy input AutoQos-4.0-Input-Policy
 service-policy output AutoQos-4.0-Output-Policy

This is what happens when attempting to configure the ports into the EtherChannel:
4500X(config)#int range te1/15-16
4500X(config-if-range)#channel-group 1 mode active 
% The attached policymap is not suitable for member either due to non-queuing actions or due to type of classmap filters.

TenGigabitEthernet1/15 is not added to port channel 1
% Range command terminated because it failed on TenGigabitEthernet1/15
4500X(config-if-range)#

The error that IOS gave us says that the policy-map applied to the interface is not suitable for an EtherChannel member, either because of its 'non-queuing actions' or because of the type of class-map filters it uses. The EtherChannel configuration was not applied because of this error:
4500X(config-if-range)#do sh run | b 1/15
interface TenGigabitEthernet1/15
 switchport mode trunk
 auto qos trust 
 service-policy input AutoQos-4.0-Input-Policy
 service-policy output AutoQos-4.0-Output-Policy
!
interface TenGigabitEthernet1/16
 switchport mode trunk
 auto qos trust 
 service-policy input AutoQos-4.0-Input-Policy
 service-policy output AutoQos-4.0-Output-Policy

Ok, so what if we just apply the 'auto qos voip trust' command to the Port-Channel interface itself? Maybe that will work (nope):
4500X(config)#int po1
4500X(config-if)#auto ?            
% Unrecognized command

Damn, that would have been nice Cisco (hint hint). Ok, what happens if we remove the policy-map configurations from the physical interfaces, add them to the EtherChannel, and then reapply the policy-maps to the physical interfaces?
4500X(config-if)# int range te1/15-16
4500X(config-if-range)#auto?
% Unrecognized command
4500X(config-if-range)#a?  
aaa  access-expression  access-group  arp

Still no dice. What if we try to apply the policy-maps to the Port-Channel interface?
4500X(config-if-range)#int po1
4500X(config-if)# service-policy input AutoQos-4.0-Input-Policy
4500X(config-if)# service-policy output AutoQos-4.0-Output-Policy
% A service-policy with queuing actions can be attached in output direction only on physical ports.

4500X(config-if)#do sh run int po1
interface Port-channel1
 switchport
 switchport mode trunk
 service-policy input AutoQos-4.0-Input-Policy
end

Ok, so at least something was configured that time. We also received a different error this time, suggesting that the output policy-map has queuing actions, which are only supported on physical ports. Let's try applying only the output policy-map to the physical interfaces, and leave the input policy on the Port-Channel interface:
4500X(config-if)#int range te1/15-16
4500X(config-if-range)#service-policy output AutoQos-4.0-Output-Policy
% A service-policy with more than one  type of marking field based filters in the  class-map is not allowed on the channel member ports. 

Yet another error. Fun. Now there appears to be an issue with the class-maps themselves: the generated ones mix several types of match statements (CoS, DSCP, and ACLs) in a single class-map.
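
A quick way to spot the offending class-maps is to scan the config text and list every class-map that uses more than one kind of match statement. The Python sketch below does that for three of the generated class-maps copied from above. The helper name and the "anything mixed gets flagged" rule are my own simplification, not something IOS documents as its exact check.

```python
SAMPLE_CONFIG = """\
class-map match-any AutoQos-4.0-Priority-Queue
  match cos  5
  match  dscp ef
  match  dscp cs5
  match  dscp cs4
class-map match-any AutoQos-4.0-Multimedia-Stream-Queue
  match  dscp af31
  match  dscp af32
  match  dscp af33
class-map match-any AutoQos-4.0-Scavenger-Queue
  match  dscp cs1
  match cos  1
  match access-group name AutoQos-4.0-ACL-Scavenger
"""


def filter_types(config):
    """Map each class-map name to the set of match keywords it uses."""
    result = {}
    current = None
    for line in config.splitlines():
        words = line.split()
        if not words:
            continue
        if words[0] == "class-map":
            current = words[-1]           # the class-map name is the last word
            result[current] = set()
        elif words[0] == "match" and current is not None:
            result[current].add(words[1])  # cos, dscp, access-group, ...
    return result


types_by_class = filter_types(SAMPLE_CONFIG)
mixed = sorted(name for name, kinds in types_by_class.items() if len(kinds) > 1)

for name in mixed:
    print(name, "->", ", ".join(sorted(types_by_class[name])))
```

Here the Priority-Queue and Scavenger-Queue class-maps get flagged, while the DSCP-only Multimedia-Stream-Queue class-map passes.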

So to not drag this troubleshooting scenario out any further, these are the limitations for EtherChannel QoS that you need to work around from the Auto-QoS generated policy:

  • Output policing needs to be configured on the Port-Channel interface, and separated from any queuing.
  • Output queuing needs to be configured on the physical interfaces.
  • The class-maps for the queuing policy-map can only have one type of match statement (i.e. an ACL, or matching on QoS tags) per class-map.
  • The policing policy-map cannot use percentage-based policing (the 'police cir percent' form); the rate has to be given as an absolute value.

The following is a functional QoS policy that I adapted for use on EtherChannels. I tried to match the SRND 4.0 configs as closely as possible (most of it comes straight from the SRND configs):
class-map match-any MULTIMEDIA-STREAMING-QUEUE
  match  dscp af31  af32  af33 
class-map match-any CONTROL-MGMT-QUEUE
  match  dscp cs7 
  match  dscp cs6 
  match  dscp cs3 
  match  dscp cs2
class-map match-any TRANSACTIONAL-DATA-QUEUE
  match  dscp af21  af22  af23 
class-map match-any SCAVENGER-QUEUE
  match  dscp cs1 
class-map match-any MULTIMEDIA-CONFERENCING-QUEUE
  match  dscp af41  af42  af43 
class-map match-any BULK-DATA-QUEUE
  match  dscp af11  af12  af13 
class-map match-any PRIORITY-QUEUE
  match  dscp ef 
  match  dscp cs5 
  match  dscp cs4 

! The police percentage for the default Auto-QoS Output policy is set to 30%. However, in this scenario with the 4500-X and 10Gig interfaces, there isn't a need to allocate that much bandwidth to voice and video traffic (30% of a 20G EtherChannel would be 6Gig). The example below of 2Gig is 10% of the EtherChannel aggregate bandwidth (20G). Adjust accordingly for your needs.

policy-map OUTPUT-PRIORITY-POLICING-EC
 class PRIORITY-QUEUE
    police cir 2000000000

policy-map OUTPUT-QUEUING-NOPOLICING-EC
 class PRIORITY-QUEUE
    priority
 class CONTROL-MGMT-QUEUE
    bandwidth remaining percent 10
 class MULTIMEDIA-CONFERENCING-QUEUE
    bandwidth remaining percent 10
 class MULTIMEDIA-STREAMING-QUEUE
    bandwidth remaining percent 10
 class TRANSACTIONAL-DATA-QUEUE
    bandwidth remaining percent 10
    dbl
 class BULK-DATA-QUEUE
    bandwidth remaining percent 4
    dbl
 class SCAVENGER-QUEUE
    bandwidth remaining percent 1
 class class-default
    bandwidth remaining percent 25
    dbl

interface Port-channel1
 switchport
 switchport mode trunk
 service-policy input AutoQos-4.0-Input-Policy
 service-policy output OUTPUT-PRIORITY-POLICING-EC

interface TenGigabitEthernet1/15
 switchport mode trunk
 channel-group 1 mode active
 service-policy output OUTPUT-QUEUING-NOPOLICING-EC
!
interface TenGigabitEthernet1/16
 switchport mode trunk
 channel-group 1 mode active
 service-policy output OUTPUT-QUEUING-NOPOLICING-EC
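
As a quick sanity check on the numbers in the policy above, here is a minimal Python sketch (it runs on a workstation, not on the switch). It converts a percentage of link bandwidth into the bits-per-second value used with 'police cir', and totals the 'bandwidth remaining percent' allocations. The function names are made up for illustration; the values are copied from the config above.

```python
# Hypothetical helper names; only the numbers come from the policy above.

def cir_bps(link_gbps, percent):
    """Return a policing rate in bits per second for a percentage of a link."""
    return link_gbps * 1_000_000_000 * percent // 100


# 'bandwidth remaining percent' values from OUTPUT-QUEUING-NOPOLICING-EC
REMAINING_PERCENT = {
    "CONTROL-MGMT-QUEUE": 10,
    "MULTIMEDIA-CONFERENCING-QUEUE": 10,
    "MULTIMEDIA-STREAMING-QUEUE": 10,
    "TRANSACTIONAL-DATA-QUEUE": 10,
    "BULK-DATA-QUEUE": 4,
    "SCAVENGER-QUEUE": 1,
    "class-default": 25,
}


def total_remaining_percent(allocations):
    """Sum the per-class percentages and refuse totals above 100."""
    total = sum(allocations.values())
    if total > 100:
        raise ValueError(f"allocations add up to {total}%, which exceeds 100%")
    return total


if __name__ == "__main__":
    print(cir_bps(20, 10))   # 10% of the 20G EtherChannel
    print(cir_bps(20, 30))   # the Auto-QoS default of 30%, for comparison
    print(total_remaining_percent(REMAINING_PERCENT))
```

Running it prints 2000000000, 6000000000 and 70, so the 'police cir 2000000000' line does match 10% of the 20G bundle, and the queuing classes add up to 70%.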

CCIE RS Written Exam - Passed (Again)

Since reviving this blog with the last post, I have been studying for the CCIE RS Written exam again, both to recertify my existing Professional-level certifications and to qualify to attempt the lab again. Good news: I passed the written exam a week ago! I have some renewed interest in going for the IE, so we'll see how the studies go in the next few months.

In other news I have been exploring other options for the blog host to get better design and layout options. Wix.com seemed like a great alternative to Blogger, but their 'blog' widget is still a work in progress. Hopefully soon!

Thursday, June 27, 2013

Blog Update

So it's hard to believe, but it's been 2 years since I last posted on here. Lots of things have happened since then - professional and personal.

I did end up taking the CCIE RS Lab in November of 2011, but unfortunately did not pass. The Troubleshooting section was very difficult, even coming from someone who considers themselves skilled in the art-of-tshoot, but the Config section was very reasonable. I've been pecking at re-studying to retake the lab, but I've been lead engineer in a couple large data center upgrades that have consumed a LOT of my time.

I almost daily weigh the benefits of committing several months+ to studying for this cert, versus using that time learning other stuff (security/wireless/DC, programming, Linux). Now with the prospects looking very high for SDN, it's starting to make me see the decline in the value of the CCIE (for me personally).

Back in May of 2013, I purchased the lah.io domain a few hours after Google announced they were promoting the status of the .io TLD for search (to the same level as .com, etc), and was able to pick up this 3-character domain out of the last hundred or two still available (lucked out!).

I have a bit of renewed interest in blogging. However, posts will be focused more on day-to-day technology and interesting bits learned from my consulting job (for a major Cisco partner).

-Mark

Saturday, July 23, 2011

TRILL

Well, it's been a while since I've updated this blog, but my CCIE studies continue, and I'm always on the lookout for new and exciting advances in the networking arena. Today marks the official 'release' or ratification of the new protocol TRILL (also known as Routing Bridges or RBridges). Here are the RFCs that were just released that relate to TRILL:

Routing Bridges (RBridges): Base Protocol Specification
Transparent Interconnection of Lots of Links (TRILL) Use of IS-IS
Routing Bridges (RBridges): Adjacency

In short, what TRILL accomplishes is link-state routing (IS-IS) for Layer 2 Ethernet MAC addresses in a LAN, which eliminates the need for the Spanning-Tree Protocol. It is not designed to span outside of a LAN.
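
To make the link-state idea a bit more concrete, here is a small Python sketch of the kind of shortest-path computation a switch can run once it knows the whole topology. This is plain Dijkstra over a made-up four-switch campus, not TRILL's actual IS-IS implementation; the RBridge names and link costs are invented for the example.

```python
import heapq

# Hypothetical campus: each RBridge maps to its neighbors and link costs.
TOPOLOGY = {
    "RB1": {"RB2": 1, "RB3": 1},
    "RB2": {"RB1": 1, "RB4": 1},
    "RB3": {"RB1": 1, "RB4": 3},
    "RB4": {"RB2": 1, "RB3": 3},
}


def shortest_paths(topology, source):
    """Dijkstra: return {node: (total_cost, first_hop)} as seen from source."""
    best = {source: (0, None)}
    queue = [(0, source, None)]
    while queue:
        cost, node, first_hop = heapq.heappop(queue)
        if cost > best[node][0]:
            continue  # stale queue entry, a cheaper path was already found
        for neighbor, link_cost in topology[node].items():
            new_cost = cost + link_cost
            # The first hop is the neighbor itself when leaving the source.
            hop = neighbor if first_hop is None else first_hop
            if neighbor not in best or new_cost < best[neighbor][0]:
                best[neighbor] = (new_cost, hop)
                heapq.heappush(queue, (new_cost, neighbor, hop))
    return best


if __name__ == "__main__":
    for node, (cost, hop) in sorted(shortest_paths(TOPOLOGY, "RB1").items()):
        print(node, cost, hop)
```

From RB1, the cheapest path to RB4 goes via RB2 at a cost of 2, even though RB3 is also connected to RB4. Every link stays active and the path is chosen by cost, rather than Spanning-Tree blocking one of the links.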

TRILL has been implemented in the Nexus 7000 line for some time now, and is also being discussed as a possible component of the much anticipated Juniper QFabric. It is one of the hot topics of discussion for next generation data center designs. I say 'next generation' because I have yet to see or read about it implemented in production.

Here's a pretty good excerpt from the RFC that gives a general overview of how TRILL works:

RBridges run a link state protocol amongst themselves. This gives
them enough information to compute pair-wise optimal paths for
unicast, and calculate distribution trees for delivery of frames
either to destinations whose location is unknown or to
multicast/broadcast groups [RBridges] [RP1999].

To mitigate temporary loop issues, RBridges forward based on a header
with a hop count. RBridges also specify the next hop RBridge as the
frame destination when forwarding unicast frames across a shared-
media link, which avoids spawning additional copies of frames during
a temporary loop. A Reverse Path Forwarding Check and other checks
are performed on multi-destination frames to further control
potentially looping traffic (see Section 4.5.2).

The first RBridge that a unicast frame encounters in a campus, RB1,
encapsulates the received frame with a TRILL header that specifies
the last RBridge, RB2, where the frame is decapsulated. RB1 is known
as the "ingress RBridge" and RB2 is known as the "egress RBridge".
To save room in the TRILL header and simplify forwarding lookups, a
dynamic nickname acquisition protocol is run among the RBridges to
select 2-octet nicknames for RBridges, unique within the campus,
which are an abbreviation for the IS-IS ID of the RBridge. The
2-octet nicknames are used to specify the ingress and egress RBridges
in the TRILL header.

Multipathing of multi-destination frames through alternative
distribution trees and ECMP (Equal Cost Multipath) of unicast frames
are supported (see Appendix C).
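
The encapsulation and hop count behavior described in the excerpt can be sketched in a few lines of Python. This is a simplified model for illustration only: the class and function names are made up, the nickname values are arbitrary, and it ignores the real header layout and everything to do with multi-destination frames.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TrillFrame:
    ingress_nickname: int  # 2-octet nickname of the RBridge that encapsulated
    egress_nickname: int   # 2-octet nickname of the RBridge that decapsulates
    hop_count: int         # decremented by each transit RBridge
    payload: bytes         # the original Ethernet frame


def encapsulate(payload, ingress, egress, hop_count):
    """Ingress RBridge: wrap the received frame in a (simplified) TRILL header."""
    for nickname in (ingress, egress):
        if not 0 <= nickname <= 0xFFFF:
            raise ValueError("nicknames must fit in 2 octets")
    return TrillFrame(ingress, egress, hop_count, payload)


def forward(frame, my_nickname):
    """One RBridge's decision: deliver, forward with a lower hop count, or drop."""
    if frame.egress_nickname == my_nickname:
        return ("deliver", frame.payload)          # egress RBridge decapsulates
    if frame.hop_count == 0:
        return ("drop", None)                      # stops a temporary loop
    return ("forward", replace(frame, hop_count=frame.hop_count - 1))


if __name__ == "__main__":
    frame = encapsulate(b"original frame", ingress=0x0001, egress=0x0002, hop_count=1)
    action, frame = forward(frame, my_nickname=0x0003)
    print(action, frame.hop_count)
    print(forward(frame, my_nickname=0x0004)[0])
    print(forward(frame, my_nickname=0x0002)[0])
```

The point of the hop count is visible in the second call: a frame that keeps circulating among transit RBridges eventually reaches zero and is dropped, while a frame that arrives at its egress RBridge is handed back as the original payload.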

Tuesday, April 12, 2011

Cisco Releases IOU to the Masses

So finally Cisco has released their coveted IOU (IOS On Unix) that has been internal only to Cisco for many years. In a nutshell, it is an emulator that can run any normal IOS code as it would on actual equipment.

The catch is that they are releasing it under Cisco Learning Labs. You can't just download the software and use it yourself; you must rent labs like you would for an actual rack of Cisco gear for IE studying. Still not good enough, Cisco.

Currently, they only have labs for CCNA ICND2, CCNP ROUTE, CCNP SWITCH, CCNP TSHOOT and CCIP MPLS, which will cost $75 for 25 hours (only $50 for CCNA lab). Additional 5 hour blocks can also be added for an additional $20.

More information here:
http://newsroom.cisco.com/dlls/2011/prod_041211.html