by Micheline Murphy
Virtual port channels are the tricksters of the networking world. They were originally developed as a con game, a way to link two switches together and fool the rest of the network into thinking they were only a single switch. The biggest driver for developing vPC technology was to avoid having spanning-tree come along and choke off half of your links.[i] You can see the remnants of vPC’s pirate past in some of the features available to it such as peer-switch, which allows both vPC peers to use the same MAC address for spanning-tree.
In today’s data center, spanning tree isn’t as much of an issue, but high availability and redundancy are. vPCs provide both. In this latest installment of “…just for fun!” let’s take a look at a fully functioning virtual port channel in a number of data center environments. As with my other blogs, I'm not going to go deep into the technology, but if you want a really good, deep read on vPC, I would recommend Chapter 6 of Data Center Virtualization Fundamentals, by Gustavo Santana.
As always, let’s start with some topology. One of vPC’s limitations is that it is only supported in one physical topology—a triangle. Two switches and a downstream device. The downstream device is tricked into thinking it is only connected to one device by a port-channel.
The other thing to remember about vPC is that it is a Layer 2 technology. The con game that a vPC pair plays only works at Layer 2. vPC has no mechanism to maintain the illusion at Layer 3, so Layer 3 devices will see two switches. So, while L3 support is available over a vPC, as the saying goes, “just because you can do a thing doesn’t mean you should.”[ii]
The Big Mamma of Show Commands: show vpc
Unlike a lot of other technologies that hide their goodies like Easter eggs behind a billion different show commands, virtual port channels have one Big Mamma of show commands. Let’s look at it. To start things off light and simple, this is a vPC that has been configured in complete isolation. Here it is in all its naked glory: show vpc.
There's a ton of great information here, so let's just start at the top. The most important thing is to be able to determine the status of your vPC pair. When the vPC keep-alive status says "peer is alive" that means that peer-keepalive is functional, and "peer adjacency formed ok" means that the peer-link is up. A fully-fit vPC pair will have "success" on all of its consistency checks.
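If you collect show vpc from a script, those three checks boil down to a few string comparisons. Here is a minimal Python sketch, assuming you have already parsed the output into a dictionary. The key names (keepalive_status, peer_status, consistency) and the function name are placeholders of my own, not NX-OS field names.

```python
def vpc_pair_is_healthy(status):
    """Return (healthy, problems) for an already-parsed `show vpc` summary.

    `status` is a hypothetical dict; its keys are illustrative only.
    """
    problems = []
    # "peer is alive" means the peer-keepalive is functional.
    if status.get("keepalive_status") != "peer is alive":
        problems.append("peer-keepalive is not functional")
    # "peer adjacency formed ok" means the peer-link is up.
    if status.get("peer_status") != "peer adjacency formed ok":
        problems.append("peer-link is not up")
    # Every consistency check should report "success".
    for name, result in status.get("consistency", {}).items():
        if result != "success":
            problems.append(f"consistency check failed: {name}")
    return (not problems, problems)


if __name__ == "__main__":
    sample = {
        "keepalive_status": "peer is alive",
        "peer_status": "peer adjacency formed ok",
        "consistency": {"configuration": "success", "per-vlan": "success"},
    }
    print(vpc_pair_is_healthy(sample))
```

Returning the list of problems along with the verdict makes the function handy for alerting, since the message can say which of the three checks tripped.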
The next most important thing is to be able to determine what role the local switch is playing. You can see above that this switch is the secondary. Or, you can use the command show vpc role, which I’ve shown below... along with the commands you need to switch the switch role.
Whether a switch is primary or secondary is pretty important to understanding how a switch will behave in a failover scenario. Sometimes you will see the role as "primary, operational secondary" or "secondary, operational primary." This means that there has been a failover event and the switch is now behaving as whatever operational role is indicated. Why is this so? vPC doesn't have any pre-emption mechanism. Role election only happens when the peer-keepalive is established, not when the peer-link flaps.[iii] Statically configuring the vPC role is disruptive.
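To see where those two-part role strings come from, here is a toy Python model of the election and a failover. It is a sketch under assumptions, not NX-OS internals: the Peer class and function names are made up, and the election rule (lower role priority wins, lower system MAC breaks a tie) is my understanding of the documented behavior.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    role_priority: int      # assumption: lower value wins the election
    system_mac: str         # assumption: lower MAC breaks a tie
    configured: str = ""    # role assigned by the last election
    operational: str = ""   # role the switch is acting in right now


def elect(a, b):
    """Initial election: sets both the configured and operational roles."""
    primary, secondary = sorted(
        (a, b), key=lambda p: (p.role_priority, p.system_mac.lower())
    )
    primary.configured = primary.operational = "primary"
    secondary.configured = secondary.operational = "secondary"


def primary_fails(survivor):
    """The surviving secondary takes over; its elected role does not change."""
    if survivor.operational == "secondary":
        survivor.operational = "primary"


def peer_returns(returning, survivor):
    """No pre-emption: the returning switch takes whichever role is free."""
    returning.operational = (
        "secondary" if survivor.operational == "primary" else "primary"
    )


def show_role(peer):
    """Format the role the way the two-part role string reads."""
    if peer.configured == peer.operational:
        return peer.configured
    return f"{peer.configured}, operational {peer.operational}"


if __name__ == "__main__":
    leaf1 = Peer("leaf1", 100, "00:00:00:00:00:01")
    leaf2 = Peer("leaf2", 200, "00:00:00:00:00:02")
    elect(leaf1, leaf2)
    primary_fails(leaf2)        # leaf1 dies, leaf2 takes over
    peer_returns(leaf1, leaf2)  # leaf1 comes back, nobody pre-empts
    print(show_role(leaf1))
    print(show_role(leaf2))
```

The point of the model is that the elected role and the operational role are tracked separately, and only an election ever rewrites the first one.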
Moving right along, you can see that sh vpc also gives you a listing of various vPC-specific features and whether they are on or off. Peer-gateway works in conjunction with HSRP, and configures both peers to forward traffic destined for the default gateway. Dual-active exclude VLANs are VLANs whose SVIs have been configured to stay up on the secondary peer when the peer-link fails, instead of being shut down with the rest. Graceful consistency (which is enabled by default) enables the primary peer to remain operational in the event of a Type 1 inconsistency that would otherwise shut the vPC down. Peer-router enables a Layer 3 device to form a peering with both peers, and auto-recovery (which is also enabled by default) permits a single switch to restore the vPC in the event that its peer fails to come online. Note that this list might vary slightly from platform to platform. For example, an N9000 switch running Release 7.0(3)I4(2) will also tell you about its delay timers. The output above is from a Nexus 5K running Release 7.3(1)N1(1) code.
Next, sh vpc tells you about the health of the peer-link and its vPC member ports. You want to make sure that your peer-link is up and trunking all VLANs on any member port.
One thing to keep in mind is that vPC has a horrifically long timer for bringing up member ports after a failed peer-link has been restored. The minimum is 240 seconds. That means that for four minutes you will get a sh vpc output that shows the peers are alive, the peer-link is up, and all the consistency checks have passed, but the members are all "down*".
Just be patient. Go to the bathroom, refill your coffee, and eat a stroopwafel.
Consistency Counts: Looking at the Consistency Checks
Wow! That was a lot, just out of one command! The other big vPC-specific command is sh vpc consistency-parameters. This command has a whole series of keywords so that you can dial in the output, but if your vPC isn't coming up because of Type 1 inconsistencies, or you're losing VLANs because of Type 2 inconsistencies, this command is your go-to. The outputs can be pretty lengthy, so I'm only going to show you one.
Here you can see that I used a couple of keywords to output VLAN-specific consistency checks for port-channel 101. The way you want to read this output is just by going down the two right-hand columns (Local Value and Peer Value) and comparing. A fully functional vPC will have identical outputs in each column. If you don't have identical outputs, your consistency check fails. Depending on whether the check is a Type 1 or Type 2 (which is listed in the second column from the left), either the vPC will fail or the offending VLANs will be suspended. The output also very helpfully reminds you that Type 1 inconsistencies will cause the vPC to be suspended.
Wait a second! What’s going on here?! As you can see, this output does not meet the stare-n-glare test. The two columns are very different. Specifically, the local switch allows VLANs 101, 102, and 103 while the remote peer allows only VLANs 101 and 102. The result is that VLAN 103 is pruned from the remote switch. Like so:
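If you would rather not eyeball it, the same stare-n-glare comparison can be scripted. This is a minimal Python sketch that takes the Local Value and Peer Value strings for the allowed-VLAN row and reports the difference. The function names are my own, and it assumes the values are plain comma-and-dash VLAN lists such as "101-103".

```python
def parse_vlans(text):
    """Expand an allowed-VLAN string such as "101-103,200" into a set of ints."""
    vlans = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            vlans.update(range(int(low), int(high) + 1))
        else:
            vlans.add(int(part))
    return vlans


def compare_allowed_vlans(local_value, peer_value):
    """Compare the two right-hand columns and report where they differ."""
    local_set = parse_vlans(local_value)
    peer_set = parse_vlans(peer_value)
    return {
        "common": sorted(local_set & peer_set),
        "local_only": sorted(local_set - peer_set),
        "peer_only": sorted(peer_set - local_set),
    }


if __name__ == "__main__":
    # Values from the example: local allows 101-103, the peer allows 101-102.
    print(compare_allowed_vlans("101-103", "101-102"))
    # prints {'common': [101, 102], 'local_only': [103], 'peer_only': []}
```

Anything that lands in local_only or peer_only is a VLAN to chase down on one side or the other.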
Now let's move away from isolation and put this vPC in some more real-world situations. AKA, time to break stuff!
When Things Go All Pear-Shaped: vPC Failure Scenarios
Quite possibly, the most important thing you need to know about vPCs is how vPC behaves when things go all pear-shaped. And specifically, what happens when the peer-link and/or the peer-keepalive fail.
But before we even get to the dreaded split brain, let's understand the mechanisms in place to prevent a catastrophic failure of connectivity when a vPC peer fails. If the peer-link fails, the vPC peers (presumably) still can reach each other via the keep-alive. Here’s the blow-by-blow of this failure scenario:
- The peer-link fails.
- The secondary switch pings the primary via the keep-alive. If the ping comes back, the secondary shuts down all of its member ports and SVIs. If the ping doesn't come back, the secondary assumes the primary is dead and assumes primary position.
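The blow-by-blow above is really just one small decision. Here is a hedged Python sketch of it. The function name and the returned strings are my own shorthand, and the primary's branch (it simply carries on) is my assumption, since the steps above only describe the secondary.

```python
def on_peer_link_failure(role, keepalive_reply):
    """Decide what a vPC peer does once the peer-link has gone down.

    role            -- "primary" or "secondary" (operational role)
    keepalive_reply -- True if the peer still answers over the keep-alive
    """
    if role == "primary":
        # Assumption: the primary keeps forwarding on its member ports.
        return "remain primary and keep forwarding"
    if keepalive_reply:
        # The primary is alive, so the secondary gets out of the way.
        return "suspend vPC member ports and SVIs"
    # No answer: the secondary assumes the primary is dead.
    return "take over as primary"


if __name__ == "__main__":
    for reply in (True, False):
        print(reply, "->", on_peer_link_failure("secondary", reply))
```

Notice that the last branch is also the doorway to trouble: if the primary is actually alive and only the keep-alive path is broken, both switches end up acting as primary.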
Here’s what that process looks like from the secondary switch’s POV:
pod1-leaf2(config)# 2019 Mar 25 15:54:46 pod1-leaf2 %ETH_PORT_CHANNEL-5-FOP_CHANGED: port-channel1: first operational port changed from Ethernet1/1 to Ethernet1/2
2019 Mar 25 15:54:46 pod1-leaf2 %ETH_PORT_CHANNEL-5-PORT_DOWN: port-channel1: Ethernet1/1 is down
2019 Mar 25 15:54:46 pod1-leaf2 %ETH_PORT_CHANNEL-5-FOP_CHANGED: port-channel1: first operational port changed from Ethernet1/2 to none
2019 Mar 25 15:54:46 pod1-leaf2 %ETHPORT-5-IF_DOWN_LINK_FAILURE: Interface Ethernet1/1 is down (Link failure)
2019 Mar 25 15:54:50 pod1-leaf2 %VPC-2-VPC_SUSP_ALL_VPC: Peer-link going down, suspending all vPCs on secondary. If vfc is bound to vPC, then only ethernet vlans of that VPC shall be down.
Split brain occurs when the peer-link goes down, and for whatever reason (usually bad design) both peers think that they're primary. That is, the keepalive also fails. The symptoms of this scenario include black-holed traffic, congestion, L2 loops, and (believe it or not) nothing. If you have a worst-case scenario, in which both peers remain up, then both peers will interact with STP according to normal STP rules. If you have both switches advertising the same system ID (using the peer-switch command) and you followed vPC best practice and made the vPC peer pair the root, then the whole STP domain will flap as the root switch appears to be moving back and forth between peers. If there's more STP domain northbound of the vPC, and the peers aren't the root bridge, you will still get flapping as the STP domain keeps trying to converge on a seemingly ever-changing topology. Either way, things are no bueno.
How does an L2 loop occur? Let's say that we have Host1 connected to SW1 and SW2, which were our vPC pair, but now they are split brained. SW1 and SW2 are dual connected to SW3 and SW4, which are our L2/L3 gateways. Host1 needs a MAC address, so it ARPs for it. The ARP goes to SW1 and SW2, our broken vPC pair. Both SW1 and SW2 flood the ARP request out all ports. The ARP request goes up to SW3 and SW4, twice. When SW3 and SW4 receive the ARP, they flood it back down to SW1 and SW2... round and round we go.
Best practice is to avoid the split-brain scenario altogether.
- Configure the peer-link and the keep-alive from entirely independent resources. Different line cards, different VRFs, different networks. This is why best practice recommends that the keep-alive be in the management VRF.
- Use a port-channel for the keep-alive. Although the keep-alive works just fine on a single link, port-channels are much hardier.
- Use a dedicated sub-interface for the keep-alive. This can be used in conjunction with a port-channel.
- Create a dedicated SVI for the keep-alive using a non-vPC VLAN with an independent L2 link.
Finally, if you have a failure event in your vPC, THE Timer is going to kick in! Remember what I said earlier about THE Timer?
Whew! Let's take a breather!
In my next installment of "...Just for Fun!" I'll tackle vPCs in a VXLAN and ACI environment. But in the meantime, please share your experience with vPC and its pirate past.
[i] See for example, Data Center Virtualization Fundamentals by Gustavo Santana (2014). The chapter covering virtual port channels is entitled “Fooling Spanning Tree.”
[ii] “To connect Layer 3 devices to a vPC domain, use Layer 3 links from Layer 3 devices to connect to each vPC peer device.” Cisco Nexus 9000 Series NX-OS Interfaces Configuration Guide, Release 7.x (2018). See also Design and Configuration Guide: Best Practices for Virtual Port Channels on Cisco Nexus 7000 Series Switches (2016); Supported Topologies for Routing over Virtual Port Channels on Nexus Platforms.