We Don't Need No Stinkin' Flags! ACI External EPG Subnet Flags...Just for Fun!

by Micheline Murphy

 

In ACI, the L3Out is a veritable Howl’s Moving Castle[i] of configuration whose ultimate goal is to deliver external connectivity to the endpoints in the ACI fabric. All told, I think it takes something in excess of twenty steps to go from zero to full connectivity between an outside subnet and internal EPG members. That includes configuring all of the prerequisites needed to support an L3Out, all of the steps that enable internal EPGs to share their own subnets, and all of the contract config between EPGs. Representing the whole thing is the external EPG, which might just be the single most complicated object in the whole curious and delicate complex.

 

In this latest installment of ...Just for Fun, I take a deep dive into the external EPG and explore each of its eight flags.

 

Topology

As always, I like to start with a tour of the local topology. Here you go.

 

In this topology, Leaf 101 and Leaf 102 belong to the ACI fabric. I just teased the two border leaf switches out of the cloud so we could see the important bits. As you can see, this physical topology will require the building of two L3Outs. Unsurprisingly, I called one L3Out_via_ASR-a and the other L3Out_via_ASR-b. Both L3Outs are associated with the same tenant, Bluefish.

 

There are four subnets involved: two /31 subnets for transit between each border leaf and its peer ASR router, and two /24 subnets that are accessible via either ASR. For ACI, I’m using Release 3.2(4e), and the ASRs are Cisco 1002s running IOS-XE Release 15.5(3)S4a.

 

I’m not going to go through the nitty-gritty of building the L3Out, but if you are interested in building an L3Out, I covered the topic in “Walking on the Wild Side: ACI External Layer 3 Networks...Just for Fun”.[ii]  Here, our starting point is that both L3Outs work and are passing routes.[iii]  L3Out_via_ASR-a uses OSPF and L3Out_via_ASR-b uses eBGP over OSPF.

 

The Flag that is No Flag: Import Route Control Enforcement

First, let’s talk about the flag that isn’t. And that is Import Route Control Enforcement. Import Route Control Enforcement is an innocuous looking little check box that’s easy to skip over when you’re configuring the L3 Outside. If you look about halfway down, just before the VRF, you’ll find the little critter.

 

It’s checked here, but by default it is not. The default behavior (that is, IMPORT = False, or unchecked) is for ACI to import all routes advertised to it from any peers on this L3Out. When the box is checked (IMPORT = True), ACI will only import specifically tagged routes.

 

Messing with Import Route Control Enforcement is not recommended, but if for some reason you need to lock down what routes come into your ACI fabric, you will need to be able to configure the corresponding flag that lets routes come into your fabric. That flag is called Import Route Control Subnet, and it is the first flag we will cover. You configure the Import Route Control Subnet flag on the External Network Instance Profile, external EPG, for short. If you look at the screenshot below, you can see where you need to navigate.

 

From here, you scroll down the Work Pane until you see Subnets. Double-clicking a subnet will bring you to a pop-up window where all of our external EPG subnet flags reside.  Like this:

 

To configure the Import flag, check the box. Hit the submit button. Easy-peasy. But more important than just being able to configure this flag, we need to know what it does. With IMPORT = True, you must set this flag on any subnet you want ACI to learn from an external neighbor. Let’s take a look at the border leaf routing table with IMPORT = True, first with no flag.

 

apic1# fab 101 show ip route 172.50.1.0 vrf Bluefish:VRF1

----------------------------------------------------------------

Node 101 (aci1-leaf-101)

----------------------------------------------------------------

IP Route Table for VRF "Bluefish:VRF1"

'*' denotes best ucast next-hop

'**' denotes best mcast next-hop

'[x/y]' denotes [preference/metric]

'%<string>' in via output denotes VRF <string>

 

Route not found

 

... And now after the flag is checked.

 

apic1# fab 101 show ip route 172.50.1.0 vrf Bluefish:VRF1

----------------------------------------------------------------

Node 101 (aci1-leaf-101)

----------------------------------------------------------------

IP Route Table for VRF "Bluefish:VRF1"

'*' denotes best ucast next-hop

'**' denotes best mcast next-hop

'[x/y]' denotes [preference/metric]

'%<string>' in via output denotes VRF <string>

 

172.50.1.0/24, ubest/mbest: 1/0

    *via 172.30.0.2, eth1/48, [110/41], 00:00:08, ospf-default, intra
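The before-and-after behavior above can be sketched in a few lines of Python. This is only a toy model of the decision, not how the leaf implements it: the function and parameter names are my own, and it assumes an exact prefix match between the advertised route and the flagged subnet.

```python
from ipaddress import ip_network


def installed_routes(advertised, enforce_import, import_flagged):
    """Return the prefixes the border leaf would accept from an external peer.

    advertised     -- prefixes the peer advertises (strings)
    enforce_import -- state of Import Route Control Enforcement on the L3Out
    import_flagged -- external EPG subnets carrying Import Route Control Subnet
    """
    advertised = [ip_network(p) for p in advertised]
    if not enforce_import:
        # Default behavior (IMPORT = False): every advertised route is imported.
        return advertised
    allowed = {ip_network(p) for p in import_flagged}
    # Assumption for this sketch: exact prefix match only.
    return [p for p in advertised if p in allowed]


peer_routes = ["172.50.1.0/24"]
print(installed_routes(peer_routes, enforce_import=True, import_flagged=[]))
print(installed_routes(peer_routes, enforce_import=True,
                       import_flagged=["172.50.1.0/24"]))
```

The first call corresponds to "Route not found", the second to the OSPF route showing up in the table.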

 

We DO Need Some Stinking Flags: The Default Flag

If you go to add a new subnet to the external EPG, there’s always one flag that starts off as checked, the default flag, external subnets for the External EPG. This flag associates subnets with the external EPG. Without it, routes might pass, but traffic won’t be allowed because no contract will recognize the subnet as belonging to an EPG.

 

Let’s take a deeper look by examining 172.50.1.0, the subnet from ASR-a. I’ve gone and taken off all of its flags. First, we can see that the ACI fabric clearly receives the route from ASR-a. We can confirm that by both GUI and CLI. In the GUI, we can navigate to the OSPF Routes folder under the Configured Node of L3Out_via_ASR-a.

 

From the CLI, we can use the command fabric 101 show ip route 172.50.1.0 vrf Bluefish:VRF1.

 

apic1# fab 101 show ip route 172.50.1.0 vrf Bluefish:VRF1

----------------------------------------------------------------

Node 101 (aci1-leaf-101)

----------------------------------------------------------------

IP Route Table for VRF "Bluefish:VRF1"

'*' denotes best ucast next-hop

'**' denotes best mcast next-hop

'[x/y]' denotes [preference/metric]

'%<string>' in via output denotes VRF <string>

 

172.50.1.0/24, ubest/mbest: 1/0

    *via 172.30.0.2, eth1/48, [110/41], 00:40:41, ospf-default, intra

 

Using either the GUI or the CLI, we can confirm that a contract has been configured between the external EPG and an internal EPG, EPG-A. Here’s the GUI confirmation that there is a contract between the external EPG and EPG-A.

 

From the CLI, we need to cross reference the pcTags of each EPG. Recall that in ACI, each packet is tagged with an identifier linking it to an EPG. That tag is called the pcTag and can be found on the Policy page of either EPG, about six lines down from the top. In our example, the external EPG is 16395 and EPG-A is 16388. And below, you can see that there is a rule (aka a contract) permitting traffic between our external EPG and EPG-A.

 

apic1# fabric 101 show zoning-rule src-epg 16395

----------------------------------------------------------------

Node 101 (aci1-leaf-101)

----------------------------------------------------------------

Rule ID  SrcEPG  DstEPG  FilterID  operSt   Scope    Action  Priority
=======  ======  ======  ========  =======  =======  ======  =============
4112     16395   16394   9         enabled  2260992  permit  fully_qual(7)
4138     16395   16388   5         enabled  2260992  permit  fully_qual(7)
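If you would rather not eyeball the pcTags, the cross-reference is easy to script. The snippet below is a minimal sketch that parses rows shaped like the output above; the field names and helper functions are my own, and it assumes every rule row has exactly eight whitespace-separated columns.

```python
ZONING_OUTPUT = """\
Rule ID SrcEPG DstEPG FilterID operSt Scope Action Priority
======= ====== ====== ======== ====== ===== ====== ========
4112 16395 16394 9 enabled 2260992 permit fully_qual(7)
4138 16395 16388 5 enabled 2260992 permit fully_qual(7)
"""

# Column names chosen for this sketch; they mirror the header row.
FIELDS = ("rule_id", "src_epg", "dst_epg", "filter_id",
          "oper_st", "scope", "action", "priority")


def parse_rules(text):
    """Turn zoning-rule rows into dicts, skipping header and separator lines."""
    rules = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == len(FIELDS) and parts[0].isdigit():
            rules.append(dict(zip(FIELDS, parts)))
    return rules


def has_permit(rules, src_tag, dst_tag):
    """True if an enabled permit rule exists from src_tag to dst_tag."""
    return any(
        r["src_epg"] == str(src_tag)
        and r["dst_epg"] == str(dst_tag)
        and r["action"] == "permit"
        and r["oper_st"] == "enabled"
        for r in rules
    )


rules = parse_rules(ZONING_OUTPUT)
print(has_permit(rules, 16395, 16388))  # external EPG -> EPG-A
```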

 

But, a ping from 172.50.1.1 to a VM within EPG-A (172.30.1.101) fails.

 

asr1002-a#ping

Protocol [ip]:

Target IP address: 172.30.1.101

Repeat count [5]:

Datagram size [100]:

Timeout in seconds [2]:

Extended commands [n]: y

Ingress ping [n]:

Source address or interface: 172.50.1.1

Type of service [0]:

Set DF bit in IP header? [no]:

Validate reply data? [no]:

Data pattern [0x0000ABCD]:

Loose, Strict, Record, Timestamp, Verbose[none]:

Sweep range of sizes [n]:

Type escape sequence to abort.

Sending 5, 100-byte ICMP Echos to 172.30.1.101, timeout is 2 seconds:

Packet sent with a source address of 172.50.1.1

.....

Success rate is 0 percent (0/5)

 

Why? Because ACI doesn’t recognize 172.50.1.1 as “belonging” to the external EPG. Since the subnet doesn’t belong to the external EPG, the contract between EPG-A and the external EPG doesn’t allow the traffic. When we check the external subnets for the external EPG flag, that changes. And voilà! We have connectivity!

 

asr1002-a#ping       

Protocol [ip]:

Target IP address: 172.30.1.101

Repeat count [5]:

Datagram size [100]:

Timeout in seconds [2]:

Extended commands [n]: y

Ingress ping [n]:

Source address or interface: 172.50.1.1

Type of service [0]:

Set DF bit in IP header? [no]:

Validate reply data? [no]:

Data pattern [0x0000ABCD]:

Loose, Strict, Record, Timestamp, Verbose[none]:

Sweep range of sizes [n]:

Type escape sequence to abort.

Sending 5, 100-byte ICMP Echos to 172.30.1.101, timeout is 2 seconds:

Packet sent with a source address of 172.50.1.1

!!!!!

Success rate is 100 percent (5/5), round-trip min/avg/max = 1/1/1 ms
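To make the "belonging" idea concrete, here is a rough Python sketch of the two-step check: classify the source address into an EPG, then look for a rule between the two pcTags. The pcTag values come from the example above; the function names and the set-of-tuples rule table are simplifications of my own and say nothing about how the leaf hardware does it.

```python
from ipaddress import ip_address, ip_network

EXT_EPG_TAG = 16395  # pcTag of the external EPG in this example
EPG_A_TAG = 16388    # pcTag of EPG-A in this example


def classify(src_ip, ext_epg_subnets):
    """Return the external EPG pcTag if src_ip falls in a flagged subnet."""
    addr = ip_address(src_ip)
    if any(addr in ip_network(s) for s in ext_epg_subnets):
        return EXT_EPG_TAG
    return None  # the source does not "belong" to the external EPG


def permitted(src_ip, dst_tag, ext_epg_subnets, rules):
    """A contract only helps once the source has been classified."""
    src_tag = classify(src_ip, ext_epg_subnets)
    return src_tag is not None and (src_tag, dst_tag) in rules


RULES = {(EXT_EPG_TAG, EPG_A_TAG)}

# No subnet carries the external subnets for the external EPG flag.
print(permitted("172.50.1.1", EPG_A_TAG, [], RULES))
# 172.50.1.0/24 carries the flag.
print(permitted("172.50.1.1", EPG_A_TAG, ["172.50.1.0/24"], RULES))
```

The contract is identical in both calls; only the classification changes, which is exactly the difference between the failed and the successful ping.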

 

Going from Point A to Point B: The Export Flag

Transit routing is used when the ACI fabric is connected to multiple domains that need to be able to communicate with each other. In order to do so, these external domains need to be able to use the ACI fabric for transit. For example, the ACI fabric might be connected to a legacy mainframe at one L3Out and the WAN router connected to another L3Out. In order for the mainframe to have connectivity to a remote data center or the Internet (via the WAN connection), the ACI fabric has to provide transit routing for the two domains. The next flag I want to talk about is the export flag, and you will see that this is THE flag that enables transit routing.

 

Here’s the flag. As with the external EPG flag, it’s a simple click and submit to configure, but the trick is knowing what the flag does.

 

This flag tells ACI to advertise this route out of the fabric at a certain L3Out. In our topology, we would want to advertise 172.50.1.0 out to ASR-b, and 172.50.2.0 out to ASR-a in order to create a full transit path through the ACI fabric. That is, we want 172.50.1.1 to be able to ping 172.50.2.1.

 

To achieve this result, we need to configure four subnets on each external EPG, each with the right flag. And actually, this provides a great example of how the external subnet flag and the export route control flag work. Let’s take a look at L3Out_via_ASR-b’s subnets. First, here are the transit subnets.

 

The first subnet is from ASR-a. You need ASR-b to have a route to this subnet so that it knows how to get to its next-hop. So, it will get the export route control flag. The second subnet is ASR-b’s attached transit subnet. ACI needs to have this subnet as part of the external EPG so it has a next-hop to ASR-b’s attached subnet, 172.50.2.0. So, the second subnet has the default flag on it, the external subnet flag.

 

As you might guess, the external EPG for the L3Out associated with ASR-a has the exact same subnets, but with their flags reversed. With all four subnets and the proper flags, the end-to-end ping works.

 

asr1002-a#ping

Protocol [ip]:

Target IP address: 172.50.2.1

Repeat count [5]:

Datagram size [100]:

Timeout in seconds [2]:

Extended commands [n]: y

Ingress ping [n]:

Source address or interface: 172.50.1.1

Type of service [0]:

Set DF bit in IP header? [no]:

Validate reply data? [no]:

Data pattern [0x0000ABCD]:

Loose, Strict, Record, Timestamp, Verbose[none]:

Sweep range of sizes [n]:

Type escape sequence to abort.

Sending 5, 100-byte ICMP Echos to 172.50.2.1, timeout is 2 seconds:

Packet sent with a source address of 172.50.1.1

!!!!!

Success rate is 100 percent (5/5), round-trip min/avg/max = 1/1/3 ms
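The mirrored flag layout is easier to see as data. The sketch below models only the two /24 subnets (the /31 transit subnets follow the same pattern), and the flag labels are shorthand of my own rather than anything ACI prints.

```python
EXPORT = "export-route-control"
EXTERNAL = "external-subnet-for-external-epg"

# Each external EPG lists the same subnets with the flags reversed.
EXT_EPGS = {
    "L3Out_via_ASR-a": {
        "172.50.1.0/24": {EXTERNAL},  # ASR-a's own attached subnet
        "172.50.2.0/24": {EXPORT},    # learned via ASR-b, advertised to ASR-a
    },
    "L3Out_via_ASR-b": {
        "172.50.1.0/24": {EXPORT},    # learned via ASR-a, advertised to ASR-b
        "172.50.2.0/24": {EXTERNAL},  # ASR-b's own attached subnet
    },
}


def advertised_out(l3out):
    """Prefixes flagged for advertisement out of the given L3Out."""
    return sorted(p for p, flags in EXT_EPGS[l3out].items() if EXPORT in flags)


def classified_at(l3out):
    """Prefixes placed in the given L3Out's external EPG for contracts."""
    return sorted(p for p, flags in EXT_EPGS[l3out].items() if EXTERNAL in flags)


for name in EXT_EPGS:
    print(name, "advertises", advertised_out(name),
          "and classifies", classified_at(name))
```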

 

Bailing out a Leaky Routing Table: The Shared Flags

Sometimes, an ACI design calls for an L3Out in one tenant to be used by EPGs in another tenant. ACI calls this design a shared L3Out. Network engineers will recognize the result as route leaking.

 

In order for an EPG to use an L3Out configured in a different tenant, a few things have to be configured. As you might guess, you need to configure a contract between the external EPG and the EPG wanting to use the L3Out. As with any inter-tenant communication, the contract is created in one tenant—the L3Out’s tenant—and is exported to another tenant—the EPG’s tenant.[iv]  The EPG’s subnet scope has to be adjusted to allow it to be shared externally.

 

Lastly, you need the proper flags on the external EPG subnets. There are two shared flags. The first, the shared route control subnet, identifies a subnet to leak to the consumer VRF. The second shared flag, the shared security import subnet, places a subnet into the external EPG for purposes of the exported contract between the external EPG and the consumer EPG.

 

Let’s take a look at some shared flags in action. In this case, we want to have VM-4, which belongs to EPG-1, to be able to ping 172.50.1.1, attached to ASR-a. EPG-1 is in a different tenant, named Onefish.

 

Here are the flags.

 

You can see here that this subnet attached to ASR-a keeps its other flags so as not to break connectivity with endpoints in Bluefish, but in addition, the subnet now sports two new flags. The shared route control flag tells ACI to leak this route into Onefish’s VRF. Doing a show ip route before and after confirms the leak.

The second shared flag tells ACI to include this subnet in the external EPG for purposes of the contract between Onefish’s EPG and the external EPG. Since we’re leaking routes, both the transit subnet (in this case 172.30.0.2/31) and the attached subnet (172.50.1.0/24) need both shared flags in order for VM-4 to ping 172.50.1.1. A ping before and after the flags are placed verifies connectivity.

 

[cisco@aci1-onefish-app-1-epg1-vm4 ~]$ ping 172.50.1.1 -c 5

PING 172.50.1.1 (172.50.1.1) 56(84) bytes of data.

 

--- 172.50.1.1 ping statistics ---

5 packets transmitted, 0 received, 100% packet loss, time 3999ms

 

[cisco@aci1-onefish-app-1-epg1-vm4 ~]$ ping 172.50.1.1 -c 5

PING 172.50.1.1 (172.50.1.1) 56(84) bytes of data.

64 bytes from 172.50.1.1: icmp_seq=1 ttl=252 time=0.317 ms

64 bytes from 172.50.1.1: icmp_seq=2 ttl=252 time=0.293 ms

64 bytes from 172.50.1.1: icmp_seq=3 ttl=252 time=0.273 ms

64 bytes from 172.50.1.1: icmp_seq=4 ttl=252 time=0.260 ms

64 bytes from 172.50.1.1: icmp_seq=5 ttl=252 time=0.277 ms

 

--- 172.50.1.1 ping statistics ---

5 packets transmitted, 5 received, 0% packet loss, time 4000ms

rtt min/avg/max/mdev = 0.260/0.284/0.317/0.019 ms
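The two shared flags do different jobs, and a destination is only reachable from the consumer tenant when both jobs are done. Here is a hedged Python sketch of that logic; the flag labels and the function are my own simplification, not ACI internals.

```python
from ipaddress import ip_address, ip_network

SHARED_ROUTE = "shared-route-control"       # leak the route to the consumer VRF
SHARED_SECURITY = "shared-security-import"  # classify for the exported contract


def consumer_can_reach(dest_ip, ext_epg_subnets):
    """True only if dest_ip is both leaked and classified for the contract.

    ext_epg_subnets maps a prefix string to the set of flags on that subnet.
    """
    addr = ip_address(dest_ip)
    leaked = any(addr in ip_network(p)
                 for p, flags in ext_epg_subnets.items()
                 if SHARED_ROUTE in flags)
    classified = any(addr in ip_network(p)
                     for p, flags in ext_epg_subnets.items()
                     if SHARED_SECURITY in flags)
    return leaked and classified


before = {"172.30.0.2/31": set(), "172.50.1.0/24": set()}
after = {
    "172.30.0.2/31": {SHARED_ROUTE, SHARED_SECURITY},
    "172.50.1.0/24": {SHARED_ROUTE, SHARED_SECURITY},
}

print(consumer_can_reach("172.50.1.1", before))
print(consumer_can_reach("172.50.1.1", after))
```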

 

Ground Zero: The Aggregate Flags

Up to now, all of the examples I’ve used have shown the external EPG configured with an actual subnet or subnets. But there is an alternative. You can use the all-zeros route.  The last three flags should only be used if you are configuring your external EPG with 0.0.0.0/0. They are the Aggregate Import, Aggregate Export, and Aggregate Shared Routes flags. As you might guess, these flags control the import and export of the all-zeros route within a tenant, and between tenants. Let’s look.

 

Here is the all-zeros route I configured on the external EPG for L3Out_via_ASR-a.

 

You can see that I’ve left Import Route Control Enforcement alone, so the Import Route Control Subnet flag is not needed. But the external subnets for external EPG flag and the export route control subnet flag are both selected. The aggregate export flag is selected, as well. As with enumerated subnets, the external subnets for external EPG flag places this subnet (AKA all subnets) into the external EPG for contract purposes, and the export route control subnet flag marks this subnet for advertisement out of this L3Out.

 

The new flag, the aggregate export flag, works in conjunction with the export route control subnet flag. It tells ACI to export all subnets that match 0.0.0.0/0 (AKA all subnets). If you use the all-zeros route without this flag, only the default route itself (0.0.0.0/0) is exported.
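One way to picture the difference is exact match versus "anything inside". The following Python sketch is a simplified model under that assumption; export_matches is a name of my own, not an ACI function.

```python
from ipaddress import ip_network


def export_matches(route, subnet="0.0.0.0/0", aggregate=False):
    """Would `route` be exported by an export-flagged `subnet`?

    Without the aggregate flag, only the identical prefix matches.
    With it, any prefix contained in the subnet matches.
    """
    route, subnet = ip_network(route), ip_network(subnet)
    if aggregate:
        return route.subnet_of(subnet)
    return route == subnet


print(export_matches("0.0.0.0/0"))                      # the default route itself
print(export_matches("172.50.1.0/24"))                  # a specific prefix, no aggregate
print(export_matches("172.50.1.0/24", aggregate=True))  # a specific prefix, aggregate set
```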

 

Lastly, in order to make the L3Out work across tenants, the aggregate shared flag is also selected. From ASR-a’s POV, it can reach both the Bluefish VM and the Onefish VM that we were able to reach before.

 

asr1002-a#ping

Protocol [ip]:

Target IP address: 172.30.1.101

Repeat count [5]:

Datagram size [100]:

Timeout in seconds [2]:

Extended commands [n]: y

Ingress ping [n]:

Source address or interface: 172.50.1.1

Type of service [0]:

Set DF bit in IP header? [no]:

Validate reply data? [no]:

Data pattern [0x0000ABCD]:

Loose, Strict, Record, Timestamp, Verbose[none]:

Sweep range of sizes [n]:

Type escape sequence to abort.

Sending 5, 100-byte ICMP Echos to 172.30.1.101, timeout is 2 seconds:

Packet sent with a source address of 172.50.1.1

!!!!!

Success rate is 100 percent (5/5), round-trip min/avg/max = 1/1/1 ms

asr1002-a#ping

Protocol [ip]:

Target IP address: 172.30.10.101

Repeat count [5]:

Datagram size [100]:

Timeout in seconds [2]:

Extended commands [n]: y

Ingress ping [n]:

Source address or interface: 172.50.1.1

Type of service [0]:

Set DF bit in IP header? [no]:

Validate reply data? [no]:

Data pattern [0x0000ABCD]:

Loose, Strict, Record, Timestamp, Verbose[none]:

Sweep range of sizes [n]:

Type escape sequence to abort.

Sending 5, 100-byte ICMP Echos to 172.30.10.101, timeout is 2 seconds:

Packet sent with a source address of 172.50.1.1

!!!!!

Success rate is 100 percent (5/5), round-trip min/avg/max = 1/1/1 ms

 

Practice Tip: The all-zeros route is the only time when the external subnet flag and the export route control flag are checked on the same subnet. All other subnets will get one flag or the other, but not both.

The original Ground Zero was the designation for a place on earth directly above or below an exploding nuclear bomb. Just as you do not want to be at a real Ground Zero, using the all-zeros route for an external EPG subnet has its dangers. Specifically, you get bomb-like conditions when you mix OSPF, eBGP, and the all-zeros route, such as in a transit routing situation or even on a single L3Out using eBGP over OSPF.

 

As you may remember, I configured L3Out_via_ASR-b using eBGP over OSPF. Let’s take a look at what happens when I configure the all-zeros route on that L3Out exactly as I configured L3Out_via_ASR-a above, with the external subnet flag, the export route control flag, and the aggregate export flag.

 

asr1002-b#show ip route 3.0.0.102

Routing entry for 3.0.0.102/32

  Known via "bgp 50", distance 20, metric 0

  Tag 30, type external

  Last update from 3.0.0.102 00:00:29 ago

  Routing Descriptor Blocks:

  * 3.0.0.102, from 3.0.0.102, 00:00:29 ago

      Route metric is 0, traffic share count is 1

      AS Hops 2

      Route tag 30

      MPLS label: none

asr1002-b#

Apr  8 17:33:58: RT: recursion error routing 3.0.0.102 - probable routing loop

Apr  8 17:34:13: RT: recursion error routing 3.0.0.102 - probable routing loop

 

 

What’s going on here? A quick look at the debug ip routing messages tells us.

 

Apr  8 17:39:03: RT: updating bgp 3.0.0.102/32 (0x0)  :

    via 3.0.0.102  0 1048577

 

Apr  8 17:39:03: RT: closer admin distance for 3.0.0.102, flushing 1 routes

Apr  8 17:39:03: RT(multicast): delete subnet route to 3.0.0.102/32

Apr  8 17:39:03: RT: add 3.0.0.102/32 via 3.0.0.102, bgp metric [20/0]

Apr  8 17:39:03: RT: updating bgp 172.30.0.4/31 (0x0) :

    via 3.0.0.102  0 1048577

 

OSPF’s route to 3.0.0.102 is via a legitimate next-hop, but when OSPF is redistributed into eBGP, the route’s next-hop is changed, from a legitimate next-hop, to the IP address of the eBGP peer, which in this case is 3.0.0.102. This change causes the route to 3.0.0.102 to point to itself. Since eBGP has a better administrative distance than OSPF, the correct OSPF route gets deleted in favor of the loopy eBGP route. Since 3.0.0.102 is the border leaf’s router ID for eBGP, this defective route breaks all connectivity.
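The failure is easier to see in a toy routing table. The Python sketch below picks the best route per prefix by administrative distance (20 for eBGP, 110 for OSPF) and then resolves the next hop recursively. The OSPF next hop 172.30.0.4 and the connected subnet are assumptions for illustration, since the real values are not shown in the output above, and the helper names are my own.

```python
from ipaddress import ip_address, ip_network


def best_routes(candidates):
    """Keep, per prefix, the candidate with the lowest administrative distance."""
    rib = {}
    for prefix, next_hop, distance in candidates:
        net = ip_network(prefix)
        if net not in rib or distance < rib[net][1]:
            rib[net] = (next_hop, distance)
    return rib


def resolve(dest, rib, connected, max_depth=8):
    """Follow next hops until one is directly connected; None means a loop."""
    addr = ip_address(dest)
    for _ in range(max_depth):
        if any(addr in ip_network(c) for c in connected):
            return str(addr)
        matches = [net for net in rib if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)  # longest match
        next_hop = ip_address(rib[best][0])
        if next_hop == addr:
            return None  # the route points at itself
        addr = next_hop
    return None


CONNECTED = ["172.30.0.4/31"]                  # assumed transit link on ASR-b
OSPF = ("3.0.0.102/32", "172.30.0.4", 110)     # hypothetical legitimate next hop
EBGP = ("3.0.0.102/32", "3.0.0.102", 20)       # next hop rewritten to the peer

print(resolve("3.0.0.102", best_routes([OSPF]), CONNECTED))
print(resolve("3.0.0.102", best_routes([OSPF, EBGP]), CONNECTED))
```

With only the OSPF route, the peer address resolves to a connected next hop. Once the eBGP copy wins on distance, resolution fails, which matches the "recursion error" messages on ASR-b.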

 

There are a couple of ways to work around this problem, but the easiest is to avoid using the all-zeros route in situations where you have OSPF redistributed into eBGP. For this reason, using the all-zeros route in transit routing is not recommended, and comes with a bunch of limitations.[v]

 

Bringing It All Home

Wow! That was really a lot of information! Let’s bring it home in something a little more digestible. Please feel free to use this chart, with proper attribution, of course.

Routing Type: External Routing
Use Case: Traffic between internal EPGs and external subnets. L3Out and internal EPGs are in the same tenant.
Flags:
  • External Subnets for the External EPG

Routing Type: Import Route Control
Use Case: You need to control which external routes the ACI fabric learns.
Flags:
  • Import Route Control Enforcement (on the L3 Outside)
  • Import Route Control Subnet

Routing Type: Transit Routing
Use Case: Traffic between two external subnets needs to cross the ACI fabric.
Flags:
  • External Subnets for the External EPG
  • Export Route Control Subnet

Routing Type: Route Leaking
Use Case: EPGs in one tenant need to use the L3Out in another tenant.
Flags:
  • Shared Route Control Subnet
  • Shared Security Import Subnet

Routing Type: All-zeros subnet (0.0.0.0/0)
Use Case: External EPG configured with the all-zeros route. The all-zeros route is not recommended for transit routing.
Flags:
  • Aggregate Import (if using Import Route Control Enforcement)
  • Aggregate Export
  • Aggregate Shared Routes
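If you configure through the REST API instead of the GUI, the same flags show up as attributes on the subnet object. The sketch below builds such a payload in Python. Treat the class name (l3extSubnet) and the scope and aggregate tokens as my recollection of the APIC object model rather than something taken from this lab, and check them against the API reference for your release before posting anything; the helper function is purely illustrative.

```python
import json

# GUI flag name -> token in the subnet's "scope" attribute (assumed values).
SCOPE = {
    "External Subnets for the External EPG": "import-security",
    "Export Route Control Subnet": "export-rtctrl",
    "Import Route Control Subnet": "import-rtctrl",
    "Shared Route Control Subnet": "shared-rtctrl",
    "Shared Security Import Subnet": "shared-security",
}

# GUI flag name -> token in the subnet's "aggregate" attribute (assumed values).
AGGREGATE = {
    "Aggregate Export": "export-rtctrl",
    "Aggregate Import": "import-rtctrl",
    "Aggregate Shared Routes": "shared-rtctrl",
}


def subnet_payload(ip, flags):
    """Build a JSON-style dict for one external EPG subnet."""
    unknown = [f for f in flags if f not in SCOPE and f not in AGGREGATE]
    if unknown:
        raise ValueError(f"unknown flag(s): {unknown}")
    scope = [SCOPE[f] for f in flags if f in SCOPE]
    aggregate = [AGGREGATE[f] for f in flags if f in AGGREGATE]
    return {
        "l3extSubnet": {
            "attributes": {
                "ip": ip,
                "scope": ",".join(scope),
                "aggregate": ",".join(aggregate),
            }
        }
    }


all_zeros = subnet_payload("0.0.0.0/0", [
    "External Subnets for the External EPG",
    "Export Route Control Subnet",
    "Aggregate Export",
])
print(json.dumps(all_zeros, indent=2))
```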

 

That’s it for now, but I would love to hear your comments, questions, and adventures in ACI. Please share!

 

 

 

 


[i] Howl’s Moving Castle (2004) by celebrated animation director Hayao Miyazaki, told the story of a young woman in a war-torn country who is cursed and must go on a quest to free herself.  Along the way, she meets the reclusive but very powerful wizard, Howl, who lives in a steampunk-y enchanted moving castle.  Why don’t you know this?

[ii] Walking on the Wild Side: ACI External Layer 3 Networks... Just for Fun

[iii] A quick comment about transit routing. While individual L3Outs will support any of a wide range of routing options, only some routing combinations will support transit routing. Discussing those combos is beyond the scope of this article, but if you want to check it out, see Cisco APIC Layer 3 Configuration Guide, Release 3.x and Earlier: Transit Routing for more details.

[iv] Going into how to configure an inter-tenant contract is beyond the scope of this article, but if you’re dying of curiosity, check out An Offer You Can't Refuse:  ACI Contracts... Just for Fun!

[v] See the link in endnote iii.