This document demonstrates the importance of BGP SoO (Border Gateway Protocol Site of Origin) in keeping certain multi-homed MPLS Layer 3 VPN sites loop-free. BGP SoO is an extended community attached to BGP updates that lets a router mark the routes learned from a particular peer as belonging to a particular site. In certain MPLS L3 VPN configurations, the BGP AS-Path does not provide the granularity needed to prevent a loop in the control plane. BGP SoO is designed to fill this gap and prevent the routing loop that may otherwise occur.
- Understanding of BGP Operations
- Understanding of BGP and IGP interaction
- Understanding of basic redistribution principles
- Knowledge of MPLS L3 VPNs is desirable though not mandatory
The Problem? - Looping Packets
To help demonstrate how looping packets are the result of a BGP advertisement error I will be using the following topology.
This is a typical multi-homed MPLS L3 VPN site, where multiple routers peer with the provider and share a common IGP on the back end (in this case, EIGRP). In this scenario CE1 is advertising its loopback of 22.214.171.124/32 into BGP. This is sent up to PE1, which then advertises it to PE2 via the VPNv4 peering. PE2 in turn advertises it down to CE2. Since CE2 has been configured to allow its own BGP AS in (because the provider uses the same Autonomous System number for both CEs), it installs this route into the BGP RIB. The route is then redistributed into EIGRP and advertised back to CE1.
Would this by itself cause a loop? No. Why? CE1 has two routing sources for 126.96.36.199/32: the directly connected Loopback1 (AD of 0) and an external EIGRP route from CE2 (AD 170). Since the directly connected route has the lower AD, it is used instead. Therefore, when you look at the EIGRP topology table, you will see that the feasible distance for the redistributed route is infinity.
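The AD comparison above can be sketched in a few lines of Python. This is only an illustrative model, not how a router implements route selection; the `Candidate` class and `best_route` function are names I made up for the example. The AD values (0 for connected, 170 for external EIGRP) are the ones quoted above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    """One routing source offering a path to the prefix (hypothetical model)."""
    source: str
    admin_distance: int
    next_hop: str


def best_route(candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the candidate with the lowest administrative distance, or None."""
    if not candidates:
        return None
    return min(candidates, key=lambda c: c.admin_distance)


connected = Candidate("connected", 0, "Loopback1")
eigrp_external = Candidate("eigrp-external", 170, "CE2")

# While Loopback1 is up, the connected route wins on AD.
print(best_route([connected, eigrp_external]).source)

# If the connected candidate disappears, the external route is all that is left.
print(best_route([eigrp_external]).source)
```

The second call hints at the failure case: once the connected route is gone, nothing stops CE1 from installing the route it learned back from CE2.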
If we run a trace from CE2, we can see that there is no loop. There is suboptimal routing, because the traffic traverses the provider network, but no routing loop.
The issue starts to arise when the connected route fails on CE1. For example, a shutdown of the loopback could cause this. EIGRP then immediately looks for a feasible successor route for the prefix. Since the redistributed route from CE2 is still in the topology table, CE1 immediately switches over and uses it. CE1 then withdraws and readvertises the 188.8.131.52/32 prefix into BGP with a new next-hop address. This is sent through the provider network to CE2, which redistributes it into EIGRP and advertises it back to CE1. The issue is that the 184.108.40.206/32 prefix is no longer reachable, yet it never gets removed from the routing table.
With the loopback shut down, we can see that the EIGRP external route is now installed and the BGP update carries a new next hop. Previously it was 0.0.0.0, as the route was locally connected.
The result is a permanent routing loop for the prefix 220.127.116.11/32 that will never resolve on its own.
Note: The timeouts are the PE routers. They were unable to return the ICMP time-exceeded messages because they have no route back to 18.104.22.168/32, which is the loopback of CE2.
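To make the loop easier to picture, here is a rough sketch that follows next hops for the loopback prefix from router to router. The next-hop tables are my own simplified reading of the scenario described above (one entry per router, for this one prefix only), and the `trace` helper is hypothetical, not a real traceroute.

```python
from typing import Dict, List, Tuple

# Assumed forwarding state for the loopback prefix BEFORE the failure.
# CE1 has no entry because the prefix is locally connected there.
BEFORE_FAILURE: Dict[str, str] = {
    "CE2": "PE2",  # eBGP route learned from the provider
    "PE2": "PE1",  # VPNv4 route
    "PE1": "CE1",  # eBGP route learned from CE1
}

# Assumed forwarding state AFTER the loopback is shut down:
# CE1 falls back to the EIGRP external route pointing at CE2.
AFTER_FAILURE: Dict[str, str] = dict(BEFORE_FAILURE, CE1="CE2")


def trace(start: str, next_hop: Dict[str, str], max_hops: int = 30) -> Tuple[List[str], bool]:
    """Follow next hops from start and return (path, looped)."""
    path = [start]
    seen = {start}
    current = start
    for _ in range(max_hops):
        nxt = next_hop.get(current)
        if nxt is None:
            return path, False  # no further hop: the packet stops here
        path.append(nxt)
        if nxt in seen:
            return path, True  # we are back at a router already visited
        seen.add(nxt)
        current = nxt
    return path, False  # hop limit reached without revisiting a router


print(trace("CE2", BEFORE_FAILURE))  # (['CE2', 'PE2', 'PE1', 'CE1'], False)
print(trace("CE2", AFTER_FAILURE))   # (['CE2', 'PE2', 'PE1', 'CE1', 'CE2'], True)
```

Before the failure the path ends at CE1. Afterwards CE1 hands the packet straight back to CE2, and it keeps circling until its TTL expires.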
The Solution? - BGP SoO
Since the AS-Path can no longer be used for loop prevention, the next step is to assign each route an extended community tag that allows BGP to identify whether a prefix came from a particular site. With the route tagged, BGP performs an extra check before advertising it on: it looks at whether the peer's site of origin is listed in the community field. If it is, the route is filtered. If not, the route is advertised as normal.
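The check described above can be modelled roughly as follows. This is a sketch of the logic only, not the IOS implementation. `Route`, `receive_from_ce` and `should_advertise` are hypothetical names, the prefix 192.0.2.1/32 is a documentation address standing in for the loopback, and the second site value 65535:2 is invented for the example (65535:1 is the value used in this demonstration).

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Route:
    prefix: str
    soo: Optional[str] = None  # Site of Origin extended community, if any


def receive_from_ce(route: Route, neighbor_soo: str) -> Route:
    """On the PE: stamp the neighbor's SoO on a route that carries none yet."""
    if route.soo is None:
        return replace(route, soo=neighbor_soo)
    return route


def should_advertise(route: Route, neighbor_soo: Optional[str]) -> bool:
    """On the PE: filter the route if its SoO matches the neighbor's site."""
    if route.soo is None or neighbor_soo is None:
        return True
    return route.soo != neighbor_soo


SITE_A = "65535:1"  # the multi-homed site (CE1 and CE2)
SITE_B = "65535:2"  # hypothetical second site

loopback = Route("192.0.2.1/32")              # placeholder prefix
tagged = receive_from_ce(loopback, SITE_A)    # PE1 learns it from CE1

print(should_advertise(tagged, SITE_A))  # False: PE2 will not send it to CE2
print(should_advertise(tagged, SITE_B))  # True: other sites still receive it
```

The key point is that the comparison is per site rather than per AS, which is why it keeps working when both CEs share the same AS number.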
This can be verified by showing what the PE routers are advertising to their peers. For example, before the SoO implementation the loopback was advertised back into the originating AS.
All that is needed is to configure a unique SoO code for each multi-homed site on the PE routers. Each site should have a unique site string value to ensure that the routes are still propagated between sites. This is only the case if BGP is used as the PE-CE protocol at more than one site. For this demonstration I assigned the site a value of 65535:1.
Note: Remember to enable the sending of extended communities for IOS and IOS-XE platforms.
The result is that the route is no longer advertised back into the AS, so CE2 prefers the IGP route to the destination.
In addition to this, we can see that the tag has been applied to the BGP update as it came in on PE1.
Now, if there is a failure of the prefix, the network will route around it and/or re-converge if needed, this time without causing a routing loop.
In this document we looked at how multi-homed MPLS L3 VPN sites that use BGP as the PE-CE protocol could cause a routing loop for the customer during particular failure scenarios. We also looked at how this can be resolved by using BGP SoO to tag the routes from a site and prevent them from being re-advertised back into it. Since BGP is a control-plane protocol, preventing the advertisement also prevents the loop in the data plane.
For more details on BGP SoO configurations please see - http://www.cisco.com/c/en/us/td/docs/ios/12_4t/12_4t11/htbgpsoo.html
Hope This Helps,