One of the most common reasons customers choose IWAN for their networks is its ability to use all available WAN paths intelligently. The simplest and most common way of splitting the load over the different outbound paths in PfRv3 is configuring different combinations of preferred and fallback paths for each explicitly specified class of traffic.
PfRv3 supports preferred, fallback and next-fallback path tiers. Each of these tiers may have up to three different paths specified. To better understand this, here is what such a configuration may look like:
path-preference Path1 Path2 Path3 fallback Path4 Path5 Path6 next-fallback Path7 Path8 Path9
Let’s examine the following example; please refer to Figure 1.
Imagine a branch with two MPLS circuits of different capacities (on MC/BR1 and BR2), one reliable Internet connection (on BR3) and one unreliable (WiFi or LTE) Internet connection (on BR4). Customers normally want to send their mission-critical traffic over MPLS and lower-priority traffic over the Internet. Because the MPLS circuits come from the same provider, they normally carry the same SLA for delay, jitter and loss, so it does not matter which exact MPLS circuit is chosen for the mission-critical traffic, even though they have different capacities. Ideally, customers would like to utilize both circuits proportionally. For the low-priority traffic, customers want to utilize the reliable Internet link. The unreliable Internet connection should only be used when all of the other links have failed or suffer service degradation; in PfRv3 terms, such a state is called Out of Policy (OOP).
Based on these requirements, a network designer would define the PfRv3 policy for mission-critical traffic by specifying both MPLS paths as preferred, the reliable Internet path as fallback and the unreliable Internet path as next-fallback. The low-priority traffic would have the preferred and fallback paths swapped. But how is PfRv3 going to manage sending mission-critical traffic over the two available MPLS paths with different capacities?
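On the hub master controller, such a policy might be sketched roughly as follows. This is a hedged illustration only: the domain name ONE, the path names MPLS1/MPLS2/INET/LTE, the class names and the DSCP matches are all assumptions for this example, and exact syntax varies by IOS release:

```
! Illustrative PfRv3 policy sketch on the hub MC (all names are assumed)
domain ONE
 vrf default
  master hub
   ! Mission-critical traffic: both MPLS paths preferred
   class MISSION-CRITICAL sequence 10
    match dscp af31 policy custom
    path-preference MPLS1 MPLS2 fallback INET next-fallback LTE
   ! Low-priority traffic: preferred and fallback tiers swapped
   class LOW-PRIORITY sequence 20
    match dscp default policy custom
    path-preference INET fallback MPLS1 MPLS2 next-fallback LTE
```

Note how each tier may list more than one path, as in the path-preference syntax shown earlier.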
With regular routing, traffic can be sent over multiple next hops using the equal-cost multi-path (ECMP) strategy supported by most routing protocols, or the unequal-cost strategy supported by EIGRP and BGP. In the context of this blog, the terms load-sharing and load-balancing are used interchangeably. On Cisco routers the actual packet distribution is done by CEF, and one of its methods is per-packet load-balancing, which brings the risk of out-of-order packets. On most Cisco platforms, CEF by default runs an algorithm based on a hash of the source and destination IP addresses, so the default unit CEF works with for load-sharing is a conversation between a pair of IP addresses. The router can also be configured to use Layer 4 information, so it can distinguish individual TCP/UDP sessions. It is essential to understand that PfRv3 does not work at such a granular level. The minimal unit PfRv3 works with is a traffic-class (TC): a combination of a destination prefix and a DSCP marking. In other words, a TC aggregates all traffic marked with a specific DSCP value and sent from any source to any destination within a prefix learned by PfRv3.
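For comparison, here is how the CEF behaviors mentioned above are typically configured on IOS; this is a generic sketch unrelated to PfRv3, and the interface name is an assumption:

```
! Include Layer 4 ports in the CEF hash so individual TCP/UDP
! sessions (not just IP pairs) are distinguished for load-sharing:
ip cef load-sharing algorithm include-ports source destination

! Per-packet load-sharing on a specific interface
! (carries the risk of out-of-order packet delivery):
interface GigabitEthernet0/1
 ip load-sharing per-packet
```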
So let’s get back to our example. Once PfRv3 discovers the TCs for mission-critical traffic, it runs a hash function over them but also accounts for link utilization. PfRv3 calculates and tries to equalize bandwidth utilization in terms of percentages, so links with unequal capacity are loaded proportionally. This process is called load-sharing in PfRv3 and is enabled by default.
It is important to know that PfRv3 load-sharing is not a continuous process. Once a TC is assigned to one of the available paths, it stays there until the path becomes unreachable or goes OOP. Please refer to Figure 2.
You may have a situation where a number of TCs are load-shared, and after some time the conversations behind the TCs sent over the 20Mbps link finish, so TCs 1, 4 and 6 are eventually removed from PfRv3. In this case, the remaining TCs 2, 3 and 5 routed over the 10Mbps link will remain on that link, and only new TCs (if any) will be assigned to the freed path.
There is another important consideration. Imagine a number of TCs load-shared between two MPLS links. Suddenly, one of the links goes down and all TCs are moved to the remaining working MPLS path, as shown in Figure 3.
When the failed link is restored, TCs 1, 4 and 6 will remain on the second MPLS link, as they still satisfy the policy (both MPLS links are specified as preferred). Again, only new TCs (if any) will be assigned to the freed path.
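TC-to-path assignments like these can be inspected on the master controller. A hedged example, again assuming a domain named ONE:

```
! Summary of discovered traffic-classes and their current paths
show domain ONE master traffic-classes summary

! Detailed per-TC state, including the current path and
! the reason for the last path change
show domain ONE master traffic-classes
```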
The next question is: what happens to traffic that is not explicitly specified in the PfRv3 policy? That is the only traffic actually eligible for PfRv3 load-balancing. This functionality is disabled by default. Once it is enabled, any discovered traffic-classes that have no policy configured for them will be load-balanced over all available paths, including MPLS and Internet. It is possible, though, to limit the load-balancing to a list of required paths and exclude the unreliable Internet connection from it. Please refer to Figure 4.
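Load-balancing of such default (unpolicied) traffic-classes is enabled on the hub master controller. A minimal sketch, assuming a domain named ONE; restricting which paths participate is release-dependent and not shown here:

```
domain ONE
 vrf default
  master hub
   ! Enable load-balancing of traffic-classes that
   ! match no configured class/policy
   load-balance
```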
TCs 10-15 are the ones without a defined policy and are therefore load-balanced. The biggest benefit of PfRv3 load-balancing is that it is a continuous process. It tracks the utilization of each path and balances the traffic so that the difference between the links’ bandwidth utilization does not exceed a pre-defined threshold, 20% by default. Once the threshold is breached, the load-balancing process can take traffic-classes from a more loaded link and move them to a less loaded one. For example, TC 13 could eventually be moved to the reliable Internet path. Again, because PfRv3 measures link utilization in percent, all links are loaded proportionally to their capacity.
Because PfRv3 works only with traffic-classes, achieving the required granularity in load-sharing and load-balancing may require adjusting the length of site prefixes or using more DSCP values where possible.
In the end, based on the design requirements, you may have a network with IWAN routing that loads all reliable links proportionally and keeps the unreliable connections only as a last resort. Splitting traffic across different paths can be done by manual enforcement (preferred/fallback/next-fallback paths for different TCs), load-sharing and load-balancing simultaneously. These functions work well together, and some of the limitations of load-sharing can even be compensated for by load-balancing automatically adjusting the distribution. This blog concludes the IWAN design series; I hope it was helpful in demystifying IWAN and PfRv3 concepts.
About the Author
Dmytro (Dima) Muzychko is a principal design engineer at Verizon and mainly deals with non-standard customer requirements. His experience is built around service provider networking and is strongly focused on routing and switching, virtualization and, most recently, SD-WAN. Dima's professional quote is "Intelligence is needed for solving problems; wisdom for avoiding them. Network design is all about the latter."
Dima currently holds a CCDE, a CCIE in Routing & Switching and a Master's degree in Computer Technologies.
Here are a few additional ways for us to engage and keep the conversation going:
- Cisco Learning Network CCDE Study Group
- Connect on LinkedIn: Dmytro Muzychko
- Connect on Twitter too
- CCDE study materials for the Written and Practical exams
- Related Unleashing CCDE blogs: IWAN Part 1: PfRv3 Design Considerations by Dmytro Muzychko, IWAN Part 2: Routing Rules by Dmytro Muzychko, IWAN Part 3: Microloops on IWAN networks by Dmytro Muzychko, Commercial Solutions for Classified (CSfC) with Joe Galimi, A Network Designer’s Thought Process Part 1 by Cary Chen, Network Function Virtualization in Enterprise by Stephen Lynn, All Smoke and Mirrors by Michael Kowal – Part 1
- Reference: In this blog I referenced an internal document written by Fyllon Papadopoulos.