A Network Designer’s Thought Process Part 1
by Cary Chen

A network designer’s goal is to create a design to suit a business need. To achieve this goal, a network designer goes through a thought process that is similar to the one outlined in the CCDE Practical Exam Topics. In this blog I'll go over some simplified high-level tasks of a network designer:

 

  • Analyze design requirements – Analyze and understand the business goals that the network should support, along with the business requirements, conditions, and constraints, and translate the business requirements into technical requirements. Analyze and understand the technical requirements, conditions, and constraints, including the existing network on top of which the new design will be built. After all, in most cases a network designer is dealing with an existing network. Many network projects are expansions, upgrades, migrations, mergers, or divestitures. Only occasionally does one have a chance to design a new network from scratch.
  • Develop network designs – With all the goals, requirements, conditions and constraints, a designer needs to come up with some feasible design options to meet the goals and other criteria. Finally, a designer will need to select the most suitable design to achieve the goals.
  • Implement network design – If the implementation of a design will cause unacceptable business interruption, it’s not a good design. A designer has to consider the implementation plan when designing a network.
  • Validate and optimize network design – A design should be validated before being fully deployed in a production environment. It could be validated in a simulation, a lab environment, or in a controlled production environment. If the design doesn’t meet the design goals completely, then it should be optimized. A network designer doesn’t necessarily perform the validation him or herself, but the designer should define the validation strategy and success criteria.

 

A simplified design example: A new regional ISP would like to build an IP backbone network connecting its POPs

Let’s use a simplified example to illustrate the thought process above. It covers only a small portion of a complete network design. I’ll focus only on the ‘Analyze design requirements’ and ‘Develop network designs’ tasks in this article. Let’s look at the following scenario: a network designer is given the following information and requirements and is tasked with creating an OSPF area design, either single-area or multi-area.

 

Here is some information:

  • This network will provide MPLS VPN and Internet services to business customers.
  • OSPF has been chosen as the IGP. All P routers, PE routers, and some other control plane routers such as BGP RRs are included in the OSPF domain.
  • This network is designed to run MPLS with a BGP free core, that is, BGP is not running on any of the P routers.
  • MPLS TE will be deployed to route traffic to selected paths for certain applications, which have specific bandwidth, latency, or redundancy requirements.
  • In the first year of the network life cycle, there will be about 50 routers in the OSPF domain. Based on projections, the number will grow over the next few years to no more than 200 routers.

 

Here are the requirements:

  • The network has to be highly available (no less than 99.999% availability).
  • Minimize traffic latency whenever physical path allows.
  • Traffic converges in less than 100 milliseconds between any two PE routers in the OSPF domain when any link or node failure occurs.
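
To put the availability requirement in perspective, here is a minimal sketch of the arithmetic behind it. It assumes a 365-day year and statistically independent redundant paths; the 99.9% single-path figure and the function names are hypothetical and chosen only for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: non-leap year


def downtime_budget_seconds(availability: float) -> float:
    """Allowed downtime per year, in seconds, for an availability target."""
    return (1.0 - availability) * SECONDS_PER_YEAR


def parallel_availability(path_availability: float, paths: int) -> float:
    """Availability of `paths` redundant paths when any one of them suffices.

    Assumes the paths fail independently, which shared fiber ducts,
    shared power, or shared software defects can easily violate.
    """
    return 1.0 - (1.0 - path_availability) ** paths


if __name__ == "__main__":
    budget = downtime_budget_seconds(0.99999)
    print(f"99.999% allows about {budget / 60:.2f} minutes of downtime per year")

    # Hypothetical figure: each individual path is 99.9% available.
    for n in (1, 2, 3):
        print(f"{n} path(s): {parallel_availability(0.999, n):.9f}")
```

Five nines leaves a budget of roughly five minutes per year, which is why the requirement translates directly into node and link redundancy: under these assumptions, a single 99.9% path cannot meet the target, while two independent paths reach 99.9999%.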

 

Analyze the info and design requirements:

Some design requirements are explicit; others are not so obvious. If I were the network designer, I would go through the following thought process, starting with the explicit ones.

 

  • 99.999% network availability – This means that considerable node and link redundancy is required from the network topology perspective. I’ll analyze the physical network topology design and estimate the size and scale of the network from the OSPF perspective, taking future growth into account, and try to estimate the number of OSPF LSAs. More nodes lead to more OSPF LSAs and a bigger SPF tree, assuming most of the links are point-to-point. More links mean potentially more link failures per day, although modern ISP networks are much more stable than they were a decade ago. More failures mean more LSA updates. A network with more links and nodes requires more router CPU to process LSAs and run SPF computations. Larger RIB and FIB tables also require more router memory. An OSPF network with 200 nodes will typically have no more than a few thousand OSPF internal routes. On a PE router, the number of BGP routes (Internet routes, VPN routes, or both) is generally much higher than the number of OSPF routes. I’ll need to verify that the purchased routers have sufficient memory to store the RIB and FIB tables, and sufficient CPU power to process OSPF and other control protocols such as BGP. Modern routers targeted at the ISP core network market are typically very powerful and scalable. I wasn’t given information about the purchased routers, such as vendor, model, and memory size. I’ll have to find out so I can verify whether the routers can actually support a big single area with a large number of LSAs and no route summarization, or whether I have to use multiple areas to reduce the number of LSAs and the amount of topology and routing information. It’s part of a network designer’s job to ask the right questions and obtain the necessary information. Let’s say the purchased routers are popular ISP routing platforms, and I verify that they can support a big single area. When verifying the router memory size, I would check both the route processor memory and the line card memory, because the FIB table resides on the line cards as well as on the route processor.
  • Minimize traffic latency whenever physical path allows – Route summarization can cause loss of routing information and sub-optimal routing when redundant paths exist. Sub-optimal routing can send traffic over a longer path, causing higher latency and congestion. A single OSPF area doesn’t summarize routing information: every OSPF router has the same full topology view of the whole OSPF domain, so there is no sub-optimal routing due to loss of topology and routing information. A multi-area design loses this property if route summarization is deployed, although careful route summarization design could address the concern.
  • Traffic converges in less than 100 milliseconds between any two PE routers in the OSPF domain when any link or node failure occurs – This is a PE router to PE router (end to end from the OSPF domain perspective) convergence requirement. Some questions come to mind: What applications require such a convergence time? Is this required for MPLS VPN traffic or Internet traffic? Is this a requirement for unicast traffic only, or is multicast traffic included? Is it achievable? The full fast convergence design is outside my OSPF area design scope, but I need to know how my OSPF area design will affect fast convergence. I’ll ask those questions to gain a better understanding of the requirement. Tuning the OSPF LSA and SPF timers alone won’t achieve this level of convergence. Some form of data plane fast reroute (FRR) is required; it could be MPLS FRR or IP FRR. Since MPLS TE will be deployed, MPLS FRR seems a logical choice. I will ask whether MPLS FRR is part of the fast convergence design. It’s simpler for MPLS TE and FRR to work in a single area than in a multi-area design.
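
The size-and-scale estimate described in the availability bullet can be sketched roughly as follows. This is a back-of-the-envelope model, not a vendor sizing tool. It assumes a single OSPFv2 area, numbered point-to-point links only (so no network LSAs), one loopback per router, and a hypothetical average of four links per router; the names `ScaleEstimate` and `estimate_single_area` are made up for this illustration.

```python
from dataclasses import dataclass


@dataclass
class ScaleEstimate:
    router_lsas: int      # one Type 1 router LSA per router in the area
    network_lsas: int     # Type 2 LSAs; zero when every link is point-to-point
    internal_routes: int  # one prefix per link plus one loopback per router


def estimate_single_area(routers: int, avg_links_per_router: float) -> ScaleEstimate:
    """Rough single-area estimate under the assumptions stated above."""
    # Each point-to-point link is shared by two routers.
    links = round(routers * avg_links_per_router / 2)
    return ScaleEstimate(
        router_lsas=routers,
        network_lsas=0,
        internal_routes=links + routers,
    )


if __name__ == "__main__":
    # 50 routers in year one, growing to at most 200 (figures from the scenario).
    for routers in (50, 200):
        print(routers, estimate_single_area(routers, avg_links_per_router=4))
```

Under these assumptions, even the 200-router case stays well under a few thousand internal routes. The sketch leaves out external LSAs and the opaque LSAs that MPLS TE adds, so the real LSA count will be higher and should be checked against the actual topology and platform limits.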

 

The implicit requirements

There are some implicit requirements, or some items that may be affected by my OSPF design or that may affect my design, which I'm analyzing next.

  • MPLS TE to route traffic to certain paths – MPLS TE in a single area is much simpler to deploy, and has fewer limitations, than inter-area TE.
  • Growth projection up to 200 routers in the OSPF domain – The growth has been considered in my OSPF size and scale estimate.
  • BGP free core – This means P routers do not need to run the BGP process or store BGP routes and forwarding entries, so P routers do not need as much memory as the PE routers. A BGP free core will not negatively impact my OSPF design, or vice versa.

 

OK, I have presented the design challenge and my preliminary analysis in this blog. What would be your approach to this design task? In my next blog I’ll explain how I would approach it and present my final design solution.

 

About the Author


 

Cary Chen is CCDE #20130038 and double CCIE #14263 (SP and SP Operations), and a manager in Cisco Advanced Services. Cary has supported ISP customers and developed ISP routing platforms over the last 15 years. He enjoys working with ISP customers to design next generation networks.

 

 

 
