Internet Routing Scalability Talk
Last week at RIPE 58 in Amsterdam I had the opportunity to present in the Routing Working Group on work I’ve been doing with a former colleague of mine, Shane Amante (Level 3 Communications), and some of the routing research folks at UCLA. The slides I presented are available here (pdf), and I plan to present a variation of them at the upcoming NANOG 46 meeting in Philadelphia.
In a nutshell, the work has been focused on a few things, mostly related to Internet routing scalability and stability, in particular:
- growth in the number of unique routes (a prefix and its associated path attributes – i.e., a prefix and n alternative paths to reach that prefix from the local router) in the global routing system (as opposed to just the growth in the number of unique prefixes, or default-free zone – DFZ – size) and how the continued increase in inter-domain interconnection density affects this
- internal BGP architectures that result in additional [or fewer, rarely] paths (e.g., BGP route reflection has implicit aggregation effects but may also result in a considerable increase in paths and churn, both within the local routing domain and propagating to external BGP peers)
- BGP protocol and implementation issues that result in either additional unique routes in the routing system, added churn, stability or reachability issues in the routing system, or other systemic effects that may appear benign on the surface but upon deeper analysis are actually much more problematic
- qualifying and quantifying the effects of internal architectures and added paths on external routing system stability characteristics
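To make the prefix versus unique route distinction from the first bullet concrete, here is a minimal sketch. The sample paths, AS numbers and next hops are made up for illustration (documentation address space and private-use ASNs), and treating "path attributes" as just an AS path plus a next hop is a simplifying assumption on my part; a real router compares many more attributes.

```python
from collections import defaultdict

# Hypothetical sample: (prefix, AS path, next hop) as a router might learn them.
paths = [
    ("192.0.2.0/24", (64500, 64510), "10.0.0.1"),
    ("192.0.2.0/24", (64501, 64510), "10.0.0.2"),
    ("192.0.2.0/24", (64500, 64510), "10.0.0.1"),  # exact duplicate of the first
    ("198.51.100.0/24", (64502,), "10.0.0.3"),
]


def count_prefixes_and_routes(paths):
    """Return (unique prefixes, unique routes).

    A "unique route" here is a prefix plus one distinct set of path
    attributes (simplified to AS path + next hop for this sketch).
    """
    attrs_by_prefix = defaultdict(set)
    for prefix, as_path, next_hop in paths:
        attrs_by_prefix[prefix].add((as_path, next_hop))
    unique_prefixes = len(attrs_by_prefix)
    unique_routes = sum(len(attrs) for attrs in attrs_by_prefix.values())
    return unique_prefixes, unique_routes


prefixes, routes = count_prefixes_and_routes(paths)
print(prefixes, routes)  # 2 prefixes, 3 unique routes
```

The point is simply that the second number can grow while the first stays flat: every additional interconnection that offers a new path to an existing prefix adds a route without adding a prefix.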
The conceptual router diagram below illustrates various Routing Information Base (RIB) inputs from different processes in a given Internet router, and some Routing Table Manager (RTM) function that handles inputs from the various routing processes.
The RTM function typically applies preferences based on default or configured policies (e.g., static and connected routes would have preference over dynamically learned routes such as those from an external BGP peer). From there, the RTM generates the IP RIB, and it’s abstracted and augmented with some Link Layer and other information to generate the Forwarding Information Base (FIB). FIB sizes and global Internet routing tables (i.e., DFZ + internal routes) often go hand in hand, and typically range today anywhere from 275k to 350k entries, although in some networks the number of internal routes is much larger, and the total may even be north of 500k (in particular with larger merged or legacy networks with less than optimal internal route aggregation schemes). The number of paths in most reasonably-sized core BGP networks today is much larger, often an order of magnitude or more (e.g., 2-6M paths representing those ~350k prefixes).
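The preference step the RTM performs can be sketched roughly as follows. This is an illustration only: the `PREFERENCE` table and its numeric values are assumptions chosen for the example (lower wins), not any particular vendor's defaults, and `build_rib` is a hypothetical helper name.

```python
# Illustrative preference values (lower is preferred). These numbers are
# assumptions for the sketch, not a statement of any vendor's defaults.
PREFERENCE = {
    "connected": 0,
    "static": 1,
    "ebgp": 20,
    "ospf": 110,
    "ibgp": 200,
}


def build_rib(candidates):
    """Pick one winning route per prefix from all routing-process inputs.

    candidates: iterable of (prefix, source, next_hop) tuples.
    Returns {prefix: (source, next_hop)} holding the most preferred source.
    """
    rib = {}
    for prefix, source, next_hop in candidates:
        current = rib.get(prefix)
        if current is None or PREFERENCE[source] < PREFERENCE[current[0]]:
            rib[prefix] = (source, next_hop)
    return rib


candidates = [
    ("192.0.2.0/24", "ebgp", "10.0.0.1"),
    ("192.0.2.0/24", "static", "10.0.0.9"),   # static beats the eBGP route
    ("198.51.100.0/24", "ibgp", "10.0.0.2"),
]
rib = build_rib(candidates)
print(rib["192.0.2.0/24"])
```

Note that each routing process has already picked its own best path before handing it to the RTM, so the RTM only sees one candidate per process per prefix; the millions of alternative BGP paths live inside the BGP process itself.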
Given that most routers today model Type 3 switches (distributed control-driven forwarding functions), the FIB lookup (and all transit packet forwarding functions) is performed on each line card and not by a central processor or switching engine – the FIB is distributed to each line card in the system and each change is applied incrementally as it occurs. What this means is that a change in ANY BGP best path for a particular prefix (i.e., among those 2-6M paths) must flow from the routing process, to the RTM, and then on to the forwarding engine memory on each of the line cards. This usually requires multiple transactions at each stage, is obviously resource intensive, and consumes resources from the main CPU all the way down to forwarding engine I/O. As such, as the number of paths (i.e., unique routes) to reach a given prefix increases, you can expect more churn at all of these various stages in the system.
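A toy model of that fan-out, purely as a sketch: the `Router` and `LineCard` classes below are hypothetical, and counting one "transaction" per line card per best-path change is a deliberate simplification (real systems batch, queue and retry updates in implementation-specific ways).

```python
class LineCard:
    """Holds its own copy of the FIB and counts how often it is rewritten."""

    def __init__(self, name):
        self.name = name
        self.fib = {}
        self.updates = 0

    def install(self, prefix, next_hop):
        self.fib[prefix] = next_hop
        self.updates += 1


class Router:
    """Central RIB plus a FIB copy on every line card."""

    def __init__(self, n_cards):
        self.rib = {}
        self.cards = [LineCard(f"lc{i}") for i in range(n_cards)]

    def best_path_change(self, prefix, next_hop):
        """Apply a new best path; return the number of line-card updates."""
        if self.rib.get(prefix) == next_hop:
            return 0  # best path unchanged, nothing to push down
        self.rib[prefix] = next_hop
        for card in self.cards:
            card.install(prefix, next_hop)
        return len(self.cards)


router = Router(n_cards=4)
pushed = [
    router.best_path_change("192.0.2.0/24", "10.0.0.1"),  # new best path
    router.best_path_change("192.0.2.0/24", "10.0.0.1"),  # no change
    router.best_path_change("192.0.2.0/24", "10.0.0.2"),  # best path flaps
]
print(pushed, sum(card.updates for card in router.cards))
```

Even in this simplified form, every best-path flap costs one update per line card, so the work scales with the rate of best-path changes multiplied by the number of line cards in the chassis.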
As illustrated in the Prefixes v. Unique Routes diagram above, derived from information in Level(3)’s network over the past decade, the number of unique routes (paths) in the routing system is growing even more steeply than the number of unique prefixes alone, and this poses serious problems for the scalability and stability of the Internet routing system. Furthermore, [to tie this back to security!], in a secure routing system each path and route origin would likely need to be validated before it’s used for packet forwarding, so the growth in the sheer number of paths has significant implications for the ability to deploy secure inter-domain routing protocols (e.g., SBGP or soBGP) on the Internet as well.
The expanded deployment of IPv6 will be sure to increase the number of unique prefixes and unique paths in the system, as will the introduction and carriage of other address families in BGP (e.g., IP/MPLS VPNs, IPv4 multicast, Multicast VPNs, and Pseudowire (PW) information). It’s important that operators, implementers and protocol designers alike keep a watchful eye on the various characteristics introduced by each of these uses of BGP.
There are lots more details in the slide deck above, in an IMC paper submission whose publication should be forthcoming (I’ll be sure to provide a reference to it here), and in a couple of Internet-Drafts that address some of the more mundane issues in the protocol or deployment models discussed.