The Border Gateway Protocol (BGP) is an exterior gateway protocol (EGP), also known as an inter-Autonomous System routing protocol, that performs interdomain routing in TCP/IP networks. The primary function of a BGP speaking system is to exchange network reachability information with other BGP systems. This reachability information includes the list of Autonomous Systems (ASs) that it has traversed. It is sufficient to construct a graph of AS connectivity from which routing loops may be pruned and some policy decisions at the AS level may be enforced. BGP-4 provides a set of mechanisms for supporting classless interdomain routing (CIDR). These mechanisms include support for advertising an IP prefix and eliminate the concept of network "class" within BGP. BGP-4 also introduces mechanisms that allow aggregation of routes, including aggregation of AS paths.
Problems in BGP
I-BGP Scaling Problem:
All BGP speakers within a single AS must be fully meshed so that any external routing information is redistributed to all other routers within that AS. This "full mesh" requirement clearly does not scale when there is a large number of I-BGP speakers, as is common in many of today's Internet networks. For n BGP speakers within an AS, n*(n-1)/2 unique I-BGP sessions must be maintained. This large number of connections makes resource-intensive authentication and encryption a practical impossibility, and also slows BGP convergence.
Here we provide an overview of three proposed solutions to the I-BGP scalability problem:
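To make the growth concrete, the session count n*(n-1)/2 can be evaluated for a few AS sizes. The following is a minimal sketch; the function name is chosen here for illustration only.

```python
def full_mesh_sessions(n: int) -> int:
    """Unique I-BGP sessions needed to fully mesh n speakers: n*(n-1)/2."""
    if n < 0:
        raise ValueError("number of speakers must be non-negative")
    return n * (n - 1) // 2


for n in (10, 100, 1000):
    print(f"{n:>5} speakers -> {full_mesh_sessions(n):>7} sessions")
```

A tenfold increase in speakers gives roughly a hundredfold increase in sessions, which is why the full mesh becomes unmanageable in large networks.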
a) Route Reflection: Route reflection is based on a hierarchical approach. It consists of Route Reflectors (RRs) and Route Reflector Clients (RRCs). An RRC depends on its RR for learning and advertising routes. RRs are connected in a full mesh. A route reflector is allowed to advertise routes received from one iBGP peer to other iBGP peers: routes learned from a client are reflected to other clients and to non-clients, and routes learned from a non-client are reflected to clients. RRs do not readvertise routes between non-clients. Two attributes are added, ORIGINATOR_ID (type code 9) and CLUSTER_LIST (type code 10), whose basic purpose is loop detection. This provides a scalable alternative to an iBGP full mesh[1].
b) AS Confederation: This technique is based on the divide-and-conquer paradigm: divide an AS into sub-ASs, use full mesh I-BGP within each sub-AS, and use E-BGP sessions between sub-ASs. An autonomous system with a very large number of BGP speakers is subdivided into smaller domains for the purpose of controlling routing policy via information contained in the BGP AS_PATH attribute. Subdividing a large autonomous system allows a significant reduction in the total number of intra-domain BGP connections, because the connectivity requirements simplify to the model used for inter-domain connections, and thus helps avoid the I-BGP full mesh[2].
c) Virtual Peering: Virtual peering is used to reduce the overhead and management complexity of maintaining numerous direct BGP/IDRP sessions which otherwise might be required or desired among routers within a single routing domain as well as among routers in different domains that are connected to a common switched fabric (e.g. an ATM cloud). It proposes to use IDRP/BGP Route Servers, which would relay external routes with all of their attributes between client routers. The clients would maintain IDRP/BGP sessions only with the assigned route servers (sessions with more than one server would be needed if redundancy is desired). The Route Server would propagate all routes that are received from a client router to other clients. Since all external routes and their attributes are relayed unmodified between the client routers, the client routers would acquire the same routing information as they would via direct peering. Therefore, this arrangement is referred to as virtual peering[3].
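The reflection rules and the loop detection described for route reflection in item a) can be sketched as follows. This is a simplified illustration rather than a complete implementation: the Route class, the function names, and the example identifiers are assumptions made here, and ORIGINATOR_ID is simply set to the router ID of the peer the route was learned from when it is not already present.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Route:
    prefix: str
    originator_id: Optional[str] = None                      # ORIGINATOR_ID (type code 9)
    cluster_list: List[str] = field(default_factory=list)    # CLUSTER_LIST (type code 10)


def should_reflect(from_client: bool, to_client: bool) -> bool:
    """Client routes go to clients and non-clients; non-client routes go to clients only."""
    return from_client or to_client


def accept(route: Route, local_router_id: str, local_cluster_id: str) -> bool:
    """Loop detection: reject a route that has already passed through this router or cluster."""
    if route.originator_id == local_router_id:
        return False
    if local_cluster_id in route.cluster_list:
        return False
    return True


def reflect(route: Route, learned_from_id: str, local_cluster_id: str) -> Route:
    """Return the reflected copy: keep or set ORIGINATOR_ID, prepend the local cluster ID."""
    return Route(
        prefix=route.prefix,
        originator_id=route.originator_id or learned_from_id,
        cluster_list=[local_cluster_id] + route.cluster_list,
    )


# Hypothetical identifiers, for illustration only.
reflected = reflect(Route("192.0.2.0/24"), learned_from_id="10.0.0.1", local_cluster_id="0.0.0.1")
print(reflected)
print(accept(reflected, local_router_id="10.0.0.1", local_cluster_id="0.0.0.2"))  # originator sees its own route
```

A route that returns to its originator, or to a cluster it has already traversed, is rejected, which is how the two attributes prevent loops once the full mesh is removed.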
Instability Issue (Route Flapping):
A route flap is a route that is repeatedly withdrawn and re-advertised, for example because a link keeps going up and down. A widely deployed BGP implementation may tend to fail under the high volume of routing updates caused by rapid changes in the advertised reachability of a subset of Internet prefixes. Two methods of controlling the frequency of route advertisement are described here. The first method uses fixed timers. The fixed timer technique has no space overhead per route but has the disadvantage of slowing route convergence for the normal case where a route does not have a history of instability. The second method overcomes this limitation at the expense of some additional overhead: a small amount of state per route and a very small processing overhead.
It is possible and desirable to combine both techniques. In practice, fixed timers have been set to very short intervals and have proven useful for packing routes into a smaller number of updates when routes arrive in separate updates. The effects of route flapping include increased packet loss, increased network latency, CPU overhead, and loss of connectivity[4].
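The per-route state used by the second method can be illustrated with a penalty value that grows on each flap and decays exponentially over time, in the spirit of the damping scheme in [4]. The sketch below is only an illustration: the class name and all parameter values (half-life, penalty per flap, suppress and reuse limits) are assumptions chosen here, not values taken from the text.

```python
class DampedRoute:
    """Per-route damping state: one penalty value and one timestamp."""

    def __init__(self, half_life=900.0, penalty_per_flap=1000.0,
                 suppress_limit=2000.0, reuse_limit=750.0):
        # All defaults are illustrative assumptions (times in seconds).
        self.half_life = half_life
        self.penalty_per_flap = penalty_per_flap
        self.suppress_limit = suppress_limit
        self.reuse_limit = reuse_limit
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now: float) -> None:
        """Halve the penalty for every half_life seconds elapsed."""
        elapsed = max(0.0, now - self.last_update)
        self.penalty *= 0.5 ** (elapsed / self.half_life)
        self.last_update = now

    def flap(self, now: float) -> None:
        """Record a flap; suppress the route once the penalty exceeds the limit."""
        self._decay(now)
        self.penalty += self.penalty_per_flap
        if self.penalty > self.suppress_limit:
            self.suppressed = True

    def is_suppressed(self, now: float) -> bool:
        """A suppressed route is reused once the penalty decays below reuse_limit."""
        self._decay(now)
        if self.suppressed and self.penalty < self.reuse_limit:
            self.suppressed = False
        return self.suppressed


route = DampedRoute()
for t in (0.0, 60.0, 120.0):      # three flaps, one minute apart
    route.flap(t)
    print(t, round(route.penalty, 1), route.is_suppressed(t))
```

A stable route carries no penalty and is advertised without delay, while a route with a history of instability is held back until its penalty has decayed.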
Multihoming Problem in BGP:
The longest prefix match routing technique introduced by CIDR and implemented in BGP, when combined with provider address allocation, is an obstacle to effective multihoming where load sharing across the multiple links is required. If an AS has been allocated its addresses by an upstream provider, that provider can aggregate those addresses with those of other customers and need only advertise a single prefix for a range of customers. But if the customer AS is also connected to a second provider, the second provider cannot aggregate the customer addresses because they are not taken from its own allocation, and it must therefore announce a more specific route to the customer AS. The longest match rule will then direct all traffic through the second provider. Small networks multihoming with a number of peers and a number of upstream providers have led to enormous growth of the BGP tables[5].
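The effect of the longest match rule on a multihomed customer can be shown with a small lookup. This is a sketch under assumed addresses: the prefixes and provider labels are hypothetical, with the customer's /24 taken from the first provider's /16 aggregate.

```python
import ipaddress


def longest_prefix_match(destination, routes):
    """Return the next hop of the most specific route covering destination, or None."""
    addr = ipaddress.ip_address(destination)
    best_net, best_hop = None, None
    for prefix, next_hop in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_hop = net, next_hop
    return best_hop


# Hypothetical table: provider A advertises only its aggregate,
# provider B must advertise the customer's more specific prefix.
routes = [
    ("10.1.0.0/16", "provider A"),
    ("10.1.20.0/24", "provider B"),
]

print(longest_prefix_match("10.1.20.5", routes))   # customer address
print(longest_prefix_match("10.1.30.5", routes))   # another customer of provider A
```

Traffic to the customer's addresses follows the /24 through the second provider even though the first provider's aggregate also covers them, so no load sharing takes place.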
Persistent Route Oscillation:
Persistent route oscillation is caused by the combination of the following two factors:
• The MULTI_EXIT_DISC (MED) attribute is compared only between routes learned from the same neighboring AS.
• The reduction of the iBGP full mesh means that not all BGP speakers in the AS have complete visibility of the available exit points into a neighboring AS.
The visibility may be partial and inconsistent depending on the location (and function) of the router in the AS. In certain topologies involving either route reflectors or confederations, the partial visibility of the available exit points into a neighboring AS may result in an inconsistent best path selection decision as the routers don't have all the relevant information. If the inconsistencies span more than one peering router, they may result in a persistent route oscillation.
The current specification of BGP-4 states that the MULTI_EXIT_DISC is comparable only between routes learned from the same neighboring AS. In a full mesh iBGP network, all the internal routers have complete visibility of the available exit points into a neighboring AS, so this restriction alone does not lead to inconsistent decisions[8].
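The MULTI_EXIT_DISC comparison rule can be sketched in isolation. The example below is a simplification that models only this one step of path selection. The Path class, the function name, and the AS numbers are assumptions made for illustration; a lower MULTI_EXIT_DISC value is preferred.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Path:
    name: str
    neighbor_as: int
    med: int


def prefer_by_med(a: Path, b: Path) -> Optional[Path]:
    """Return the path preferred by MULTI_EXIT_DISC, or None if this step cannot decide."""
    if a.neighbor_as != b.neighbor_as:
        return None          # not comparable: learned from different neighboring ASs
    if a.med == b.med:
        return None          # tie: later tie-breaking steps would decide
    return a if a.med < b.med else b


# Hypothetical paths to the same prefix.
x = Path("X", neighbor_as=64500, med=10)
y = Path("Y", neighbor_as=64500, med=5)
z = Path("Z", neighbor_as=64501, med=1)

print(prefer_by_med(x, y))   # Y wins over X on MULTI_EXIT_DISC
print(prefer_by_med(x, z))   # None: different neighboring ASs
```

Because the comparison applies only to some pairs of paths, a router that sees X and Z but not Y can reach a different decision from a router that sees all three, which is the inconsistency that partial visibility introduces.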
Complex Policy Control:
Currently, BGP policy has several classes of issues:
• Policies are installed in an ad hoc manner in each autonomous system. There is no method for ensuring that the policy installed in one router is coherent with the policies installed in other routers[5].
• It is possible to install policies in routers that cause routing loops or that, in certain types of topology, never converge.
• There is no available network model for describing policy in a coherent manner.
Policy management is extremely complex and mostly done without the aid of any automated procedure. The extreme complexity means that highly qualified specialists are required for policy management of border routers.
The BGP Communities attribute, which provides a way to categorize routes, can be used to simplify the configuration of complex routing policies to some extent.
Configuration Problems:
When a new BGP peer is added, every other peer in the network must be reconfigured with the new peer’s ID, and the new peer must be configured with the IDs of all existing peers. This is a tedious, error-prone task that contributes to unreliability.
BGP Security:
Under the present BGP design, cryptographic authentication of the peer-to-peer communication is not mandated. As a TCP/IP protocol, BGP is subject to all the TCP/IP attacks, such as IP spoofing and session stealing. Any outsider can inject believable BGP messages into the communication between BGP peers and thereby inject bogus routing information or break the peer-to-peer connection. Any break in the peer-to-peer communication has a ripple effect on routing that can be widespread. Furthermore, outside sources can also disrupt communications between BGP peers by breaking their TCP connection with spoofed RST packets[10].
Two forms of protection against spoofing on the peer-to-peer connection have been suggested:
IP level protection
Protection at the network level, using IPsec, can provide connectionless integrity, data origin authentication, and an anti-replay service.
TCP level protection
MD5 can be used for TCP level protection. Due to collisions in MD5, IPsec protections mandate the use of HMAC-MD5. The TCP sequence numbers provide some protection against replay, but because some packets, notably a RST packet, need only be within the receive window to be accepted, the TCP sequence number protection is not complete. Also, MD5 has no provisions for multiple keys to be used in rekeying. As these are pairwise keys used for long-lived sessions, the inability to specify multiple keys may not cause operational difficulties. Although the TCP level protection has deficiencies when compared with the protection of IPsec [7], it is vastly preferable to an unprotected connection.
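The idea of a keyed integrity check shared by a pair of peers can be illustrated with HMAC-MD5 from the Python standard library. This is a generic sketch only: it does not reproduce the wire format of TCP level protection or of IPsec, and the key and message shown are placeholders.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-MD5 tag over message with the pairwise key."""
    return hmac.new(key, message, hashlib.md5).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check a received tag using a constant-time comparison."""
    return hmac.compare_digest(sign(key, message), tag)


# Placeholder key and payload, for illustration only.
shared_key = b"pairwise-secret"
segment = b"example BGP message bytes"

tag = sign(shared_key, segment)
print(verify(shared_key, segment, tag))               # genuine segment
print(verify(shared_key, segment + b"forged", tag))   # modified segment
```

An outsider who does not know the pairwise key cannot produce a valid tag, so spoofed or altered segments are rejected by the receiver.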
Secure-BGP (S-BGP) is being proposed to enhance the security of BGP by verifying the authenticity and authorization of the BGP control traffic.
Multiprotocol Extensions:
Extensions to BGP-4 should enable it to carry routing information for multiple Network Layer protocols (e.g., IPv6, IPX, etc.) and must be backward compatible, i.e., a router that supports the extensions can interoperate with a router that doesn't support them. To enable BGP-4 to support routing for multiple Network Layer protocols, only two things have to be added to BGP-4:
• the ability to associate a particular Network Layer protocol with the next hop information, and
• the ability to associate a particular Network Layer protocol with NLRI.
To provide backward compatibility, as well as to simplify the introduction of the multiprotocol capabilities into BGP-4, RFC 2283 uses two new attributes, Multiprotocol Reachable NLRI (MP_REACH_NLRI) and Multiprotocol Unreachable NLRI (MP_UNREACH_NLRI). The first one (MP_REACH_NLRI) is used to carry the set of reachable destinations together with the next hop information to be used for forwarding to these destinations. The second one (MP_UNREACH_NLRI) is used to carry the set of unreachable destinations. Both of these attributes are optional and non-transitive. This way a BGP speaker that doesn't support the multiprotocol capabilities will just ignore the information carried in these attributes, and will not pass it to other BGP speakers[11].
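The backward compatibility argument rests on how a BGP speaker treats a path attribute it does not recognize, which is determined by the attribute's flag bits. The sketch below shows that decision. The flag bit values and the type codes 14 and 15 follow the BGP-4 and multiprotocol extension specifications, while the function name and its return strings are illustrative assumptions.

```python
# Path attribute flag bits (high-order bits of the attribute flags octet).
OPTIONAL = 0x80
TRANSITIVE = 0x40
PARTIAL = 0x20
EXTENDED_LENGTH = 0x10

# Attribute type codes of the multiprotocol extensions.
MP_REACH_NLRI = 14
MP_UNREACH_NLRI = 15


def handle_unrecognized(flags: int) -> str:
    """Decide what a speaker does with an attribute whose type code it does not know."""
    if not flags & OPTIONAL:
        return "error"      # well-known attributes must be recognized by every speaker
    if flags & TRANSITIVE:
        return "forward"    # optional transitive: passed on, marked with the Partial bit
    return "ignore"         # optional non-transitive: quietly ignored, not passed on


# MP_REACH_NLRI and MP_UNREACH_NLRI are optional and non-transitive.
mp_flags = OPTIONAL
print(MP_REACH_NLRI, handle_unrecognized(mp_flags))
print(MP_UNREACH_NLRI, handle_unrecognized(mp_flags))
```

Because both multiprotocol attributes have the Optional bit set and the Transitive bit clear, a speaker without the extensions ignores them and does not propagate them further.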
Table Growth:
Multihoming, load balancing, address fragmentation, and failure to aggregate address prefixes contribute the most to routing table size[14]. Measurements show that the number of routes and route instances in the default-free table has resumed exponential growth, which had slowed to linear growth after the introduction of CIDR.
Projections of the average prefix length of advertisements, based on current trends in the number of BGP table entries and the total address span advertised in the BGP table, indicate that either lookups will need to search deeper through the prefix chain to find the necessary forwarding entry, requiring faster memory subsystems to perform each lookup, or the lookup table will need to be both larger and more sparsely populated, increasing the requirements for high-speed memory within the router’s forwarding subsystem[13].
BGP Convergence:
Conflicts between routing policies can cause BGP to diverge. That is, such inconsistencies could cause a collection of ASs to exchange BGP routing messages indefinitely without ever converging on a set of stable routes. While pure distance-vector protocols such as RIP [8] are guaranteed to converge, the same is not true for BGP. Indeed, it has been shown that there are routing policies that can cause BGP to diverge. BGP divergence could introduce a large amount of instability into the global routing system. However, it is difficult to find any instance where routing instability has been caused by protocol divergence, and it is impossible to say whether divergent BGP systems will arise in practice. On the other hand, given the economic importance of the Internet, it is worthwhile to consider worst-case scenarios and to provide safeguards where possible.
Broadly speaking, the BGP convergence problem can be addressed either dynamically or statically. A dynamic solution to the BGP divergence problem is a mechanism to suppress or completely prevent at "run time" those BGP oscillations that arise from policy conflicts. Using route flap dampening as a dynamic mechanism to address the BGP convergence problem has two distinct drawbacks:
• Route flap dampening cannot eliminate BGP protocol oscillations; it will only make these oscillations run in "slow motion".
• Route flapping events do not provide network administrators with enough information to identify the source of the route flapping.
A static solution is one that relies on programs to analyze routing policies to verify that they do not contain policy conflicts that could lead to protocol divergence[16].
AS Number Exhaustion:
Each network that is multi-homed within the topology of the Internet and wishes to express a distinct external routing policy must use an AS number to associate its advertised addresses with such a policy. In general, each network is associated with a single AS, and the number of ASs in the default-free routing table tracks the number of entities that have unique routing policies. The trend of AS number deployment over the past four years was found to be exponential. The growth in the number of ASs can be correlated with the growth in the amount of address space spanned by the BGP routing table. On average, each AS is advertising a smaller address span. This points to increasingly finer levels of routing detail being announced into the global routing domain, a trend that causes some level of concern. If this rate of growth continued, the 16-bit AS number space was projected to be exhausted by late 2005. Work is underway within the IETF to modify the BGP protocol to carry AS numbers in a 32-bit field[13].
Suggestions
To avoid the I-BGP full mesh, the flooding methodology familiar from OSPF and IS-IS could be used as the basis for a new transport method. Some of the issues with BGP route convergence could be addressed by implementing some kind of message synchronization in the next version of the protocol. This would help to avoid the back-and-forth rounds of update information exchanged after a route failure and limit exchanges to one round. Since the size of the routing table is of great concern, filters can be placed to drop routes that are too specific. These filters are expressed as a function of the length of the address prefix, such that, for example, an advertisement for a network smaller than a /24 is not accepted. The actual limit may vary from network to network, and also over time.
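A prefix-length filter of the kind suggested above can be sketched in a few lines. The function name and the example prefixes are assumptions made here, and the /24 limit is only the example value from the text.

```python
import ipaddress


def accept_prefix(prefix: str, max_prefix_len: int = 24) -> bool:
    """Accept an advertised prefix only if it is no more specific than max_prefix_len."""
    return ipaddress.ip_network(prefix).prefixlen <= max_prefix_len


# Hypothetical advertisements.
for prefix in ("198.51.0.0/16", "192.0.2.0/24", "192.0.2.128/25"):
    print(prefix, "accepted" if accept_prefix(prefix) else "dropped")
```

Keeping the limit as a parameter reflects the point that the acceptable prefix length may differ between networks and change over time.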
References
[1]. BGP Route Reflection- An alternative to full mesh IBGP (RFC 1966)
[2]. Autonomous System Confederations for BGP (INTERNET DRAFT) - draft-ietf-idr-bgp-confed-rfc1965bis-01.txt
[3]. A BGP/IDRP Route Server alternative to a full mesh routing (RFC 1863)
[4]. BGP Route Flap Damping (RFC 2439)
[5]. Analysis of Current Inter-domain routing policies - Young Jiang, Telia Research
[6]. Border Gateway Protocol (BGP) Persistent Route Oscillation Condition (RFC 3345)
[7]. BGP Scalability and Troubleshooting - Cisco.com
[8]. BGP persistent route oscillation solution - draft-walton-bgp-route-oscillation-stop-00
[9]. BST Protocol - BGP Scalable Transport, Packet Design
[10]. BGP Security Vulnerability Analysis (INTERNET DRAFT) - draft-murphy-bgp-vul-01.txt- Sandra Murphy, NAI Labs
[11]. Multiprotocol Extensions for BGP-4 (RFC 2283)
[12]. http://www.research.att.com/~griffin/bgpresearch.html Timothy G. Griffin, AT&T labs
[13]. Analyzing the Internet’s BGP Routing Table http://macross.dynodns.net/idr/4-1-bgp.pdf
[14]. On Characterizing BGP Routing Table Growth http://www-unix.ecs.umass.edu/~lgao/globalinternet2002_tian.pdf
[15]. BGP Communities Attribute (RFC 1997)
[16]. An Analysis of BGP Convergence Properties - Timothy G. Griffin and Gordon Wilfong- Bell Laboratories, Lucent Technologies