The problem of unequal-cost load-balancing
Have you ever wondered why, among all IGPs, only EIGRP supports unequal-cost load balancing (UCLB)? Is there a special reason for this? Apparently, there is. Let’s start with the basic idea of equal-cost load balancing (ECLB). This one is simple: if there are multiple paths to the same destination with equal costs, it is reasonable to use them all and share traffic equally among them. The alternate paths are guaranteed to be loop-free, as they are “symmetric” to the primary path with respect to cost. If there are multiple paths of unequal cost, the same idea cannot be applied so easily. For example, consider the figure below:
Suppose there is a destination behind R2 that R1 routes to. There are two paths from R1 to R2: one directly to R2, and another via R3. The cost of the primary path is 10 and the cost of the secondary path is 120. Intuitively, it would make sense to send traffic across both paths in a 12:1 proportion to make the most use of the network. However, if R3 implements the same idea of unequal-cost load balancing, we have a problem. From R3, the primary path to R2 is via R1. Thus, some of the packets that R1 sends to R2 via R3 will be routed back to R1. This is the core problem of UCLB: some secondary paths may result in routing loops, as a node on the path may prefer to route back to the origin.
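The loop can be reproduced with a small Python sketch. The R1-R2 cost of 10 comes from the text; splitting the 120-cost path into R1-R3 = 20 and R3-R2 = 100 is an assumption made only for illustration, as are the function names. The sketch computes each router's cheapest first hop and checks whether R3 would hand traffic for R2 straight back to R1.

```python
import heapq

# Link costs. R1-R2 = 10 is from the text; the 20/100 split of the
# 120-cost path via R3 is an assumption for illustration only.
LINKS = {("R1", "R2"): 10, ("R1", "R3"): 20, ("R3", "R2"): 100}


def neighbors(node):
    """Yield (neighbor, cost) pairs, treating every link as bidirectional."""
    for (a, b), cost in LINKS.items():
        if a == node:
            yield b, cost
        elif b == node:
            yield a, cost


def best_next_hop(src, dst):
    """Run Dijkstra from src; return (first hop, total cost) of the cheapest path."""
    queue = [(0, src, None)]  # (cost so far, node, first hop taken)
    seen = set()
    while queue:
        cost, node, first = heapq.heappop(queue)
        if node == dst:
            return first, cost
        if node in seen:
            continue
        seen.add(node)
        for nxt, link_cost in neighbors(node):
            if nxt not in seen:
                heapq.heappush(queue, (cost + link_cost, nxt, first or nxt))
    raise ValueError(f"{dst} is unreachable from {src}")


def secondary_path_loops(origin, via, dst):
    """True if 'via' prefers to send traffic for dst back through origin."""
    hop, _ = best_next_hop(via, dst)
    return hop == origin


print(best_next_hop("R1", "R2"))               # R1's primary path
print(best_next_hop("R3", "R2"))               # R3's primary path
print(secondary_path_loops("R1", "R3", "R2"))  # does R3 send it back to R1?
```

With these assumed costs, R3 reaches R2 more cheaply through R1 (20 + 10) than over its direct link (100), so any share of traffic that R1 pushes toward R3 comes back.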
GLBP, an acronym for Gateway Load Balancing Protocol, is a virtual gateway protocol similar to HSRP and VRRP. However, unlike its little brothers, GLBP is capable of using multiple physical gateways at the same time. As we know, a single HSRP or VRRP group represents one virtual gateway, with a single virtual IP address and a single virtual MAC address. Only one physical gateway in a standby/redundancy group is responsible for packet forwarding; the others remain inactive in the standby/backup state. If you have R1, R2 and R3 sharing the segment 174.X.123.0/24 with the physical IP addresses 174.X.123.1, 174.X.123.2 and 174.X.123.3, you may configure them to represent a single virtual gateway with the IP address 174.X.123.254. The physical gateway priority settings determine which physical gateway takes the role of the active packet forwarder. The hosts on the segment set their default gateway to 174.X.123.254 and thus stay isolated from physical gateway failures.
GLBP takes this idea to a new level by allowing multiple physical gateways to participate in packet forwarding simultaneously. Considering the example above, imagine you need the hosts on the segment to fully utilize all existing physical gateways, yet still recover from a gateway failure. For instance, you may want 50% of outgoing packets to be sent to R1, 30% to R2 and 20% to R3. At the same time, you want to ensure that hosts using any one of the gateways will automatically switch to another if their gateway fails. On top of that, all hosts in the segment should refer to the virtual gateway using the same IP address, 174.X.123.254. This is a complicated task, and it is the one the GLBP design addresses.
To begin with, we should recall that every host on the segment needs to resolve the virtual gateway IP address 174.X.123.254 to a MAC address using ARP. When we use HSRP or VRRP, the ARP response carries the single virtual MAC address, which is assigned to the active physical gateway. GLBP, in contrast, responds with different virtual MAC addresses, each belonging to one of the physical gateways in the GLBP group. So the key idea of GLBP is to accomplish load balancing by responding to ARP requests with different virtual MAC addresses.
Here are the implementation details. One of the routers in a GLBP group is elected as the AVG (Active Virtual Gateway). There is only one active AVG in a group, and its task is to respond to ARP requests sent to the virtual gateway IP address (e.g. 174.X.123.254), returning different virtual MAC addresses in the response packets. The AVG also implements the load-sharing algorithm, e.g. by sending the replies in proportion to the weights configured for the physical gateways. Aside from the AVG, the other logical component of a GLBP implementation is the AVF (Active Virtual Forwarder). Any physical gateway in a GLBP group may act as an AVF; in fact, all physical gateways usually are AVFs. Every AVF has a virtual MAC address assigned by the AVG and a weight value configured by an operator.
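To make the AVG's role concrete, here is a minimal Python sketch of an AVG handing out virtual MAC addresses in proportion to the configured weights. It assumes a smooth weighted round-robin scheduler, which is a stand-in chosen for illustration and not necessarily what GLBP does internally. The class and method names are hypothetical; the vMACs and weights are the ones that appear in the sample output later in this post.

```python
from collections import Counter


class Avg:
    """Toy Active Virtual Gateway: picks a vMAC for each ARP request."""

    def __init__(self, forwarders):
        # forwarders maps each AVF's virtual MAC to its configured weight.
        self.weights = dict(forwarders)
        self.credit = {vmac: 0 for vmac in forwarders}

    def arp_reply(self):
        """Return the vMAC to place in the next ARP reply.

        Smooth weighted round robin: every AVF earns credit equal to its
        weight, the AVF with the most credit is chosen, and the chosen AVF
        pays back the sum of all weights.
        """
        total = sum(self.weights.values())
        for vmac, weight in self.weights.items():
            self.credit[vmac] += weight
        chosen = max(self.credit, key=self.credit.get)
        self.credit[chosen] -= total
        return chosen


avg = Avg({
    "0007.b400.7b02": 50,  # R1
    "0007.b400.7b01": 30,  # R2
    "0007.b400.7b03": 20,  # R3
})
replies = Counter(avg.arp_reply() for _ in range(10))
print(replies)
```

Over ten ARP requests this scheduler hands out R1's vMAC five times, R2's three times and R3's twice, which matches the 50/30/20 split. Note that the split is per ARP reply, so it balances hosts rather than packets.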
Now let’s discuss redundancy, the primary goal of any virtual router protocol. There are two logical entities used to build a GLBP group, AVGs and AVFs, and each of them needs a backup scheme. Since there is just one AVG per GLBP group, the procedure is pretty simple: each candidate AVG has a priority value assigned; the router with the highest priority becomes the active AVG, and the next one by priority becomes the standby AVG. You may configure AVG preemption, so that a newly configured router with the highest priority value becomes the AVG, preempting the old one.
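The election rule can be sketched in a few lines of Python. This is an illustration under stated assumptions: the `Gateway` type and `elect_avg` function are hypothetical names, preemption is assumed to be enabled (the ranking is simply recomputed), and tie-breaking between equal priorities is not modeled. The priorities are the ones used in the sample configuration later in this post (R1 keeps the default of 100).

```python
from dataclasses import dataclass


@dataclass
class Gateway:
    name: str
    priority: int
    alive: bool = True


def elect_avg(gateways):
    """Return (active, standby) AVG names among the live gateways.

    The highest priority becomes active and the next highest becomes standby.
    Equal priorities are not handled specially in this sketch.
    """
    ranked = sorted(
        (g for g in gateways if g.alive),
        key=lambda g: g.priority,
        reverse=True,
    )
    active = ranked[0].name if ranked else None
    standby = ranked[1].name if len(ranked) > 1 else None
    return active, standby


group = [Gateway("R1", 100), Gateway("R2", 50), Gateway("R3", 25)]
print(elect_avg(group))   # R1 active, R2 standby

group[0].alive = False    # R1 fails
print(elect_avg(group))   # R2 takes over, R3 becomes standby
```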
What about AVF redundancy? First, we need to understand that AVFs are always “active” in the sense that they are always used by the load-balancing algorithm. (However, by setting an AVF weight value below the threshold, we may effectively take the AVF out of service. The weight value can be combined with object tracking to provide powerful traffic manipulation options.) Next, with respect to redundancy, all AVFs back each other up. Take any AVF: it is “Active” for its own virtual MAC address, while the remaining AVFs are in the “Listen” state for that address. If the AVF fails, the other gateways detect the event when the Hold timer expires and immediately try to take over the virtual MAC address of the failed AVF. Among the competitors, the AVF with the highest weight value wins, and the remaining AVFs switch back to the “Listen” state. At this point, the “winner” starts accepting packets for two virtual MAC addresses: its own, and the one it has obtained from the failed AVF. At the same moment, two timers start: Redirect and Secondary Hold. The Redirect timer determines how long the AVG will continue to respond to ARP requests with the virtual MAC of the failed AVF. The Secondary Hold timer sets the amount of time the backup AVF will continue to accept packets for the virtual MAC address taken from the failed AVF.
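The takeover and the two timers can be modeled roughly as follows. This Python sketch makes several assumptions: the function names are hypothetical, the 600-second Redirect value and the 14400-second value are the ones shown in the sample output later in this post, and the “forwarder time-out” reported there is assumed to be the Secondary Hold timer. The real protocol exchanges hello messages; here only the outcome is computed.

```python
REDIRECT = 600          # seconds; "Redirect time 600 sec" in the sample output
SECONDARY_HOLD = 14400  # seconds; assumed to be the "forwarder time-out" value


def take_over(weights, failed):
    """Return the surviving AVF that inherits the failed AVF's vMAC.

    Following the description above, the survivor with the highest
    weight wins the competition.
    """
    survivors = {name: w for name, w in weights.items() if name != failed}
    return max(survivors, key=survivors.get)


def vmac_status(seconds_since_failure):
    """Describe how the failed AVF's vMAC is treated at a given moment."""
    return {
        # While Redirect runs, the AVG still answers ARP with the old vMAC.
        "avg_hands_out_vmac": seconds_since_failure < REDIRECT,
        # While Secondary Hold runs, the winner still forwards for it.
        "backup_forwards": seconds_since_failure < SECONDARY_HOLD,
    }


weights = {"R1": 50, "R2": 30, "R3": 20}
print(take_over(weights, failed="R1"))  # R2 has the highest remaining weight
for t in (100, 700, 15000):
    print(t, vmac_status(t))
```

The gap between the two timers gives hosts that already cached the old vMAC time to age out their ARP entries and learn a new one while their traffic is still being forwarded.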
This is basically how GLBP works. Several load-balancing algorithms are supported: the default is round-robin, with options for weighted and source-MAC-based load balancing. The last one always responds with the same vMAC to the same source MAC address, thereby providing a sort of host-to-gateway “stickiness”. Now for a sample GLBP configuration for the above-mentioned R1, R2 and R3:
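The difference between round-robin and source-MAC-based replies can be illustrated with a short Python sketch. The CRC32 hash is an assumption chosen only to get a deterministic mapping; the text does not say how a real GLBP implementation maps hosts to forwarders. The function names and the host MAC are hypothetical, while the vMACs are the ones from the sample output below.

```python
import itertools
import zlib

VMACS = ["0007.b400.7b01", "0007.b400.7b02", "0007.b400.7b03"]


def round_robin_replies(vmacs=VMACS):
    """Endless iterator that cycles through the vMACs, one per ARP reply."""
    return itertools.cycle(vmacs)


def host_dependent_reply(source_mac, vmacs=VMACS):
    """Map a source MAC to a vMAC so the same host always gets the same one.

    CRC32 is an illustrative choice, not the hash a real router uses.
    """
    index = zlib.crc32(source_mac.encode("ascii")) % len(vmacs)
    return vmacs[index]


rr = round_robin_replies()
print([next(rr) for _ in range(4)])  # wraps around after the third vMAC

host = "cc06.0156.0000"              # example host MAC
print(host_dependent_reply(host) == host_dependent_reply(host))  # sticky
```

Round-robin spreads consecutive ARP replies across all forwarders, whereas the host-dependent variant trades even distribution for a stable host-to-gateway mapping.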
!
! We set load-balancing to weighted only on R1
! So if R2 will become the AVG, it will use round-robin
! load-balancing technique
!
R1:
interface FastEthernet0/0
 ip address 126.96.36.199 255.255.255.0
 glbp 123 ip 188.8.131.52
 glbp 123 preempt
 glbp 123 weighting 50
 glbp 123 load-balancing weighted
!
!
!
R2:
interface FastEthernet0/0
 ip address 184.108.40.206 255.255.255.0
 glbp 123 ip 220.127.116.11
 glbp 123 priority 50
 glbp 123 preempt
 glbp 123 weighting 30
!
!
!
R3:
interface Ethernet0/0
 ip address 18.104.22.168 255.255.255.0
 glbp 123 ip 22.214.171.124
 glbp 123 priority 25
 glbp 123 preempt
 glbp 123 weighting 20
Some show output:
Rack1R1#show glbp
FastEthernet0/0 - Group 123
  State is Active
    2 state changes, last state change 03:12:05
  Virtual IP address is 126.96.36.199
  Hello time 3 sec, hold time 10 sec
    Next hello sent in 0.916 secs
  Redirect time 600 sec, forwarder time-out 14400 sec
  Preemption enabled, min delay 0 sec
  Active is local
  Standby is 188.8.131.52, priority 50 (expires in 8.936 sec) <-- Standby AVG
  Priority 100 (default)
  Weighting 50 (configured 50), thresholds: lower 1, upper 50 <--
  <-- Should the weight go below thresh, AVF is taken offline
  Load balancing: weighted
  Group members:
    ca00.0156.0000 (184.108.40.206) local <-- Hardware MACs
    ca01.0156.0000 (220.127.116.11)
    cc02.0156.0000 (18.104.22.168)
  There are 3 forwarders (1 active)
  Forwarder 1
    State is Listen <-- All other AVFs Listen to us
    MAC address is 0007.b400.7b01 (learnt) <-- Virtual MAC
    Owner ID is ca01.0156.0000 <-- This is R2
    Redirection enabled, 598.928 sec remaining (maximum 600 sec) <--
    <-- ARP replies with this vMAC are being sent by AVG
    Time to live: 14398.376 sec (maximum 14400 sec)
    Preemption enabled, min delay 30 sec
    Active is 22.214.171.124 (primary), weighting 30 (expires in 8.368 sec) <--
    <-- The AVF reports its own IP as active
    Arp replies sent: 1
  Forwarder 2
    State is Active <-- Active means it’s us
      1 state change, last state change 03:12:45
    MAC address is 0007.b400.7b02 (default)
    Owner ID is ca00.0156.0000 <-- R1 MAC address
    Redirection enabled
    Preemption enabled, min delay 30 sec
    Active is local, weighting 50
    Arp replies sent: 1
  Forwarder 3
    State is Listen <-- All other AVFs Listen to us
    MAC address is 0007.b400.7b03 (learnt)
    Owner ID is cc02.0156.0000 <-- This is R3
    Redirection enabled, 597.916 sec remaining (maximum 600 sec)
    Time to live: 14397.916 sec (maximum 14400 sec)
    Preemption enabled, min delay 30 sec
    Active is 126.96.36.199 (primary), weighting 20 (expires in 7.916 sec)

Rack1R2#show glbp
FastEthernet0/0 - Group 123
  State is Standby
    4 state changes, last state change 03:16:56
  Virtual IP address is 188.8.131.52
  Hello time 3 sec, hold time 10 sec
    Next hello sent in 0.236 secs
  Redirect time 600 sec, forwarder time-out 14400 sec
  Preemption enabled, min delay 0 sec
  Active is 184.108.40.206, priority 100 (expires in 9.148 sec)
  Standby is local <-- We are the standby AVG
  Priority 50 (configured)
  Weighting 30 (configured 30), thresholds: lower 1, upper 30
  Load balancing: round-robin
  Group members:
    ca00.0156.0000 (220.127.116.11)
    ca01.0156.0000 (18.104.22.168) local
    cc02.0156.0000 (22.214.171.124)
  There are 3 forwarders (1 active)
  Forwarder 1
    State is Active
      1 state change, last state change 03:18:06
    MAC address is 0007.b400.7b01 (default)
    Owner ID is ca01.0156.0000 <-- This is R2
    Preemption enabled, min delay 30 sec
    Active is local, weighting 30
  Forwarder 2
    State is Listen
    MAC address is 0007.b400.7b02 (learnt)
    Owner ID is ca00.0156.0000
    Time to live: 14398.644 sec (maximum 14400 sec)
    Preemption enabled, min delay 30 sec
    Active is 126.96.36.199 (primary), weighting 50 (expires in 8.636 sec)
  Forwarder 3
    State is Listen
    MAC address is 0007.b400.7b03 (learnt)
    Owner ID is cc02.0156.0000
    Time to live: 14399.260 sec (maximum 14400 sec)
    Preemption enabled, min delay 30 sec
    Active is 188.8.131.52 (primary), weighting 20 (expires in 9.260 sec)
Now let’s check how ARP redirection works:
Rack1SW1#ping 184.108.40.206

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 220.127.116.11, timeout is 2 seconds:
..!!!
Success rate is 60 percent (3/5), round-trip min/avg/max = 8/12/16 ms

Rack1SW1#sh ip arp
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  18.104.22.168            0  0007.b400.7b01  ARPA   Vlan1
Internet  22.214.171.124           -  cc06.0156.0000  ARPA   Vlan1
Internet  126.96.36.199            0  ca01.0156.0000  ARPA   Vlan1

Rack1SW1#clear arp-cache
Rack1SW1#ping 188.8.131.52

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 184.108.40.206, timeout is 2 seconds:
.!!!!
Success rate is 80 percent (4/5), round-trip min/avg/max = 4/13/32 ms

Rack1SW1#sh ip arp
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  220.127.116.11           0  0007.b400.7b02  ARPA   Vlan1
Internet  18.104.22.168            -  cc06.0156.0000  ARPA   Vlan1
Internet  22.214.171.124           0  ca01.0156.0000  ARPA   Vlan1

Repeat the above actions a few more times

Rack1SW1#sh ip arp
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  126.96.36.199            0  0007.b400.7b03  ARPA   Vlan1
Internet  188.8.131.52             -  cc06.0156.0000  ARPA   Vlan1
Internet  184.108.40.206           0  ca01.0156.0000  ARPA   Vlan1
Internet  220.127.116.11           0  cc02.0156.0000  ARPA   Vlan1
To summarize, GLBP is a virtual gateway protocol with built-in load-balancing capabilities. Load balancing is based on manipulating the ARP responses to requests sent to the virtual gateway IP address. The AVG role responds to ARP requests and performs the load balancing. The AVF role manages one or more virtual MACs and is responsible for packet forwarding. AVG redundancy is controlled by the GLBP priority, and AVF redundancy is implemented using the AVF weight value and two additional timers.