Jun 30

Summer was in full swing, and it was over 105 degrees Fahrenheit outside. Bob was told it was a “dry heat”, but he thought “so is my oven”. Needless to say, Bob was glad to be in the data center, where the temperature and humidity controls kept it very cold. He had been asked to set up a basic route-map with BGP, and here is the diagram he worked from.

BGP Triangle
The goal was to modify BGP so that all traffic going towards the 1.1.1.0/24 network (which is sourced from AS1), traveling either from or through AS23, would use only the 13.0.0.0/24 segment (between R3 and R1), and not use the 10.0.0.0/24 segment (between R2 and R1) as a transit path.
Bob reviewed some of the BGP topics he had recently learned. Here is the list of possibilities he made:
R1 could prepend to the AS path for advertisements of the 1.1.1.0/24 prefix when they are sent from R1 to R2. This way, AS23 would see a shorter AS path, and therefore a better path, through R3 rather than R2. He tried this using the following on R1:

ip prefix-list JUST-1.1.1.0 seq 5 permit 1.1.1.0/24

route-map PRE-PEND permit 10
 match ip address prefix-list JUST-1.1.1.0
 set as-path prepend 1
route-map PRE-PEND permit 20

router bgp 1
 neighbor 10.0.0.2 route-map PRE-PEND out

Bob cleared the BGP session, just to be sure. Unfortunately, some traffic destined to 1.1.1.0/24 was still flowing over the 10.0.0.0/24 network between R2 and R1.

Bob decided to try another approach: instead of having R1 make AS23 think the path over 10.0.0.0/24 was worse, he would have R3 make the path over 13.0.0.0/24 look better. He considered weight, but then realized that weight is local to a single router, so it would only affect R3 and not every other device in AS23. So Bob decided to use local-preference, which is shared with iBGP peers. On R3, he configured an inbound route-map so that when a BGP update came in from R1 for 1.1.1.0/24, R3 would set the local-preference to 250 for that prefix, in the hope that traffic destined for 1.1.1.0/24 would go exclusively through the 13.0.0.0/24 segment between R3 and R1. Unfortunately, even with this change, Bob noticed that traffic destined to 1.1.1.0/24 from or through AS23 still crossed the link between R2 and R1.

Below are the configurations for R1, R2 and R3 along with the relevant show commands.

Can you assist Bob?   What can he do?  What did he do wrong, if anything?

Post your ideas and comments below!

R1:

version 12.4
hostname R1
interface Loopback0
 ip address 1.1.1.1 255.255.255.0
 ip ospf network point-to-point

interface FastEthernet0/0
 ip address 10.0.0.1 255.255.255.0
 ip ospf 1 area 1

interface FastEthernet1/0
 ip address 13.0.0.1 255.255.255.0
 ip ospf 1 area 1

router bgp 1
 no synchronization
 bgp log-neighbor-changes
 network 1.1.1.0 mask 255.255.255.0
 neighbor 10.0.0.2 remote-as 23
 neighbor 10.0.0.2 route-map PRE-PEND out
 neighbor 13.0.0.3 remote-as 23
 no auto-summary

ip prefix-list JUST-1.1.1.0 seq 5 permit 1.1.1.0/24

route-map PRE-PEND permit 10
 match ip address prefix-list JUST-1.1.1.0
 set as-path prepend 1

route-map PRE-PEND permit 20

R2:

version 12.4
hostname R2
interface FastEthernet0/0
 ip address 10.0.0.2 255.255.255.0
 ip ospf 1 area 1

interface FastEthernet0/1
 ip address 23.0.0.2 255.255.255.0
 ip ospf 1 area 1

router bgp 23
 no synchronization
 bgp log-neighbor-changes
 neighbor 10.0.0.1 remote-as 1
 neighbor 23.0.0.3 remote-as 23
 no auto-summary
!

R3:

version 12.4
hostname R3
interface FastEthernet0/0
 ip address 13.0.0.3 255.255.255.0
 ip ospf 1 area 1

interface FastEthernet0/1
 ip address 23.0.0.3 255.255.255.0
 ip ospf 1 area 1

router bgp 23
 no synchronization
 bgp log-neighbor-changes
 neighbor 13.0.0.1 remote-as 1
 neighbor 13.0.0.1 route-map SET-LOCAL-PREF in
 neighbor 23.0.0.2 remote-as 23
 no auto-summary

ip prefix-list LOCAL-PREF-250 seq 5 permit 1.1.1.0/24

route-map SET-LOCAL-PREF permit 10
 match ip address prefix-list LOCAL-PREF-250
 set local-preference 250

route-map SET-LOCAL-PREF permit 20

Show commands R1:

R1#show ip bgp summary
BGP router identifier 1.1.1.1, local AS number 1
BGP table version is 2, main routing table version 2
1 network entries using 120 bytes of memory
1 path entries using 52 bytes of memory
2/1 BGP path/bestpath attribute entries using 248 bytes of memory
0 BGP route-map cache entries using 0 bytes of memory
0 BGP filter-list cache entries using 0 bytes of memory
Bitfield cache entries: current 1 (at peak 2) using 32 bytes of memory
BGP using 452 total bytes of memory
BGP activity 2/1 prefixes, 2/1 paths, scan interval 60 secs

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.0.2        4    23      77      73        2    0    0 00:29:01        0
13.0.0.3        4    23      74      74        2    0    0 00:29:01        0

R1#show ip bgp
BGP table version is 2, local router ID is 1.1.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network          Next Hop            Metric LocPrf Weight Path
*> 1.1.1.0/24       0.0.0.0                  0         32768 i

R1#show ip route | begin resort
Gateway of last resort is not set

1.0.0.0/24 is subnetted, 1 subnets
C       1.1.1.0 is directly connected, Loopback0
23.0.0.0/24 is subnetted, 1 subnets
O       23.0.0.0 [110/2] via 13.0.0.3, 00:48:43, FastEthernet1/0
                 [110/2] via 10.0.0.2, 00:48:09, FastEthernet0/0
10.0.0.0/24 is subnetted, 1 subnets
C       10.0.0.0 is directly connected, FastEthernet0/0
13.0.0.0/24 is subnetted, 1 subnets
C       13.0.0.0 is directly connected, FastEthernet1/0

Show commands R2:

R2#show ip bgp summary
BGP router identifier 2.2.2.2, local AS number 23
BGP table version is 14, main routing table version 14
1 network entries using 120 bytes of memory
2 path entries using 104 bytes of memory
3/1 BGP path/bestpath attribute entries using 372 bytes of memory
2 BGP AS-PATH entries using 48 bytes of memory
0 BGP route-map cache entries using 0 bytes of memory
0 BGP filter-list cache entries using 0 bytes of memory
Bitfield cache entries: current 1 (at peak 2) using 32 bytes of memory
BGP using 676 total bytes of memory
BGP activity 1/0 prefixes, 4/2 paths, scan interval 60 secs

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.0.1        4     1      73      77       14    0    0 00:29:07        1
23.0.0.3        4    23      71      73       14    0    0 01:04:54        1

R2#show ip bgp
BGP table version is 14, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network          Next Hop            Metric LocPrf Weight Path
*>i1.1.1.0/24       13.0.0.1                 0    250      0 1 i
*                   10.0.0.1                 0             0 1 1 i

R2#show ip route | begin resort
Gateway of last resort is not set

1.0.0.0/24 is subnetted, 1 subnets
B       1.1.1.0 [200/0] via 13.0.0.1, 00:28:37
2.0.0.0/24 is subnetted, 1 subnets
C       2.2.2.0 is directly connected, Loopback0
23.0.0.0/24 is subnetted, 1 subnets
C       23.0.0.0 is directly connected, FastEthernet0/1
10.0.0.0/24 is subnetted, 1 subnets
C       10.0.0.0 is directly connected, FastEthernet0/0
13.0.0.0/24 is subnetted, 1 subnets
O       13.0.0.0 [110/2] via 23.0.0.3, 00:48:16, FastEthernet0/1
                 [110/2] via 10.0.0.1, 00:49:19, FastEthernet0/0

Show commands R3:

R3#show ip bgp summary
BGP router identifier 3.3.3.3, local AS number 23
BGP table version is 6, main routing table version 6
1 network entries using 120 bytes of memory
1 path entries using 52 bytes of memory
3/1 BGP path/bestpath attribute entries using 372 bytes of memory
1 BGP AS-PATH entries using 24 bytes of memory
0 BGP route-map cache entries using 0 bytes of memory
0 BGP filter-list cache entries using 0 bytes of memory
Bitfield cache entries: current 1 (at peak 2) using 32 bytes of memory
BGP using 600 total bytes of memory
BGP activity 1/0 prefixes, 5/4 paths, scan interval 60 secs

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
13.0.0.1        4     1      74      74        6    0    0 00:29:09        1
23.0.0.2        4    23      73      71        6    0    0 01:04:56        0

R3#show ip bgp
BGP table version is 6, local router ID is 3.3.3.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network          Next Hop            Metric LocPrf Weight Path
*> 1.1.1.0/24       13.0.0.1                 0    250      0 1 i

R3#show ip route | begin resort
Gateway of last resort is not set

1.0.0.0/24 is subnetted, 1 subnets
B       1.1.1.0 [20/0] via 13.0.0.1, 00:28:39
3.0.0.0/24 is subnetted, 1 subnets
C       3.3.3.0 is directly connected, Loopback0
23.0.0.0/24 is subnetted, 1 subnets
C       23.0.0.0 is directly connected, FastEthernet0/1
10.0.0.0/24 is subnetted, 1 subnets
O       10.0.0.0 [110/2] via 23.0.0.2, 00:48:18, FastEthernet0/1
                 [110/2] via 13.0.0.1, 00:48:48, FastEthernet0/0
13.0.0.0/24 is subnetted, 1 subnets
C       13.0.0.0 is directly connected, FastEthernet0/0

Best wishes.

 

 

And the answer is:

Thanks to you, and your 50+ posts, Bob got his answer. By reading your responses, Bob learned the following:

For R2, the BGP next hop for the best route is still 13.0.0.1, even though the route is learned from R3. By default, R3 does not change the next-hop attribute when it passes a route learned from an eBGP neighbor (R1) on to an iBGP peer (R2). R2 therefore has to do a recursive lookup for 13.0.0.1, and since it has 2 equal-cost OSPF paths to that next hop, R2 load balances traffic going to 1.1.1.0/24 over the 10.0.0.0/24 and 23.0.0.0/24 networks.
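
To see the recursive lookup for himself, Bob could run a few checks on R2. This is only a sketch: the commands are standard IOS, but choosing these particular ones is an assumption, and the output is deliberately left out.

```
! On R2: how does the BGP next hop 13.0.0.1 resolve?
! Two "via" entries here mean two equal-cost OSPF paths.
R2#show ip route 13.0.0.1
! Which next hops does CEF use for the destination?
R2#show ip cef 1.1.1.1
! Does the first hop alternate between 10.0.0.1 and 23.0.0.3?
R2#traceroute 1.1.1.1
```

If the first command lists both 23.0.0.3 and 10.0.0.1, the BGP table can be perfectly correct and traffic will still be shared across both links.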

One solution would be to have R3 use next-hop-self for updates sent to R2.  Then R2 would see the next hop as 23.0.0.3, and all the traffic would be forwarded to R3 as desired.

The update on R3 would look like this:

router bgp 23
 neighbor 23.0.0.2 next-hop-self

This would cause R2 to have the following BGP table:

R2#show ip bgp
BGP table version is 4, local router ID is 2.2.2.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

Network          Next Hop            Metric LocPrf Weight Path
*>i1.1.1.0/24       23.0.0.3                 0    250      0 1 i
*                   10.0.0.1                 0             0 1 1 i
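
Depending on the IOS release, the new next hop may not reach R2 until R3 re-sends its updates. As a hedged sketch (assuming the same addresses as above, with output omitted), an outbound soft reset on R3 followed by a check on R2 could look like this:

```
! On R3: re-send updates to R2 without tearing down the session
R3#clear ip bgp 23.0.0.2 soft out
! On R2: the best path for 1.1.1.0/24 should now show next hop 23.0.0.3
R2#show ip bgp 1.1.1.0
! On R2: the first hop should now always be 23.0.0.3
R2#traceroute 1.1.1.1
```

A soft reset is used here rather than a hard clear so that the iBGP session between R2 and R3 stays up.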

Another option would be to increase the OSPF cost on R2's 10.0.0.0/24 interface, so that the path through R1 would no longer be an equal-cost path to 13.0.0.1 (the BGP next hop before the next-hop-self change described above).
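
A minimal sketch of that OSPF change on R2 follows. The cost value of 100 is illustrative only, borrowed from one of the responses below. Any cost that makes the path via R1 more expensive than the path via R3 (cost 2 in the output above) should have the same effect.

```
! On R2: make the link toward R1 less attractive to OSPF
interface FastEthernet0/0
 ip ospf cost 100
```

With this in place, R2 should keep only the route to 13.0.0.0/24 via 23.0.0.3, while the 10.0.0.0/24 link remains available if the path through R3 fails.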

Thanks everyone for all your assistance!    You rock.



56 Responses to “BGP Path Manipulation: Bob is at it again…”

 
  1. Alan Villegas says:

    The problem is with the next hop: it is reachable from R2 through OSPF and has 2 paths (R1 and R3). Next-hop-self on the R3-R2 BGP session could fix the issue.

    Regards

  2. Mark Lah says:

    The problem isn’t with Bob’s implementation of local pref and as-path prepending, but rather the fact that the iBGP session between R2 and R3 requires the “next-hop-self” command applied for everything to work.

    R2 has converged correctly, and the BGP table is correct. However, the next-hop for the best BGP route (from the iBGP session from R3) is the next-hop IP address of the R3 eBGP session to R1 (next hop of 13.0.0.1). Since OSPF is also being used to advertise routes, R2 uses the OSPF route to the 13.0.0.0/24 network to resolve the L2 next-hop address for the best BGP route, which is of course over the wrong path. Applying the “neighbor 23.0.0.x next-hop-self” command to R2 and R3 would resolve the problem.

  3. prakash says:

    This is the PROBLEM

    O 13.0.0.0 [110/2] via 23.0.0.3, 00:48:16, FastEthernet0/1
    [110/2] via 10.0.0.1, 00:49:19, FastEthernet0/0

  4. prakash says:

    solution : on R2 change next hop to that route pointing to R3

  5. Sadek says:

    Hi Keith
    I think using best-path options like weight, local-preference, AS path or MED will make 13.0.0.0 the preferred path, but will keep 10.0.0.0 as a backup.
    So if you never want to use 10.0.0.0 for this prefix, just do not advertise it to AS23 via 10.0.0.0.
    We can use a route-map, a prefix-list or any other technique for this, applied outbound.
    ip prefix-list NET1 deny 1.1.1.0/24
    ip prefix-list NET1 permit 0.0.0.0/0 le 32
    and under BGP
    neighbor 10.0.0.2 prefix-list NET1 out

    thanks

  6. Hemanth Raj says:

    On R1, configure a higher MED value toward neighbor R2 and a lower MED value toward neighbor R3.

    access-list 1 permit 1.1.1.0 0.0.0.255
    route-map R1toR2 permit 10
    match ip address 1
    set metric 50
    route-map R1toR2 permit 20

    route-map R1toR3 permit 10
    match ip address 1
    set metric 10
    route-map R1toR3 permit 20

    router bgp 1
    neighbor 10.0.0.2 route-map R1toR2 out
    neighbor 13.0.0.3 route-map R1toR3 out

    Then traffic will correctly pass through the R3-to-R1 link.

  7. hemanth raj says:

    ip prefix-list LOCAL-PREF-250 seq 5 permit 1.1.1.0/24

    I can't understand this command. What do you mean by seq 5?

  8. M. says:

    13.0.0.1 is load-balanced by R2 between R1 and R3, so:
    R3:
    router bgp 23
    neighbor 23.0.0.2 next-hop-self

  9. fpaulino says:

    hi keith,

    I think it is because of the administrative distance. Bob has configured OSPF on the three routers, so when he changes the local pref to 250 on R3, R2 would prefer the route via OSPF with an administrative distance of 110, since the route via BGP will change to 200 because it was learned from iBGP. Is this correct?

  10. John says:

    Uhm… why are the ASs talking IGP?

    That’s a rhetorical question, “Bob”.

  11. Kristin says:

    My guess is that this is not achieving the desired result because R2 is doing a recursive lookup on 13.0.0.1, and is learning about 2 equal cost links (through R1 and R3) through OSPF. I believe you can use next-hop-self on R3 toward R2 to fix this issue.

  12. BET says:

    He needs to make sure the BGP next-hop address 13.0.0.1 is reachable only through 23.0.0.3, not through both 23.0.0.3 and 10.0.0.1 as in the R2 output below.

    13.0.0.0/24 is subnetted, 1 subnets
    O 13.0.0.0 [110/2] via 23.0.0.3, 00:48:16, FastEthernet0/1
    [110/2] via 10.0.0.1, 00:49:19, FastEthernet0/0

    He has to modify the cost for the OSPF network.

    Thanks

  13. The link between R1-R2 and R1-R3 must not run OSPF. The BGP next hop on R2 is recursing to the OSPF route, which is being load balanced to R1 and R3, and thus some traffic is using the R1-R2 link.

  14. Joshua Reichmann says:

    Manipulate the OSPF metric coming from R1 so that when R2 does a lookup on the BGP next hop value (13.0.0.1) there will be only one path, through R3.

  15. minty says:

    The 3 routers are running ospf, so router R2 is learning the 13.0.0.0/24 network from ospf via 10.0.0.1. To negate this add the following to R2.

    router ospf 1
    distribute-list prefix ospf in FastEthernet0/0
    ip prefix-list ospf seq 5 deny 13.0.0.0/24
    ip prefix-list ospf seq 10 permit 0.0.0.0/0 le 32

    Router R2 then does not add the route to 13.0.0.0/24 via 10.0.0.1.

    R2#sh ip route | beg resort
    Gateway of last resort is not set

    1.0.0.0/24 is subnetted, 1 subnets
    B 1.1.1.0 [200/0] via 13.0.0.1, 00:05:51
    23.0.0.0/24 is subnetted, 1 subnets
    C 23.0.0.0 is directly connected, FastEthernet0/1
    10.0.0.0/24 is subnetted, 1 subnets
    C 10.0.0.0 is directly connected, FastEthernet0/0
    13.0.0.0/24 is subnetted, 1 subnets
    O 13.0.0.0 [110/20] via 23.0.0.3, 00:06:37, FastEthernet0/1

    If the link between R1 and R3 fails, the traffic will go via the 10.0.0.0/24 network.

    Ian

  16. Andre says:

    Perhaps.. next-hop-self is currently missing on R3?

  17. GFC_CABRERO says:

    R2 was sending traffic using its link to R1, why ?

    R—2#sh ip bgp
    BGP table version is 3, local router ID is 10.0.0.2
    Status codes: s suppressed, d damped, h history, * valid, > best, i – internal,
    r RIB-failure, S Stale
    Origin codes: i – IGP, e – EGP, ? – incomplete
    Network Next Hop Metric LocPrf Weight Path
    *>i1.1.1.0/24 13.0.0.1 0 250 0 1 i
    * 10.0.0.1 0 0 1 1 i

    Preferred route to R3, awesome, but… lets see my route to R3:

    R—2#sh ip route 13.0.0.1
    Routing entry for 13.0.0.0/24
    Known via "ospf 1", distance 110, metric 2, type intra area
    Last update from 10.0.0.1 on FastEthernet0/0, 00:18:32 ago
    Routing Descriptor Blocks:
    * 10.0.0.1, from 1.1.1.1, 00:18:32 ago, via FastEthernet0/0
    Route metric is 2, traffic share count is 1
    23.0.0.3, from 1.1.1.1, 00:18:32 ago, via FastEthernet1/0
    Route metric is 2, traffic share count is 1

    hummm…. thu R1 ?!?! well… OSPF ;)

    lets see the traceroute how it goes:

    R—2#traceroute 1.1.1.1
    Type escape sequence to abort.
    Tracing the route to 1.1.1.1
    1 10.0.0.1 68 msec
    23.0.0.3 188 msec
    10.0.0.1 92 msec

    yeah… I was using the link through R1, and this is not good, so… let's change one thing in BGP to fix this issue:

    on R3 i changed the following:

    R—3(config-router)#neighbor 23.0.0.2 next-hop-self << peering with R2, same AS

    NOW, for all routes that R3 sends to R2, R3 is the next hop, never mind OSPF :D

    R—2#sh ip route 1.1.1.1
    Routing entry for 1.1.1.0/24
    Known via "bgp 23", distance 200, metric 0
    Tag 1, type internal
    Last update from 23.0.0.3 00:00:37 ago
    Routing Descriptor Blocks:
    * 23.0.0.3, from 23.0.0.3, 00:00:37 ago
    Route metric is 0, traffic share count is 1
    AS Hops 1

    R—2#traceroute 1.1.1.1
    Type escape sequence to abort.
    Tracing the route to 1.1.1.1
    1 23.0.0.3 40 msec 96 msec 44 msec
    2 13.0.0.1 128 msec 188 msec *

    NOw, the link between R1 and R2 is not currently in use, am i right ? if not, at least i tried :D

  18. Peter Y says:

    My guess…

    While R2 has the correct next-hop IP of 13.0.0.1 (R1's address on the link to R3) for the 1.1.1.0/24 network, it's learning the path to 13.0.0.1 via OSPF, which happens to have 2 equal-cost paths to it, 1 of which is through R1.

    Hence, the traffic on the R1-R2 link.

  19. Leo says:

    Keith,

    I did not have a time to lab this up, but by looking to the outputs provided, I have some thoughts:

    Issue:
    The issue may be related to the OSPF route that R2 has on its routing table:

    O 13.0.0.0 [110/2] via 23.0.0.3, 00:48:16, FastEthernet0/1
    [110/2] via 10.0.0.1, 00:49:19, FastEthernet0/0

    Since the next-hop to reach 1.1.1.0/24 is 13.0.0.1, and R2 does a recursive lookup, it may be “load-sharing” between the two links and, thus, using both 13.0.0.0/24 and 10.0.0.0/24 segments

    Solution

    I may not be right here, but I think about two solutions:

    1) tweak the ospf cost on the R2 interface to a higher value, since OSPF does not load-share via unequal cost paths, this would remove the following entry from the routing table on R2:

    [110/2] via 10.0.0.1, 00:49:19, FastEthernet0/0

    2) a fancier solution would be to use next-hop-self on R3 neighbor statement to R2, so R3 would update the next-hop value to 23.0.0.3. With this, R2 would recurse to its directly connected route going only to Fas0/1

    23.0.0.0/24 is subnetted, 1 subnets
    C 23.0.0.0 is directly connected, FastEthernet0/1

    Not sure if this is right. But I just tried, hope I got something right at least. It was very nice. Thank you very much for another good blog post.

    Kind Regards,
    Leo

  20. Prashant says:

    Hello,

    From R2, the next hop for route 1.1.1.0/24 is 13.0.0.1.

    In the routing table, 13.0.0.0 is reachable through OSPF with two equal paths, 23.0.0.3 and 10.0.0.1.

    Hence, to reach destination 1.1.1.0/24, one packet goes towards 23.0.0.3, another towards 10.0.0.1, and so on.

    Change(Increase) the OSPF cost on link between R2-R1 with reference-bandwidth command.

    Regards

  21. Michael Kartvelishvili says:

    Hello,

    The problem here is the next-hop recursion of BGP.
    It is sufficient to view "show ip route 13.0.0.1" on R2. It will show that traffic to the BGP next hop is actually load balanced over 10.0.0.1 and 23.0.0.3. So roughly half of the overall traffic will pass through the R1-R2 link.

    The solution is to either do a next-hop-self on R2 toward R3:
    !!!!!R2
    router bgp 23
    neighbor 23.0.0.3 next-hop-self

    or tune OSPF costs to make path through R1-R3 preferred (from OSPF point of view) for 13.0.0.0/24 destination:
    !!!!!R2
    interface fastethernet0/0
    ip ospf cost 100

    Best Regards,
    Michael

  22. Michael Kartvelishvili says:

    Sorry… I made a mistake in the first solution. It should look like:

    The solution is to either do a next-hop-self on R3 toward R2:
    !!!!!R3
    router bgp 23
    neighbor 23.0.0.2 next-hop-self

  23. Tarun says:

    The reason traffic is going though the link between R2 to R1 for the destination 1.1.1.0/24 is the next hop for the route is pointing towards 13.0.0.1 and to reach 13.0.0.1 it has two equal cost route in OSPF routing table. One is to R1 and second is for R3.

    Solution – Use Next-hop-self command on R3.

  24. Prashant says:

    Change(Increase) the OSPF cost on link between R2-R1 also set the reference-bandwidth.

    Regards

    Prashant

  25. Sergio says:

    R2 and R3 are in the same AS so they are IBGP peers.

    By default, a router does not change the next hop of an eBGP-learned route when advertising it to an iBGP peer.

    On R2, the route B 1.1.1.0 is going through 13.0.0.1, and network 13.0.0.0/24 is known via the IGP and is load shared via R1 and R3.

    O 13.0.0.0 [110/2] via 23.0.0.3, 00:48:16, FastEthernet0/1
    [110/2] via 10.0.0.1, 00:49:19, FastEthernet0/0

    To allow R3 to announce the 1.1.1.0/24 subnet with its own address as next hop inside its own AS, add the following command in the router bgp section on R3:

    router bgp 23
    neighbor 23.0.0.2 next-hop-self
    end

    R2 will then route packets to 1.1.1.1 via R3.

    Serge.

  26. Gavin says:

    R3 doesn’t change the next-hop address of the BGP neighbor 13.0.0.1 when advertised to its iBGP neighbor R2 and as such BGP next-hop recursion for 13.0.0.1 is via the shortest path from R2 using the IGP or OSPF, across the fast0/0 link.

  27. Laurent says:

    The problem is that AS1 and AS23 are defined in the same OSPF area.

    R2 learns the path towards 1.1.1.0/24 via R3 (iBGP) and will prefer it over its eBGP path via R1 (AS-Path is shorter or local-pref is higher). Though, the BGP next-hop (NH) is 13.0.0.1 and the shortest-path towards this NH is via 10.0.0.1 (learned via OSPF).

    I see two solutions :

    1) Modify R3's configuration to apply a next-hop-self policy. This will indeed force the traffic coming from R2 to go via R3, as the BGP NH will now be R3.

    2) Increase the OSPF metric on the direct link between R2 and R1 so that it is not preferred to reach 13.0.0.1

  28. JL says:

    Hi Keith

    I think we can increase the OSPF cost on the FE link between R1 and R2 so that R2's routing table shows only one path to the 13.0.0.0/24 subnet, via 23.0.0.3. Currently, when R2 does a recursive route lookup on the BGP next hop 13.0.0.1, it sees 2 equal paths, so some traffic destined for 1.1.1.0/24 travels over the 10.0.0.0/24 network.

    Or we can configure “neighbor 23.0.0.2 next-hop-self” on R3 under router bgp 23, so that a show ip bgp on R2 looks like the output below. When R2 does a lookup on the next hop, there is only one path, via the directly connected interface.

    R2#show ip bgp
    BGP table version is 14, local router ID is 2.2.2.2
    Status codes: s suppressed, d damped, h history, * valid, > best, i – internal,
    r RIB-failure, S Stale
    Origin codes: i – IGP, e – EGP, ? – incomplete

    Network Next Hop Metric LocPrf Weight Path
    *>i1.1.1.0/24 23.0.0.3 0 250 0 1 i
    * 10.0.0.1 0 0 1 1 i

    Based on the IP routing table on R3, traffic from R3 should use the 13.0.0.0/24 network towards 1.1.1.0/24. We just need to ensure that traffic from R2, or transiting AS 23 through R2, goes to R3.

    Please correct me if I am wrong or have missed anything.

    Thanks

  29. Nikita says:

    next-hop-self on R3, so the recursive lookup for the BGP next hop won't lead to 13.0.0.1 (an address to which we have ECMP from OSPF)

  30. Bob Crawford says:

    According to the route table for R2, the next hop for 1.1.1.0 is 13.0.0.1

    At first glance, this SEEMS to meet the design goal, but . . . .

    When R2 does a recursive lookup for 13.0.0.1 it finds that OSPF is load balancing for 13.0.0.1 and sends some traffic to 23.0.0.3 and some to 10.0.0.1, thus defeating the design goal of having no traffic cross the 10.0.0.0/24 link.

  31. I think the problem occurs because of iBGP next-hop processing on R2.
    13.0.0.0/24 is subnetted, 1 subnets
    O 13.0.0.0 [110/2] via 23.0.0.3, 00:48:16, FastEthernet0/1
    [110/2] via 10.0.0.1, 00:49:19, FastEthernet0/0

    bgp next hop for 1.1.1.0/24 is known via 13.0.0.1.
    >i1.1.1.0/24 13.0.0.1 0 250 0 1 i
    So the router performs load balancing over the R1-R2 and R2-R3 links.

    Bob can fix his issue by modifying ospf metric for the undesired link.

  32. BET says:

    if the only allowed modification is on bgp he can use

    Neighbor 23.0.0.2 next-hop-self command in R3 to make the bgp next hop IP address in R2 23.0.0.3 rather than 13.0.0.1 which R2 has two route to it through OSPF

    13.0.0.0/24 is subnetted, 1 subnets
    O 13.0.0.0 [110/2] via 23.0.0.3, 00:48:16, FastEthernet0/1
    [110/2] via 10.0.0.1, 00:49:19, FastEthernet0/0

    Thanks

  33. Rashid says:

    With the current configuration, when R3 passes the update for prefix 1.1.1.0/24, it does not modify the next hop (which is the default behavior), so the next hop for prefix 1.1.1.0/24 on R2 shows 13.0.0.1, which in turn is learned via OSPF through both 23.0.0.3 and 10.0.0.1, hence the traffic on the R1-R2 link.
    To fix this on R3

    R3
    router bgp 23
    neighbor 23.0.0.2 next-hop-self

  34. Melvin Reyes says:

    Easy..

    R2 prefer OSPF route because iBGP administrative distance is 200 and all we know OSPF is 110, so if you don’t want to modify the ospf configuration do the following.

    R2
    router bgp 23
    distance bgp 20 20 20

    And because R3 pass all prefix to R2 with the next-hop 13.0.0.1, you need to change to 23.0.0.3. do the following.

    router bgp 23
    neighbor 23.0.0.2 next-hop-self

    and that its…

  35. Mark Mahan says:

    R2 has equal cost OSPF routes to the BGP next hop of 13.0.0.1 through R1 and R3. changing the OSPF cost on R2 for F0/0 to >1 will fix it.

  36. Mark Mahan says:

    Or next hops self on R3….

  37. Ahmed Harras says:

    R2 is learning the BGP route from R3 with next hop address 13.0.0.1. R2 has two equal OSPF routes to reach 13.0.0.1, one through the 13.0.0.0 segment, and the other one through the 10.0.0.0 segment. The solution is to configure R3 to set next hop self for the BGP routes advertised to R2.

  38. SlapChop says:

    The problem here is the recursive lookup performed on R2 for 1.1.1.0, next-hop for this network will be resolved into two destinations learned over OSPF:

    O 13.0.0.0 [110/2] via 23.0.0.3, 00:48:16, FastEthernet0/1
    [110/2] via 10.0.0.1, 00:49:19, FastEthernet0/0

    which results in per-flow load-balancing (assuming enabled CEF). The easiest fix would be to eliminate OSPF all together and add next-hop-self on R2 and R3 for iBGP session.

  39. Matthias says:

    Hey Bob! Try to use the next-hop-self feature for you iBGP-session. This will solve your problem.

    Well, but why? What’s wrong wrong with the original next hop? Have a closer look!

  40. Yasser says:

    Hi Keith,

    Is the loopback on R1 meant to be advertised into OSPF? Because if so, the OSPF route should pre-empt the BGP route on R2?

    If it wasn’t meant to be advertised, then i would think the problem is as follows:

    BGP config looks correct, but the next-hop IP sent along with the BGP update, will recurse via IGP (OSPF in this case). OSPF is telling the forwarding engine to install 2 routes due to equal cost. And this, i think is the problem. On R1, the route back to 23.0.0.0 (the source of the AS23 traffic) has 2 equal cost routes and will alternate per flow.

    Solution:

    Can’t simply “cost” the 10.0.0.0 link higher in OSPF as all traffic would go via the 13.0.0.0 link. Probably set up a GRE tunnel between R3 and R1 and set the IP next-hop in bgp for 1.1.1.0 to the tunnel interface on R3 and R1. Advertise the tunnel IP into OSPF so that R2 has reachability to the tunnel interface.

  41. Shantanu Mondal says:

    I think R2's IGP route is the problem. R2 has 1.1.1.0/24 going towards 13.0.0.1. That's all nice and good. But if you do a sh ip route for 13.0.0.0/24 (aka recursive route lookup) you see that you have two paths in the routing table (via OSPF); one to R1 and one to R3. Naturally it will load balance the traffic.

    Change the OSPF cost on R2's Fa0/1 link to 100 and R2 will take only fa0/0 to reach 13.0.0.1.

  42. Marcel Lammerse says:

    In the AS23-AS1 direction :

    R3 is missing the next-hop-self parameter on the 'neighbor 23.0.0.2' command, to correctly set the desired next-hop to be R3. In the current configuration, R1 is the next-hop, which is reached from R2 through OSPF and therefore takes the link between R1 and R2.

    In the AS1-AS23 direction :

    R1 is not learning any information from R2 and R3 through BGP, which would need to be fixed if BGP is the protocol to be used for this traffic engineering exercise.

    One way of doing that is to originate the 23.0.0.0/24 network in BGP rather than OSPF and AS-path prepend the AS23 prefix towards AS1 on the R1-R2 link.

    Other methods could be, setting the MED on the AS23 prefix towards AS1 (keeping in mind that bgp deterministic-med is required) or setting the local preference for the AS23 prefix on R1.

  43. Amit says:

    Yes, the issue is that when R2 performs recursive lookup for 13.0.0.0, it finds two paths to it- one through R3 and one through R1. Hence, R2-R1 will see traffic to 1.1.1.0/24 network.

    Solution is to increase the cost of R1-R2 link.

  44. Naasif says:

    I agree with the dude Yasser. Just a difference that I would apply is to not advertise the Tunnel Interface into OSPF (otherwise problem of load balancing will occur) but rather have some static routes from R3 to the 1.1.1.0/24 network pointing to the Tunnel interface as well as a static on R2 for the Tunnel network to ensure next hop reachability via BGP.

  45. Bobby says:

    Hi Keith,

    Here IGP is getting preference over BGP. Since we are using OSPF, it is performing load balancing between the two links.

    In order to use 13.0.0.0/24 segment, increase the cost of this link to 255 & decrease the cost of 10.0.0.0/24 segment to 0.

    This will solve our problem.

  46. Rajesh Kumar Prasad says:

    The problem is the next hop address at R2, which is 13.0.0.1. R2 learns a path to that next hop from both R1 and R3. Whenever a packet leaves for 1.1.1.1, the traffic gets load balanced according to CEF as configured on R2, so some traffic goes directly to R1 and some via R3.

    To make packets take R2-R3-R1 only, we need to stop the recursive next-hop calculation on router R2, so that the next hop is 23.0.0.3 and not 13.0.0.1.

    And the packet will go to 1.1.1.1/24 via R2-R3-R1 only

    configuration;

    At router R3
    router bgp 23
    neighbor 23.0.0.2 next-hop-self

    This is the right solution for Bob to reach to network 1.1.1.1/24 via R2-R3-R1

  47. Prasad Desai says:

    The problem is due to OSPF, the BGP next-hop is R1 only which is learned via R3 on R2, but due to OSPF metric the physical traffic is going via R2-R1 link, insted on R2-R3-R1…

    Regards,
    Prasad Desai
    CCIE R&S

  48. Shailen Mehta says:

    The problem will be resolved by adding one command to R3

    neighbor 23.0.0.2 next-hop-self

    R3 will advertise eBGP routes to iBGP peer R2.

    Another issue is that the AD of OSPF is 110 and iBGP is 200. Add a command on R2
    to change the AD.

    distance bgp 20 20 20

    Now the next hop is accessible via iBGP, which is R3, and traffic will always use the link between R1 and R3, i.e. the 13.0.0.0/24 segment

  49. mark says:

    R2 and R3 need next-hop-self configured so that when each advertises the route out to its iBGP neighbors, it will replace the next hop with its own IP.
    Then for the R1-R2 and R1-R3 links we can implement ‘passive interface’ on OSPF. We don’t need to run an IGP on eBGP links.

  50. Thanks to everyone who responded. A solution is provided at the end of the article. Bob is grateful!

  51. Yasser says:

    It satisfies the requirement to do next-hop-self on R3 towards R2, but the traffic from R1 back to the 23.0.0.0/24 subnet would still cause asymmetrical routing, right? Is there a way to fix the asymmetrical routing issue using OSPF, but only for traffic sourced from 1.1.1.0/24? (Not using policy-based routing)

  52. Yasser -

    Great question. We could do exactly that, with the following:

    The interface R1 uses to connect to R2 is Fa0/0. If we create a prefix-list that denies 23.0.0.0/24 and permits everything else, we can apply that prefix-list as part of an inbound distribute-list on R1 Fa0/0, and only that single route learned on that single interface will be kept out of the routing table on R1.

    ip prefix-list NO-23-FROM-R2 seq 5 deny 23.0.0.0/24
    ip prefix-list NO-23-FROM-R2 seq 10 permit 0.0.0.0/0 le 32

    router ospf 1
    distribute-list prefix NO-23-FROM-R2 in fa0/0

    This would cause R1 to have only 1 route for the return to 23.0.0.0

    Another option would be to use a route-map, that filters that specific route, based on the router that originated the LSA. Either option would work for solving the asymmetrical return.

    Thanks for the question.

  53. Hudson says:

    I am having a problem with that configuration. After changing the OSPF cost on R2 interface (10.0.0.0), R2 prefers the path through R3. I set next-hop-self on R3…

    But R2 chooses to install the OSPF route rather than a BGP one in the routing table:

    BGP table version is 55, local router ID is 23.0.0.2
    Status codes: s suppressed, d damped, h history, * valid, > best, i – internal,
    r RIB-failure, S Stale
    Origin codes: i – IGP, e – EGP, ? – incomplete

    Network Next Hop Metric LocPrf Weight Path
    r>i1.1.1.0/24 23.0.0.3 0 250 0 1 i
    r 10.0.0.1 0 0 1 i
    R2(config-if)#

    R2(config-if)#do sh ip route 1.1.1.0
    Routing entry for 1.1.1.0/24
    Known via "ospf 1", distance 110, metric 3, type intra area
    Last update from 23.0.0.3 on FastEthernet1/1, 00:03:17 ago
    Routing Descriptor Blocks:
    * 23.0.0.3, from 1.1.1.1, 00:03:17 ago, via FastEthernet1/1
    Route metric is 3, traffic share count is 1
    R2(config-if)#

  54. Hudson -

    The OSPF route to 1.1.1.0 now will only use R3 (based on your config). R3 will forward the traffic to R1. That sounds like the original goal.

  55. Zahid says:

    I have put only one command in both R2 and R3, that is redistribute connected and it’s solved.

 
