Monday, December 19, 2016

Advanced Cisco Routing: BGP Route Reflectors

Suppose your network uses iBGP (internal BGP) to carry routes within your autonomous system. Because an iBGP router will not re-advertise a route learned from one iBGP peer to another iBGP peer (i.e., if R1 learns a route from R2, it will not share that route with R3, R4 or R5), your network must be a full mesh, like so:


While this is very robust, it is neither scalable nor efficient. Given a network of n nodes, you must create n(n - 1)/2 physical connections, each with an IP address on both sides, plus a "neighbor ... remote-as ..." statement, a "neighbor ... activate" statement, and a "network ... mask ..." statement in the BGP config of the router at each end. When you are talking about just a handful of routers, that's not too terribly bad, but as your network grows, it becomes rather cumbersome. For example, here are the interface configs and BGP config for R1 in the full-mesh network shown above:

interface Loopback0
ip address 10.254.254.1 255.255.255.255
!
interface Loopback10
ip address 192.168.1.1 255.255.255.0
!
interface FastEthernet1/0
ip address 10.1.2.1 255.255.255.252
!
interface FastEthernet1/1
ip address 10.1.3.1 255.255.255.252
!
interface FastEthernet2/0
ip address 10.1.4.2 255.255.255.252
!
interface FastEthernet2/1
ip address 10.1.5.2 255.255.255.252
!
router bgp 65510
bgp router-id 10.254.254.1
bgp log-neighbor-changes
neighbor 10.1.2.2 remote-as 65510
neighbor 10.1.3.2 remote-as 65510
neighbor 10.1.4.1 remote-as 65510
neighbor 10.1.5.1 remote-as 65510
!
address-family ipv4
neighbor 10.1.2.2 activate
neighbor 10.1.3.2 activate
neighbor 10.1.4.1 activate
neighbor 10.1.5.1 activate
no auto-summary
no synchronization
network 10.1.2.0 mask 255.255.255.252
network 10.1.3.0 mask 255.255.255.252
network 10.1.4.0 mask 255.255.255.252
network 10.1.5.0 mask 255.255.255.252
network 10.254.254.1 mask 255.255.255.255
network 192.168.1.0
exit-address-family
!

Ugh...that's a lot of configuration, and a lot of chances to make a mistake...and that's only on a network with 5 routers! The SMALL ISP that I used to work for had 25 to 30 routers on our Internet service network. Imagine what a full-mesh config on one of those routers would look like!
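To put some numbers on that, here is a minimal Python sketch of how the full mesh grows, counting each router-to-router connection once. The function names are my own (they are not from any library), and the sketch assumes one BGP session per pair of routers, as in the lab above:

```python
def full_mesh_sessions(n: int) -> int:
    """Total iBGP sessions needed to fully mesh n routers: n(n - 1) / 2."""
    if n < 0:
        raise ValueError("router count cannot be negative")
    return n * (n - 1) // 2


def neighbors_per_router(n: int) -> int:
    """Neighbor statements each router needs in a full mesh of n routers."""
    return max(n - 1, 0)


if __name__ == "__main__":
    # 5 routers is the lab above; 25 and 30 are the small-ISP sizes mentioned.
    for routers in (5, 25, 30):
        print(
            f"{routers} routers: {full_mesh_sessions(routers)} sessions, "
            f"{neighbors_per_router(routers)} neighbors per router"
        )
```

For the 5-router lab, that works out to 10 sessions and 4 neighbors per router, which matches the four "neighbor" statements in R1's config.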

To solve this problem, the designers of BGP created the concept of "route reflectors." Route reflectors do exactly what the name suggests: they "reflect" routes learned from one iBGP peer to other iBGP peers. As a result, it is no longer necessary to create a physical connection between every node in your network, nor is it necessary for every node in the network to be an iBGP peer with every other node in the network. This allows you to have a much simpler network topology:


R1 doesn't change at all -- we still have all four network interfaces up, and R1 is peering with every one of the other routers. However, R3 is the opposite extreme: the ONLY router to which R3 is connected is R1, and consequently, there is now only 1 peering statement in the BGP config. As you can see, we no longer have the full network topology stored in our routing tables:

R3#sho ip route
Gateway of last resort is not set

     10.0.0.0/8 is variably subnetted, 6 subnets, 2 masks
C       10.1.3.0/30 is directly connected, FastEthernet2/0
C       10.254.254.3/32 is directly connected, Loopback0
B       10.1.2.0/30 [200/0] via 10.1.3.1, 00:48:25
B       10.254.254.1/32 [200/0] via 10.1.3.1, 00:36:43
B       10.1.5.0/30 [200/0] via 10.1.3.1, 00:48:25
B       10.1.4.0/30 [200/0] via 10.1.3.1, 00:48:25
B    192.168.1.0/24 [200/0] via 10.1.3.1, 00:48:25
C    192.168.3.0/24 is directly connected, Loopback10
R3#

We can resolve this by configuring R1 to be the route reflector for the other four routers:

R1:
R1(config)#router bgp 65510
R1(config-router)# neighbor 10.1.2.2 route-reflector-client
R1(config-router)# neighbor 10.1.3.2 route-reflector-client
R1(config-router)# neighbor 10.1.4.1 route-reflector-client
R1(config-router)# neighbor 10.1.5.1 route-reflector-client
R1(config-router)# bgp cluster-id 1
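
The effect of those "route-reflector-client" statements can be summarized in a short Python sketch. This models the advertisement rules as I understand them from RFC 4456; the type and function names are hypothetical, and the sketch deliberately ignores loop prevention (ORIGINATOR_ID and CLUSTER_LIST) and best-path selection:

```python
from enum import Enum


class PeerType(Enum):
    EBGP = "ebgp"              # peer in a different AS
    CLIENT = "client"          # iBGP peer marked route-reflector-client
    NON_CLIENT = "non-client"  # ordinary iBGP peer


def should_advertise(learned_from: PeerType, send_to: PeerType) -> bool:
    """Would a route reflector pass a route learned from one peer type to another?"""
    # Anything involving an eBGP peer follows normal BGP rules: it is passed along.
    if learned_from is PeerType.EBGP or send_to is PeerType.EBGP:
        return True
    # A route learned from a client is reflected to clients and non-clients alike.
    if learned_from is PeerType.CLIENT:
        return True
    # A route learned from a non-client iBGP peer is reflected to clients only.
    return send_to is PeerType.CLIENT


if __name__ == "__main__":
    for src in PeerType:
        for dst in PeerType:
            print(f"{src.value:>10} -> {dst.value:<10} {should_advertise(src, dst)}")
```

Before the change, R2 through R5 were all non-clients of R1, so nothing learned from one of them was passed to the others. Once they are clients, a route that R1 learns from R3 is reflected to R2, R4 and R5.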

At this point, all of the other routers should have all the same routes that R1 has (only R3 shown):

R3#sho ip route
Gateway of last resort is not set

B    192.168.4.0/24 [200/0] via 10.1.4.1, 00:02:15
B    192.168.5.0/24 [200/0] via 10.1.5.1, 00:02:15
     10.0.0.0/8 is variably subnetted, 11 subnets, 2 masks
B       10.254.254.2/32 [200/0] via 10.1.2.2, 00:02:15
C       10.1.3.0/30 is directly connected, FastEthernet2/0
C       10.254.254.3/32 is directly connected, Loopback0
B       10.1.2.0/30 [200/0] via 10.1.3.1, 00:02:20
B       10.254.254.1/32 [200/0] via 10.1.3.1, 00:02:20
B       10.2.4.0/30 [200/0] via 10.1.2.2, 00:02:15
B       10.2.5.0/30 [200/0] via 10.1.2.2, 00:02:15
B       10.254.254.4/32 [200/0] via 10.1.4.1, 00:02:15
B       10.1.5.0/30 [200/0] via 10.1.3.1, 00:02:20
B       10.254.254.5/32 [200/0] via 10.1.5.1, 00:02:15
B       10.1.4.0/30 [200/0] via 10.1.3.1, 00:02:21
B    192.168.1.0/24 [200/0] via 10.1.3.1, 00:02:21
B    192.168.2.0/24 [200/0] via 10.1.2.2, 00:02:16
C    192.168.3.0/24 is directly connected, Loopback10
R3#

You can see that we have routes now...but do they work? Let's find out:

R3#ping 192.168.1.1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.1.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 12/20/28 ms
R3#ping 192.168.2.1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.2.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 20/26/40 ms
R3#ping 192.168.3.1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.3.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/1/4 ms
R3#ping 192.168.4.1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.4.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 20/25/40 ms
R3#ping 192.168.5.1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.5.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 36/36/40 ms
R3#

Yep, looks like it. Good job!

At this point, you may be thinking to yourself, "That's great...but if R1 goes off-line, most of your network goes off-line, too," and you'd be exactly right. Fortunately, it is possible to use more than one route reflector on your network. Let's make a few changes...


R1:
R1(config)#router bgp 65510
R1(config-router)#no network 10.1.4.0 mask 255.255.255.252
R1(config-router)#no network 10.1.5.0 mask 255.255.255.252
R1(config-router)#no neighbor 10.1.4.1 remote-as 65510
R1(config-router)#no neighbor 10.1.5.1 remote-as 65510
R1(config-router)#int fa2/0
R1(config-if)#shut
R1(config-if)#no ip addr
R1(config-if)#int fa2/1
R1(config-if)#shut
R1(config-if)#no ip addr

R2:
R2(config)#router bgp 65510
R2(config-router)#neighbor 10.1.2.1 route-reflector-client
R2(config-router)#neighbor 10.2.4.2 route-reflector-client
R2(config-router)#neighbor 10.2.5.1 route-reflector-client
R2(config-router)#bgp cluster-id 1

R4:
R4(config)#router bgp 65510
R4(config-router)#no neighbor 10.1.4.2 remote-as 65510
R4(config-router)#no network 10.1.4.0 mask 255.255.255.252
R4(config-router)#int fa1/1
R4(config-if)#shut
R4(config-if)#no ip addr

R5:
R5(config)#router bgp 65510
R5(config-router)#no neighbor 10.1.5.2 remote-as 65510
R5(config-router)#no network 10.1.5.0 mask 255.255.255.252
R5(config-router)#int fa1/0
R5(config-if)#shut
R5(config-if)#no ip addr

Keep in mind that it wasn't necessary to modify the configs on R1, R4 and R5 if we were only adding redundancy; I removed the links from R1 to R4 and R5 simply to show that BGP was still providing routes to these hosts via R2, but if you only wanted to add redundant routes to R2, then all you would have needed to do was add the "neighbor ... route-reflector-client" and "bgp cluster-id 1" statements to R2's BGP configuration. Anyway, let's make sure that we still have the routes we expect (only R5 shown):

R5#ping 192.168.1.1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.1.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 24/29/36 ms
R5#ping 192.168.2.1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.2.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 8/16/36 ms
R5#ping 192.168.3.1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.3.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 8/45/76 ms
R5#ping 192.168.4.1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.4.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 24/33/40 ms
R5#ping 192.168.5.1

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.5.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/2/4 ms
R5#

Looks good! With that, we'll wrap up this lesson, but in a later lesson, we'll discuss BGP confederations and peer groups.

Friday, December 16, 2016

Advanced Cisco Routing: A Full MPLS Network

A little over two years ago, I wrote a blog post about MPLS. In that lab, we built a very small, very simple MPLS network, where R1, R2 and R3 served as both our MPLS core and our "Provider Edge" routers. In the real world, you typically won't see this, as the requirements for a core and edge router are very different: the core is usually built on high-end chassis with lots of memory and high-speed interfaces, whereas the edge routers are usually much smaller, much less expensive devices. Today, we will revisit the MPLS lab, breaking out the core ("P" -- "Provider"), edge ("PE" -- "Provider Edge") and customer ("CE" -- "Customer Edge") routers, and showing what is different amongst all three categories of routers.

Let's start with the core. Since I am mocking this lab up in GNS3 on a laptop with only 4GB of RAM, the core is going to be very simple: just two routers (P1 and P2), with a single Gig-E connection between them:



As I mentioned in the previous MPLS lab, we must be running CEF in order to run MPLS, so before anything else, make sure you've enabled CEF on the two core routers. Then, we'll put IP addresses on Gig3/0 on both P1 and P2, and configure a Loopback IP address, as well:

P1(config)#ip cef
P1(config)#int lo0
P1(config-if)#ip addr 10.254.254.1 255.255.255.255
P1(config-if)#no shut
P1(config-if)#int gig3/0
P1(config-if)#ip addr 10.0.0.1 255.255.255.252
P1(config-if)#no shut
P1(config-if)#

From this, I'm sure you can figure out how to configure P2 (basically, find any IP address that ends in ".1" and replace it with ".2"), so I won't belabor the point with a full config for P2 here.

Next, we will need to enable MPLS on Gig3/0 on both routers, and turn up OSPF so that our core and provider edge routers can route to each other:

P1(config-if)#int gig3/0
P1(config-if)#mpls ip
P1(config-if)#router ospf 42
P1(config-router)#router-id 10.254.254.1
P1(config-router)#network 10.0.0.0 0.0.0.3 area 0.0.0.0
P1(config-router)#redist conn sub
P1(config-router)#exit
P1(config)#
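
If the "0.0.0.3" in the OSPF network statement looks odd, it is a wildcard mask: the bitwise inverse of the 255.255.255.252 subnet mask we put on Gig3/0. As a rough illustration, Python's standard ipaddress module can do the conversion for you. The helper name below is my own invention, not an IOS or library function:

```python
import ipaddress


def ospf_network_statement(cidr: str, area: str = "0.0.0.0") -> str:
    """Build an IOS-style OSPF network statement from a prefix in CIDR form."""
    net = ipaddress.ip_network(cidr)
    # hostmask is the inverted netmask, i.e. the wildcard mask OSPF expects.
    return f"network {net.network_address} {net.hostmask} area {area}"


if __name__ == "__main__":
    print(ospf_network_statement("10.0.0.0/30"))
```

Run against the core link above (10.0.0.0/30), this reproduces the same network statement that we typed on P1.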

Once you've made the equivalent changes on P2, you should see the following output on both routers:

*Dec 16 11:40:01.311: %OSPF-5-ADJCHG: Process 42, Nbr 10.254.254.2 on GigabitEthernet3/0 from LOADING to FULL, Loading Done
P1(config)#
*Dec 16 11:40:10.767: %LDP-5-NBRCHG: LDP Neighbor 10.254.254.2:0 (1) is UP
P1(config)#

With that, your P (core) routers are essentially done. You will need to turn up interfaces to connect to your PE (edge) routers -- don't forget the "mpls ip" command on those interfaces! -- and you'll need to establish routing between the P and PE routers, but that should be old hat by now.

Let's move on to the PE routers. We will connect PE1 to P1, and PE2 to P2, like so...:


...using the following configs:
PE1:
PE1(config)#ip cef
PE1(config)#router ospf 42
PE1(config-router)#router-id 10.254.254.3
PE1(config-router)#int lo0
PE1(config-if)#ip addr 10.254.254.3 255.255.255.255
PE1(config-if)#no shut
PE1(config-if)#ip ospf 42 area 0.0.0.0
PE1(config-if)#int gig2/0
PE1(config-if)#mpls ip
PE1(config-if)#ip addr 10.1.1.2 255.255.255.252
PE1(config-if)#no shut
PE1(config-if)#ip ospf 42 area 0.0.0.0

...and...:

PE2:
PE2(config)#ip cef
PE2(config)#router ospf 42
PE2(config-router)#router-id 10.254.254.4
PE2(config-router)#int lo0
PE2(config-if)#ip addr 10.254.254.4 255.255.255.255
PE2(config-if)#ip ospf 42 area 0.0.0.0
PE2(config-if)#no shut
PE2(config-if)#int gig2/0
PE2(config-if)#mpls ip
PE2(config-if)#ip addr 10.2.1.2 255.255.255.252
PE2(config-if)#ip ospf 42 area 0.0.0.0
PE2(config-if)#no shut

Once you've gotten this far, you should see output similar to this as the various adjacencies come up:

*Dec 16 11:58:31.063: %OSPF-5-ADJCHG: Process 42, Nbr 10.254.254.2 on GigabitEthernet2/0 from LOADING to FULL, Loading Done
*Dec 16 11:58:41.499: %LDP-5-NBRCHG: LDP Neighbor 10.254.254.2:0 (1) is UP

Let's check our routing tables and LDP database to make sure everything is working as expected:

PE1#sho ip route
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route

Gateway of last resort is not set

     10.0.0.0/8 is variably subnetted, 7 subnets, 2 masks
O E2    10.254.254.2/32 [110/20] via 10.1.1.1, 00:10:26, GigabitEthernet2/0
C       10.254.254.3/32 is directly connected, Loopback0
O       10.2.1.0/30 [110/3] via 10.1.1.1, 00:10:26, GigabitEthernet2/0
C       10.1.1.0/30 is directly connected, GigabitEthernet2/0
O       10.0.0.0/30 [110/2] via 10.1.1.1, 00:10:26, GigabitEthernet2/0
O E2    10.254.254.1/32 [110/20] via 10.1.1.1, 00:10:26, GigabitEthernet2/0
O       10.254.254.4/32 [110/4] via 10.1.1.1, 00:05:29, GigabitEthernet2/0
PE1#sho mpls ldp neigh
    Peer LDP Ident: 10.254.254.1:0; Local LDP Ident 10.254.254.3:0
    TCP connection: 10.254.254.1.646 - 10.254.254.3.53411
    State: Oper; Msgs sent/rcvd: 22/21; Downstream
    Up time: 00:10:33
    LDP discovery sources:
      GigabitEthernet2/0, Src IP addr: 10.1.1.1
        Addresses bound to peer LDP Ident:
          10.0.0.1        10.254.254.1    10.1.1.1        
PE1#sho mpls ldp bindings
  lib entry: 10.0.0.0/30, rev 8
    local binding:  label: 17
    remote binding: lsr: 10.254.254.1:0, label: imp-null
  lib entry: 10.1.1.0/30, rev 4
    local binding:  label: imp-null
    remote binding: lsr: 10.254.254.1:0, label: imp-null
  lib entry: 10.2.1.0/30, rev 6
    local binding:  label: 16
    remote binding: lsr: 10.254.254.1:0, label: 17
  lib entry: 10.254.254.1/32, rev 12
    local binding:  label: 19
    remote binding: lsr: 10.254.254.1:0, label: imp-null
  lib entry: 10.254.254.2/32, rev 10
    local binding:  label: 18
    remote binding: lsr: 10.254.254.1:0, label: 16
  lib entry: 10.254.254.3/32, rev 2
    local binding:  label: imp-null
    remote binding: lsr: 10.254.254.1:0, label: 18
  lib entry: 10.254.254.4/32, rev 14
    local binding:  label: 20
    remote binding: lsr: 10.254.254.1:0, label: 19
PE1#

With this, you now have a fully-functional "service provider" MPLS network. Your core is up, your PE routers are up, they are all sharing routes, and they have created LDP bindings between the routers. Sweet! All we need now are some customers to connect to our network so that the provider edge routers can start earning their keep ;)
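
To make those LDP bindings a little more concrete, here is a small Python sketch that answers the question, "which label does PE1 put on a packet it hands to P1?" The remote bindings are copied from the "sho mpls ldp bindings" output above; the function name is hypothetical, and the sketch assumes P1 is the next hop for every prefix (which matches PE1's routing table, where everything is via 10.1.1.1):

```python
from typing import Optional

# Remote bindings advertised by P1 (10.254.254.1:0), copied from PE1's output.
P1_REMOTE_BINDINGS = {
    "10.0.0.0/30": "imp-null",
    "10.1.1.0/30": "imp-null",
    "10.2.1.0/30": 17,
    "10.254.254.1/32": "imp-null",
    "10.254.254.2/32": 16,
    "10.254.254.3/32": 18,
    "10.254.254.4/32": 19,
}


def outgoing_label(prefix: str) -> Optional[int]:
    """Label PE1 would push toward P1 for this prefix, or None for no label."""
    label = P1_REMOTE_BINDINGS[prefix]
    # "imp-null" (implicit null) means P1 wants the packet delivered unlabeled,
    # because the prefix is connected to P1 itself.
    return None if label == "imp-null" else label


if __name__ == "__main__":
    for prefix in ("10.254.254.4/32", "10.254.254.1/32"):
        print(prefix, "->", outgoing_label(prefix))
```

So traffic headed for PE2's loopback (10.254.254.4/32) leaves PE1 with label 19, while traffic for P1's own loopback goes out with no label at all.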

This is where things start to get fun. Suppose the CIO for Perpetual Motion, Inc., an alternative energy provider, approaches you for connectivity across your network. You will turn up an interface for Perpetual Motion on both PE1 and PE2, and create a VRF to isolate Perpetual Motion's network instance from both your own network, as well as from any future customers' networks. Your network now looks like this...:



...with the following config changes on PE1 and PE2:
PE1:
PE1(config)#ip vrf PERPETUAL
PE1(config-vrf)#rd 65000:20
PE1(config-vrf)#route-target both 65000:20
PE1(config-vrf)#int fa0/0
PE1(config-if)#no ip addr
PE1(config-if)#no shut
PE1(config-if)#int fa0/0.20
PE1(config-subif)#encap dot1q 20
PE1(config-subif)#ip vrf forwarding PERPETUAL
PE1(config-subif)#ip addr 100.64.20.1 255.255.255.252
PE1(config-subif)#no shut

PE2:
PE2(config)#ip vrf PERPETUAL
PE2(config-vrf)#rd 65000:20
PE2(config-vrf)#route-target both 65000:20
PE2(config-vrf)#int fa0/0
PE2(config-if)#no ip addr
PE2(config-if)#no shut
PE2(config-if)#int fa0/0.20
PE2(config-subif)#encap dot1q 20
PE2(config-subif)#ip vrf forwarding PERPETUAL
PE2(config-subif)#ip addr 100.64.20.5 255.255.255.252
PE2(config-subif)#no shut
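
If you are wondering what the "rd 65000:20" statement buys us, one way to picture it is as a prefix that gets attached to every customer route so that two customers can use the same address space without colliding. Here is a minimal Python sketch of that idea. The "RD:prefix" string form is just a convenient notation, and the second customer (with RD 65000:30) is hypothetical; only PERPETUAL exists in this lab:

```python
def vpnv4_prefix(rd: str, ipv4_prefix: str) -> str:
    """Combine a route distinguisher and an IPv4 prefix into one unique key."""
    return f"{rd}:{ipv4_prefix}"


# Perpetual Motion's LAN behind CE1, tagged with the RD from the config above.
perpetual = vpnv4_prefix("65000:20", "192.168.1.0/24")

# A hypothetical second customer that happens to use the same LAN addressing.
other = vpnv4_prefix("65000:30", "192.168.1.0/24")

# Both routes can coexist in one table because the keys differ.
vpn_table = {perpetual: "PERPETUAL", other: "HYPOTHETICAL_CUSTOMER"}

if __name__ == "__main__":
    for key, customer in vpn_table.items():
        print(f"{key} belongs to {customer}")
```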

It isn't necessary to turn up a dot-1q encapsulated sub-interface here. We just as easily could turn up a new physical interface for every customer...until we ran out of physical interfaces. Since this is a lab in GNS3, it's not very likely that we would, in fact, run out of physical interfaces (unless you are far more ambitious than I, in which case, you do you!). However, this is pretty much how we provided service to customers at one of my former places of employment, given that SW1 and SW2 could be either actual Ethernet switches or some other kind of Metro-Ethernet network extender (Actelis, Accedian, AdTran, Cisco ME-3400, etc.) or combination thereof. Once the customer configures their routers, we should have point-to-point connectivity between CE1 and PE1, and between CE2 and PE2:

CE1:
CE1#sho run
interface Loopback0
ip address 192.168.254.1 255.255.255.255
ip ospf 1138 area 0.0.0.0
!
interface FastEthernet0/0
ip address 192.168.1.1 255.255.255.0
ip ospf 1138 area 0.0.0.0
!
interface FastEthernet1/1
ip address 100.64.20.2 255.255.255.252
ip ospf network point-to-point
ip ospf 1138 area 0.0.0.0
!
router ospf 1138
router-id 192.168.254.1
log-adjacency-changes
passive-interface FastEthernet0/0
passive-interface Loopback0
!
CE1#ping 100.64.20.1
Sending 5, 100-byte ICMP Echos to 100.64.20.1, timeout is 2 seconds:
.!!!!
Success rate is 80 percent (4/5), round-trip min/avg/max = 20/24/32 ms
CE1#

All that is left now is to set up routing between CE1 and CE2. On PE1 and PE2, we will set up an instance of OSPF to accept routes from CE1 and CE2, respectively:

PE1(config-subif)#router ospf 20 vrf PERPETUAL
PE1(config-router)#router-id 100.64.20.1
PE1(config-router)#network 100.64.20.0 0.0.0.3 area 0.0.0.0
PE1(config-router)#
*Dec 16 14:09:43.579: %OSPF-5-ADJCHG: Process 20, Nbr 192.168.254.1 on FastEthernet0/0.20 from LOADING to FULL, Loading Done
PE1(config-router)#

CE1(config-if)#router ospf 1138
CE1(config-router)#router-id 100.64.20.2
CE1(config-router)#network 100.64.20.0 0.0.0.3 area 0.0.0.0
CE1(config-router)#int lo0
CE1(config-if)#ip ospf 1138 area 0.0.0.0
CE1(config-if)#int fa0/0
CE1(config-if)#ip ospf 1138 area 0.0.0.0

Now, does it work?

PE1#sho ip route vrf PERPETUAL
...
Gateway of last resort is not set

     100.0.0.0/30 is subnetted, 1 subnets
C       100.64.20.0 is directly connected, FastEthernet0/0.20
     192.168.254.0/32 is subnetted, 1 subnets
O       192.168.254.1 [110/2] via 100.64.20.2, 00:01:40, FastEthernet0/0.20
O    192.168.1.0/24 [110/2] via 100.64.20.2, 00:01:30, FastEthernet0/0.20
PE1#

Looks good! We've got the loopback and Fa0/0 IP addresses in our routing table, so as you can see, all we need to do to set up a customer routing instance on our PE routers is to append "vrf <VRF NAME>" to the end of the "router ospf..." statements.

The last step is to set up a multiprotocol BGP process between PE1 and PE2 so that they can share the customer routes between them, then configure redistribution to the OSPF process in the customer VRF. If that sounds complicated, don't worry; it's really not terribly difficult:

PE1:
PE1(config)#router bgp 65000
PE1(config-router)#no synch
PE1(config-router)#neighbor 10.254.254.4 remote-as 65000
PE1(config-router)#neighbor 10.254.254.4 update-source Loopback0
PE1(config-router)#address-family vpnv4
PE1(config-router-af)#neighbor 10.254.254.4 activate
PE1(config-router-af)#neighbor 10.254.254.4 send-community extended
PE1(config-router-af)#exit
PE1(config-router)#address-family ipv4 vrf PERPETUAL
PE1(config-router-af)#redist ospf 20 vrf PERPETUAL
PE1(config-router-af)#no synch
PE1(config-router-af)#exit
PE1(config-router)#exit
PE1(config)#router ospf 20 vrf PERPETUAL
PE1(config-router)#redist bgp 65000 subnets

PE2:
PE2(config)#router bgp 65000
PE2(config-router)#no sync
PE2(config-router)#neighbor 10.254.254.3 remote-as 65000
PE2(config-router)#neighbor 10.254.254.3 update-source Loopback0
PE2(config-router)#address-family vpnv4
PE2(config-router-af)#neighbor 10.254.254.3 activate
PE2(config-router-af)#neighbor 10.254.254.3 send-community extended
PE2(config-router-af)#exit
PE2(config-router)#address-family ipv4 vrf PERPETUAL
PE2(config-router-af)#redist ospf 20 vrf PERPETUAL
PE2(config-router-af)#no sync
PE2(config-router-af)#exit
PE2(config-router)#exit
PE2(config)#router ospf 20 vrf PERPETUAL
PE2(config-router)#redist bgp 65000 sub
PE2(config-router)#exit


Let's check our CE routers and see if they are propagating routes correctly:

CE1#sho ip route
Gateway of last resort is not set

     100.0.0.0/30 is subnetted, 2 subnets
C       100.64.20.0 is directly connected, FastEthernet1/1
O IA    100.64.20.4 [110/2] via 100.64.20.1, 00:02:43, FastEthernet1/1
     192.168.254.0/32 is subnetted, 2 subnets
O IA    192.168.254.2 [110/3] via 100.64.20.1, 00:02:43, FastEthernet1/1
C       192.168.254.1 is directly connected, Loopback0
C    192.168.1.0/24 is directly connected, FastEthernet0/0
O IA 192.168.2.0/24 [110/3] via 100.64.20.1, 00:02:43, FastEthernet1/1
CE1#

CE2:
CE2#sho ip route
Gateway of last resort is not set

     100.0.0.0/30 is subnetted, 2 subnets
O IA    100.64.20.0 [110/2] via 100.64.20.5, 00:02:27, FastEthernet1/1
C       100.64.20.4 is directly connected, FastEthernet1/1
     192.168.254.0/32 is subnetted, 2 subnets
C       192.168.254.2 is directly connected, Loopback0
O IA    192.168.254.1 [110/3] via 100.64.20.5, 00:02:27, FastEthernet1/1
O IA 192.168.1.0/24 [110/3] via 100.64.20.5, 00:02:27, FastEthernet1/1
C    192.168.2.0/24 is directly connected, FastEthernet0/0
CE2#

Yep, on CE1, I can see the Loopback and Fa0/0 IP addresses from CE2, and vice versa. It looks like MPLS is working properly, and our routing processes are sharing routes in the proper VRFs.

By configuring the P, then PE and CE routers one at a time, it should be fairly obvious how each class of router differs from the others (at least, from a configuration standpoint). The CE routers are the simplest of all, in that they are completely agnostic about the underlying architecture of the service provider network. All they need to do is set up routing, either with a dynamic routing protocol like OSPF or via static routes, with the provider; no special configuration is required on the CE routers at all. Next, in order of complexity, are the P routers. The only additional configuration they require is the "mpls ip" statement in any interface that will be part of the MPLS core. Most of the magic happens in the PE routers, which is reflected in the relative complexity of the PE routers' configs. This is where we create the VRFs, set the route distinguisher and route targets, configure the VRF-aware routing protocols, and set up BGP to redistribute the routes across the core.

Advanced Cisco Routing: DMVPN -- Point-to-Multipoint VPN Tunneling

A few years ago, I used to work for a service provider that operated in rural Alaska. By lower-48 standards, our network wasn’t terribly large — or at least, the logical topology wasn’t terribly large; the physical topology covered a rather large geographical region. Our major hub was a huge, bustling metropolis of about 5,000 (!) people.

This site also was where we located our hub router for the network. We had an extension site in Anchorage (naturally, since that was where most of our employees lived and worked), the hub site, a PoP at the hub site hanging off the hub router, and then multiple PoPs scattered across our service area, also linked off of the hub router. Because our own management network was built across our own service-provider network, we set up VPN tunnels from the hub router to each and every one of our sub-tended sites to provide secure management access to our network. Conceptually, it was a very simple model (and honestly, it might have been the only model our equipment would support at the time), but if you think that configuring a separate VPN tunnel for each site could be a bit of a chore, you are exactly right.

As I’m sure you’ve guessed by now, there is A Better Way to achieve these goals, a way that makes configuring and managing multiple sites sub-tended off of a single hub much less time consuming. Allow me to introduce you to DMVPNs (Dynamic Multipoint VPNs). As always, we’ll start with our network diagram:



R1 through R4 will be our management network, with R1 being the hub and R2 — R4 being the spokes. R5, R6 and R7 are the service provider network. In the real network that I managed, we had static, default routes on R1 through R4, and ran OSPF on our provider network. In this lab, we will run OSPF internally on both networks, and peer the management and provider networks with BGP, since that is a more common scenario for most people (being both provider and customer is fairly unusual). Also, running OSPF over a DMVPN topology introduces a few wrinkles that are worth covering, but I’m getting ahead of myself ;)

For addressing, I’ll be using 100.64.x.x addresses in place of public IP address ranges, and 192.168.x.0/24 for the inside interfaces on my management network. I’ll use 172.16.x.x IP space for the tunnel addressing. On the "Internet" routers, I’ll use 100.64.254.x/32 for the Loopback IP addresses, while 10.254.254.x/32 will be the Loopback IP addresses on my management routers.

Still with me? Good! Let’s start by setting up basic connectivity to each router, starting with the Internet routers (since there is nothing new on them):

R5:
interface Loopback0
ip address 100.64.254.5 255.255.255.255
no shut
!
interface FastEthernet0/0
ip address 100.64.0.1 255.255.255.252
no shut
!
interface FastEthernet0/1
ip address 100.64.0.5 255.255.255.252
no shut
!
interface FastEthernet1/0
ip address 100.64.0.9 255.255.255.252
no shut
!
router ospf 1138
router-id 100.64.254.5
log-adjacency-changes
passive-interface Loopback0
network 100.64.0.0 0.0.0.3 area 0.0.0.0
network 100.64.0.4 0.0.0.3 area 0.0.0.0
network 100.64.0.8 0.0.0.3 area 0.0.0.0
network 100.64.254.5 0.0.0.0 area 0.0.0.0
!

R6 and R7 are similar, and since there is nothing new here, I’ll skip those configs.

We’ll go ahead and configure the IP addressing on the FastEthernet and Loopback interfaces of R1, R2, R3 and R4 next. Again, nothing new, and nothing exciting, so I won’t belabor the config here, but make sure R1, R2, R3 and R4 can ping their respective gateways before proceeding. Once point-to-point connectivity between the management network and the service provider network is working, we’ll set up BGP peering between R1 and R5, R2 and R6, R3 and R5, and finally, R4 and R7:

R1:
router bgp 65511
bgp router-id 100.64.0.2
network 100.64.0.0 mask 255.255.255.252
neighbor 100.64.0.1 remote-as 65512
neighbor 100.64.0.1 activate

R5:
router bgp 65512
bgp router-id 100.64.0.1
network 100.64.0.0 mask 255.255.255.252
neighbor 100.64.0.2 remote-as 65511
neighbor 100.64.0.2 activate
redist ospf 1138 metric 120
!
router ospf 1138
redist bgp 65512 sub metric 120
!

As you can see, the configurations are almost identical, aside from swapping the AS numbers in the "router bgp..." and "neighbor..." statements, and swapping the IP addresses in the "bgp router-id..." and "neighbor..." statements. Also, on R5, we are redistributing the routes learned via BGP into OSPF. We are also redistributing OSPF routes into the BGP process. R6 and R7 will be configured similarly to R5, and R2, R3 and R4 will be configured similarly to R1. Again, nothing new so far.

But now, things will start to get interesting. Let’s set up the GRE tunnel on R1:

interface Tunnel0
ip address 172.16.0.1 255.255.255.0
ip nhrp map multicast dynamic
ip nhrp network-id 1
tunnel source 100.64.0.2
tunnel mode gre multipoint
no shut
!

Just like a normal GRE tunnel, we start with "interface Tunnel <blah>", and assign an IP address to the tunnel interface. Unlike a normal tunnel interface, we are assigning a /24. You can use whatever size subnet you want, but since it is a multipoint tunnel, it should probably be larger than a /30. The "tunnel source..." statement should look familiar also (if not, see the GRE Tunnel lab for a refresher).

However, there are a few differences between a DMVPN tunnel config and a standard, point-to-point tunnel config. One of the first things you’ll likely notice is that we have not specified any of the opposite endpoints. Instead, we used the command "tunnel mode gre multipoint" to explicitly state that we are creating a point-to-multipoint (hub-and-spoke) network. That’s the "dynamic" portion of "Dynamic Multipoint VPN." Basically, the hub accepts tunnel requests from multiple spoke routers, and automatically establishes the tunnels on demand.

You'll also notice that, even though there are three spoke routers, the hub only has one tunnel interface. That's the "Multipoint" portion of the acronym ;) This raises a very interesting question. In a point-to-point circuit, it is trivial to determine the IP address of the next hop (if you are on a /30 or /31 network, there are only two usable IP addresses, and you are using one of them, right?). However, in a multipoint network, your tunnel interfaces are in a larger subnet. In our example, we are using a /24, meaning the other end of the tunnel could be any one of 253 possible IP addresses! How does the hub router know which IP address corresponds to which tunnel?

If you look at the next two lines of the tunnel config, you’ll see the two "ip nhrp..." statements. NHRP ("Next Hop Resolution Protocol," see also CCIE or Null! for a good discussion on the topic) is the protocol that we use to determine the IP address of the other side of the multipoint tunnel. In much the same way that ARP maps IP addresses to Ethernet addresses, NHRP allows our routers to dynamically map tunnel IP addresses to the outside IP addresses of the routers at the other end of the multipoint tunnel. The hub learns these mappings as the spokes register with it, and with the "ip nhrp map multicast dynamic" statement, we are telling NHRP to automatically add each registered spoke to the list of destinations for multicast traffic (such as routing protocol hellos).

However, you might have multiple tunnels on any given router, so by specifying different network IDs with the "ip nhrp network-id ..." statement, you can create multiple hub-and-spoke networks without them conflicting with one another. That’s it for the hub router. That wasn't too bad, was it?
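
If the ARP comparison helps, here is a toy Python model of the lookup table that NHRP builds on the hub. This is only a sketch of the idea, not how IOS implements it; the class and method names are hypothetical. The one mapping shown uses R2's tunnel address (172.16.0.2) and tunnel source (100.64.0.14) from the spoke config in this lab:

```python
import ipaddress
from typing import Optional


class NhrpCache:
    """Toy lookup table: tunnel (inside) address -> outside address."""

    def __init__(self) -> None:
        self._mappings = {}

    def register(self, tunnel_ip: str, outside_ip: str) -> None:
        """Record a spoke's registration, as the hub does when a spoke checks in."""
        key = ipaddress.ip_address(tunnel_ip)
        self._mappings[key] = ipaddress.ip_address(outside_ip)

    def resolve(self, tunnel_ip: str) -> Optional[ipaddress.IPv4Address]:
        """Return the outside address for a tunnel address, or None if unknown."""
        return self._mappings.get(ipaddress.ip_address(tunnel_ip))


# How many other tunnel addresses could exist in the /24 besides our own?
tunnel_net = ipaddress.ip_network("172.16.0.0/24")
possible_peers = tunnel_net.num_addresses - 2 - 1  # minus network, broadcast, self

hub = NhrpCache()
hub.register("172.16.0.2", "100.64.0.14")  # R2 registering with the hub

if __name__ == "__main__":
    print("possible peers:", possible_peers)
    print("172.16.0.2 ->", hub.resolve("172.16.0.2"))
    print("172.16.0.3 ->", hub.resolve("172.16.0.3"))  # not registered yet
```

Until a spoke registers, the hub simply has no entry for it, which is why the hub config does not need to list any of the spokes ahead of time.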

We’ll use R2 as an example of the spoke router configuration; R3 and R4 will be very similar:

R2:
interface Tunnel0
ip address 172.16.0.2 255.255.255.0
ip nhrp map 172.16.0.1 100.64.0.2
ip nhrp map multicast 100.64.0.2
ip nhrp network-id 1
ip nhrp nhs 172.16.0.1
tunnel source 100.64.0.14
tunnel mode gre multipoint
no shut
!

Like the hub router, the spoke router contains the "ip address...," "tunnel source..." and "tunnel mode gre multipoint" commands. It also contains a handful of "ip nhrp..." statements, but they are slightly more complex. First, the spoke router must know how to reach the hub router in order to send the tunnel request, so we start with "ip nhrp map 172.16.0.1 100.64.0.2," which tells the tunnel to create a connection to the IP address of the hub router’s outside interface (the tunnel source on the hub router). In other words, to reach 172.16.0.1 (the tunnel IP address on R1), use 100.64.0.2 (Fa1/0 on R1). Next, "ip nhrp map multicast 100.64.0.2" sets 100.64.0.2 (Fa1/0 on R1) as the destination for multicast or broadcast packets sent across the non-broadcast, multi-access, or NBMA, (i.e., the DMVPN) network. If multicast or broadcast packets are sent across the NBMA network, R1 is responsible for forwarding them to other hosts participating in the network, so we are telling the tunnel interface to forward those packets to R1. The last new command on the spoke router is the "ip nhrp nhs 172.16.0.1" statement. With this line, we are telling the spoke router that 172.16.0.1 (R1) is its "next-hop server," the router with which it registers its own address mapping and which it asks to resolve the addresses of other hosts on the NBMA network.

Substitute the appropriate values for the IP address and tunnel source on R3 and R4, and you should have working tunnels between R1 and each of the spoke routers. To verify this, use the "sho dmvpn" command:

R2#sho dmvpn
Legend: Attrb --> S - Static, D - Dynamic, I - Incomplete
    N - NATed, L - Local, X - No Socket
    # Ent --> Number of NHRP entries with same NBMA peer
    NHS Status: E --> Expecting Replies, R --> Responding
    UpDn Time --> Up or Down Time for a Tunnel
==========================================================================

Interface: Tunnel0, IPv4 NHRP Details

IPv4 NHS: 172.16.0.1 RE
Type:Spoke, Total NBMA Peers (v4/v6): 1

# Ent  Peer NBMA Addr Peer Tunnel Add State  UpDn Tm Attrb    Target Network
----- --------------- --------------- ----- -------- ----- -----------------
    1     100.64.0.2      172.16.0.1    UP 01:21:34    S      172.16.0.1/32


R2#

As you can see from the snippet of output above, R2 has now established a tunnel connection to R1 (the tunnel state is "UP" and, next to "IPv4 NHS," we have the IP address of int tunnel0 on R1, followed by the flags "RE," indicating that the next-hop server is responding and that R2 is expecting replies from it). After duplicating the tunnel config on R3 and R4, you should see similar output on those routers, although each router will only show the connection to R1. This is a point-to-multipoint network, meaning that R2 cannot talk directly to R3 without going through R1 (sort of...actually R1 can broker connections between the spokes, but honestly, I’m not comfortable enough with the topic to go there yet). Assuming that you have copied the modified version of R2’s tunnel config to R3 and R4, you should have a completed point-to-multipoint VPN network now (w00t!). However, if you try to ping from the inside interface of R2, R3 or R4 to the inside interface of R1, you will most likely not be thrilled with the result:

R4#ping 192.168.1.1 source 192.168.4.1
<...snip...>
.....
Success rate is 0 percent (0/5)
R4#

Any ideas why? Of course! We haven’t set up any routing between the inside networks. When we configured BGP peering between the management network and the service provider network, we only advertised the outside interfaces of R1, R2, R3 and R4, since our service provider should not be aware of the inner workings of our network (unless we are using MPLS). In order to actually send traffic across the VPN tunnels, we need to enable a routing protocol over the tunnels. Easy enough, right? It should look something like this...:

R1:
router ospf 42
router-id 10.254.254.1
passive-interface default
no passive-interface Tunnel0
network 10.254.254.1 0.0.0.0 area 0.0.0.0
network 172.16.0.0 0.0.0.255 area 0.0.0.0
network 192.168.1.0 0.0.0.255 area 0.0.0.0
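
On a spoke, only the router-id and the advertised networks change. As a sketch, R2's version might look like the following. The router-id matches the neighbor ID that shows up for R2 in the OSPF log messages, but the 192.168.2.0/24 inside network is an assumption on my part (I am following the 192.168.x.0 pattern of R1 and R4), so adjust it to match your own addressing.

R2 (sketch):
router ospf 42
router-id 10.254.254.2
passive-interface default
no passive-interface Tunnel0
network 10.254.254.2 0.0.0.0 area 0.0.0.0
network 172.16.0.0 0.0.0.255 area 0.0.0.0
! assumed: R2's inside network
network 192.168.2.0 0.0.0.255 area 0.0.0.0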

Again, after making the appropriate substitutions for the router-id and the advertised networks, we’ll make the same changes on the spoke routers, and...what is going on here?

R1(config)#
*Dec 13 15:44:15.947: %OSPF-5-ADJCHG: Process 42, Nbr 10.254.254.3 on Tunnel0 from LOADING to FULL, Loading Done
*Dec 13 15:44:16.391: %OSPF-5-ADJCHG: Process 42, Nbr 10.254.254.3 on Tunnel0 from FULL to DOWN, Neighbor Down: Adjacency forced to reset
R1(config)#
*Dec 13 15:44:20.483: %OSPF-5-ADJCHG: Process 42, Nbr 10.254.254.2 on Tunnel0 from INIT to DOWN, Neighbor Down: Adjacency forced to reset
*Dec 13 15:44:20.535: %OSPF-5-ADJCHG: Process 42, Nbr 10.254.254.3 on Tunnel0 from EXSTART to DOWN, Neighbor Down: Adjacency forced to reset
*Dec 13 15:44:20.699: %OSPF-5-ADJCHG: Process 42, Nbr 10.254.254.4 on Tunnel0 from LOADING to FULL, Loading Done
R1(config)#
*Dec 13 15:44:22.735: %OSPF-5-ADJCHG: Process 42, Nbr 10.254.254.4 on Tunnel0 from FULL to DOWN, Neighbor Down: Adjacency forced to reset
R1(config)#
*Dec 13 15:44:25.307: %OSPF-5-ADJCHG: Process 42, Nbr 10.254.254.3 on Tunnel0 from LOADING to FULL, Loading Done
*Dec 13 15:44:25.555: %OSPF-5-ADJCHG: Process 42, Nbr 10.254.254.3 on Tunnel0 from FULL to DOWN, Neighbor Down: Adjacency forced to reset
R1(config)#

Why is OSPF flapping?!?!

This is where I start to get in a little bit over my head. If I understand correctly, the issue is that OSPF assigns a network type to each interface it runs on. In this case, we have configured an NBMA network via multipoint GRE tunnels, but OSPF treats tunnel interfaces as point-to-point links by default. While we had OSPF only running between R1 and R2, this was fine, but as soon as OSPF sees more than one neighbor across a single "point-to-point" link, it gets confused (understandably so!) and keeps resetting the adjacencies. To resolve this problem, at least under certain circumstances, make the following change to R1, R2, R3 and R4:

int tunnel 0
ip ospf network point-to-multipoint

Now that OSPF understands that int Tunnel0 is actually part of a multipoint network, it will allow all three spoke routers to participate in OSPF across the tunnel. Run "sho ip route" and "sho ip ospf neighbor" to verify that everything is working as expected (it should be), and you should be good to go!
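
As a rough checklist, the verification might go something like this. I have left the output out, since it will depend on your lab. The first command should report the tunnel's OSPF network type as POINT_TO_MULTIPOINT, and the last one is simply the same ping from R4 that failed earlier.

R1#sho ip ospf interface Tunnel0
R1#sho ip ospf neighbor
R1#sho ip route ospf
R4#ping 192.168.1.1 source 192.168.4.1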