The underlay topology in this environment[1] made it safe for me to anycast the DMVPN hubs, so that's what I did. This made the "connect to the nearest hub" problem easy to solve, but introduced some new complexity.
- There are many hub sites.
- Spokes will be network-near exactly one hub site.
- Latency between hub sites is high.
- Bandwidth between hub sites is low.
- Spoke routers don't know where they are in the network.
- Spoke routers must connect only to the nearest hub.
Hub Anycast Interface
Each DMVPN hub router has a loopback interface with address 192.0.2.0/32 assigned to the front-door VRF. It's configured something like this:
interface loopback 192020
 description DMVPN hub anycast target
 ip vrf forwarding LTE_TRANSIT
 ip address 192.0.2.0 255.255.255.255
The 192.0.2.0/32 prefix was redistributed into the IP backbone. If this device were to fail, the IGP would select the next-nearest instance of 192.0.2.0.
Spoke Configuration
Spokes look pretty much exactly like the ones in the DMVPN via DHCP post. They're all looking for a hub at 192.0.2.0. The only interesting bits have to do with BGP neighbors (also anycast). I'll get to those later.
Hub DMVPN Interface
Each hub ran a separate IP subnet on its DMVPN interface. This means that I needed several of the static interface routes for DHCP as described at the end of the previous post. One of them made DHCP work, while the rest were superfluous, but at least they were correct.
The hub's DMVPN interface sources its tunnel from the anycast loopback and uses LTE_TRANSIT as the VRF for tunneled traffic. The hub in this example is 203.0.113.1/27:
interface tunnel 192020
 ip address 203.0.113.1 255.255.255.224
 tunnel source loopback 192020
 tunnel vrf LTE_TRANSIT
Hub BGP Interface
The hub/spoke routing protocol in this environment is eBGP for various reasons. Ordinarily I'd have the spokes talk to the hub using the hub's address in the DMVPN subnet. That's not possible here because the spoke doesn't know the hub's address, because it doesn't know which hub it's using. Anycast to the rescue again!
The hubs each have a loopback at 198.51.100.0/32 in the global routing table (GRT is used for DMVPN - no IVRF here). Spokes are configured to look for a BGP neighbor at this address. There's a problem here: the spoke's BGP neighbor isn't directly connected, and the spoke doesn't yet know how to reach it.
interface loopback 198511000
 description anycast target for spoke BGP sessions
 ip address 198.51.100.0 255.255.255.255
!
router bgp 65000
 bgp listen range 203.0.113.0/27 peer-group PG_SPOKES
DHCP Service
Each hub has a DHCP pool configured to represent the local DMVPN interface. For example, this router is 203.0.113.1/27, so it uses the following pool configuration:
ip dhcp pool DMVPN_POOL
 network 203.0.113.0 255.255.255.224
 option 121 hex 20c6.3364.00cb.0071.01
Option 121 specifies a classless static route to be used by DHCP clients. This is the mechanism by which the spokes find their BGP neighbor at 198.51.100.0. Breaking down the hex, we have:
- 0x20 = 32 -- This is the prefix length of the route. A host route.
- 0xc6336400 = 198.51.100.0 -- This is the target prefix of the route, as bounded by the length above. Note that this field is not always 4 bytes long. The RFC authors did this cute/maddening thing where the length of this field depends on the value of the prefix length byte. Ugh.
- 0xcb007101 = 203.0.113.1 -- The next-hop IP for the route. Hey, that's this hub's tunnel address!
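Assembling that hex string by hand is error-prone, so here is a minimal Python sketch of the encoding described in the breakdown above. The function names `encode_option_121` and `ios_hex` are hypothetical helpers for illustration, not part of IOS or any library, and the dotted four-digit grouping simply mirrors how the pool configuration above writes the value.

```python
import ipaddress


def encode_option_121(routes):
    """Encode (prefix, next_hop) pairs as classless static route bytes.

    Each route is: one prefix-length byte, then only the significant
    octets of the destination, then the 4-byte next hop.
    """
    out = bytearray()
    for prefix, next_hop in routes:
        net = ipaddress.ip_network(prefix)
        # Number of destination octets depends on the prefix length:
        # /0 -> 0 octets, /1../8 -> 1, ... /25../32 -> 4.
        significant = (net.prefixlen + 7) // 8
        out.append(net.prefixlen)
        out += net.network_address.packed[:significant]
        out += ipaddress.ip_address(next_hop).packed
    return bytes(out)


def ios_hex(data):
    """Format bytes as dotted groups of four hex digits (assumed IOS style)."""
    digits = data.hex()
    return ".".join(digits[i:i + 4] for i in range(0, len(digits), 4))


# The route handed out by the example hub: 198.51.100.0/32 via 203.0.113.1
print(ios_hex(encode_option_121([("198.51.100.0/32", "203.0.113.1")])))
```

For the example hub this reproduces the value in the pool configuration. A shorter prefix such as a /24 would carry only three destination octets, which is the variable-length behavior noted above.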
Now, no matter which hub a spoke attaches to, the spoke will find an off-net BGP neighbor at 198.51.100.0, and will have a static route (assigned by DHCP) to ensure reachability to that neighbor.
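Because every hub needs its own pool with its own next hop, it can be worth checking each option 121 string against the hub's DMVPN subnet before deploying it. The following is a rough Python sketch of such a check. `decode_option_121` is a hypothetical helper name, and the input format (dotted groups of hex digits) is assumed to match the pool configuration shown earlier.

```python
import ipaddress


def decode_option_121(ios_hex):
    """Decode a dotted hex string into (destination network, next hop) pairs."""
    data = bytes.fromhex(ios_hex.replace(".", ""))
    routes = []
    i = 0
    while i < len(data):
        prefix_len = data[i]
        n = (prefix_len + 7) // 8  # significant destination octets
        # Pad the truncated destination back out to four octets.
        dest = ipaddress.ip_address(data[i + 1:i + 1 + n] + bytes(4 - n))
        hop = ipaddress.ip_address(data[i + 1 + n:i + 5 + n])
        routes.append((ipaddress.ip_network(f"{dest}/{prefix_len}"), hop))
        i += 5 + n
    return routes


# Values from the example hub in this post.
hub_subnet = ipaddress.ip_network("203.0.113.0/27")
for target, next_hop in decode_option_121("20c6.3364.00cb.0071.01"):
    # The next hop must be on the hub's DMVPN subnet or the spoke can't use it.
    assert next_hop in hub_subnet
    # The BGP anycast target is expected to be off-net from the tunnel subnet.
    assert not target.overlaps(hub_subnet)
    print(f"{target} via {next_hop}")
```

The two assertions encode the properties the design relies on: the route's next hop is the hub's own tunnel address, and the anycast BGP target sits outside the tunnel subnet.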
Spoke BGP
The spoke routers use the 'disable-connected-check' feature to talk to the BGP anycast interface on the hub while still using TTL=1:
router bgp 65001
 neighbor 198.51.100.0 remote-as 65000
 neighbor 198.51.100.0 disable-connected-check
 neighbor 198.51.100.0 update-source Tunnel0
Remaining challenge
The spokes are behind LTE-enabled NAT routers because there's no Cisco hardware available with the correct LTE bands.
Ordinarily, the LTE-assigned IP address won't change with mobility, but it does change if the EPC which owns the client's address is shut down. In those cases, I found the spokes re-established connections with the now-nearest DMVPN hub right away, but the spoke's tunnel interface held onto the old DHCP address.
The if-state nhrp command might have taken care of this[2], but I've had some bad experiences and don't entirely trust it. I used EEM instead:
event manager applet BGP_NEIGHBOR_198.51.100.0_DOWN
 event snmp oid 1.3.6.1.2.1.15.3.1.2.198.51.100.0 get-type exact entry-op ne entry-val "6" exit-op eq exit-val "6" poll-interval 10
 action 1.0 syslog priority errors msg "BGP with 198.51.100.0 down, bouncing Tunnel 0..."
 action 2.0 cli command "enable"
 action 2.1 cli command "configure terminal"
 action 3.0 cli command "interface tunnel 0"
 action 3.1 cli command "shutdown"
 action 4.0 cli command "do clear crypto isakmp sa"
 action 4.1 cli command "do clear crypto ipsec"
 action 5.0 cli command "interface tunnel 0"
 action 5.1 cli command "no shutdown"
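The entry/exit pair on the SNMP event is what keeps the applet from bouncing the tunnel on every 10-second poll while BGP is still down. The Python sketch below models that behavior as I understand EEM's semantics: the applet fires when the polled bgpPeerState is anything other than 6 (established), then stays quiet until the state returns to 6. The function name `applet_firings` and the sample poll sequence are made up for illustration.

```python
ESTABLISHED = 6  # bgpPeerState value for "established" in the BGP4 MIB


def applet_firings(polled_states):
    """Return the poll indexes at which the applet would run its actions."""
    armed = True
    fired = []
    for i, state in enumerate(polled_states):
        if armed and state != ESTABLISHED:
            # entry-op ne entry-val "6": the session left established
            fired.append(i)
            armed = False
        elif not armed and state == ESTABLISHED:
            # exit-op eq exit-val "6": the session recovered, so re-arm
            armed = True
    return fired


# Two separate outages: one firing each, however long the outage lasts.
print(applet_firings([6, 6, 1, 3, 3, 6, 6, 2]))
```

With that sample sequence the applet runs once at the start of each outage, not on every poll during it.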
[1] Spoke routers were wireless devices on a private band LTE network. Hub routers were physically located with LTE equipment, very close to where the LTE Evolved Packet Core hands off to the IP network. There's no opportunity for the "nearest" DMVPN hub to change from one site to another without the spoke losing its IP address on the LTE network.
[2] Actually, I'm completely unsure what if-state nhrp will do with a dynamically assigned tunnel address. DHCP can't happen until the interface comes up, and the interface can't come up without NHRP registration, which requires an address...