The Mystery of the Anycast VTEP

Many networking vendors have introduced Multi-Chassis Link Aggregation (MLAG) combined with an Anycast Virtual Tunnel Endpoint (VTEP) configured on both MLAG peers as a dual-homing solution for Virtual eXtensible Local Area Network (VXLAN) overlay networks. Some vendors provide a simple implementation that requires little configuration and no explicit connection between MLAG and the Anycast VTEP.

How can this work, and why is the combination of both features needed for VXLAN dual-homing?

(This article assumes some basic familiarity with anycast, VXLAN, and MLAG functionality.)

Frames Leaving VXLAN

VXLAN-encapsulated packets sent to the Anycast VTEP reach only one of the MLAG peers, not both. This follows from the use of IP anycast. If the decapsulated frame is broadcast, unknown unicast, or multicast (often abbreviated as BUM) traffic, it is flooded normally, but not to remote VTEPs.

Now we know why anycast addressing of VTEPs is required.

Frames Entering VXLAN

For the Anycast VTEP to work as expected, frames ingressing an MLAG peer switch via an MLAG port or an orphan port need to be treated differently from frames ingressing via the peer link. Otherwise, two such MLAG switch pairs whose Anycast VTEPs point at each other would create a Layer 2 loop.

Now we know why MLAG functionality is required.

MLAG Ports and Orphan Ports

Frames that ingress an MLAG peer switch via a local MLAG port or an orphan port and are destined to a remote VTEP are encapsulated on that MLAG peer. BUM traffic is flooded to all remote VTEPs and, via the peer link, to the MLAG peer.

MLAG Peer Link

Frames ingressing an MLAG peer switch via peer link must not be sent to any remote VTEP, because the other MLAG peer has already done that (unless the MLAG peer's VTEP has failed). This is the secret sauce needed for dual-homing to a Layer 2 VPN (such as VXLAN) via MLAG.
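The ingress-dependent treatment described in this section and the previous ones can be summarized as a small decision function. This is only a sketch of the rule, not any vendor's implementation; the names Ingress, may_send_to_remote_vteps, and peer_vtep_up are made up for illustration.

```python
from enum import Enum, auto


class Ingress(Enum):
    """Where a frame entered the MLAG peer switch (hypothetical names)."""
    MLAG_PORT = auto()    # member port of an MLAG spanning both peers
    ORPHAN_PORT = auto()  # port attached to this MLAG peer only
    PEER_LINK = auto()    # link between the two MLAG peers
    VXLAN = auto()        # frame was decapsulated by the local VTEP


def may_send_to_remote_vteps(ingress, peer_vtep_up=True):
    """Decide whether a frame may be encapsulated toward remote VTEPs."""
    if ingress is Ingress.VXLAN:
        # Frames leaving VXLAN are never flooded back to remote VTEPs.
        return False
    if ingress is Ingress.PEER_LINK:
        # The MLAG peer has already sent the frame to the remote VTEPs,
        # unless its own VTEP has failed.
        return not peer_vtep_up
    # MLAG ports and orphan ports: encapsulate on this MLAG peer.
    return True
```

With peer_vtep_up set to False, frames arriving via the peer link are encapsulated locally, which corresponds to the failure exception mentioned above.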

Implicit Configuration

It would be possible to use explicit configuration to associate a local VTEP with MLAG operations. The respective MLAG protocol could carry this information and inform the MLAG peer about the state of the VTEP. However, at least some implementations do not use this method.

Routing Protocol Inference

One part of the mystery is using an anycast address for the VTEP. For anycast to work as expected, this anycast IP address should be advertised by both MLAG peer switches via a routing protocol, preferably as a host address. If an MLAG switch learns via the routing protocol that its MLAG peer advertises, as a host address, the same IP address that the switch uses for its own local VTEP, then it can infer that an Anycast VTEP with MLAG is intended. Using a routing protocol also takes care of synchronizing the state information for the respective MLAG peer's Anycast VTEP.

Now we know how this can work without explicit additional configuration beyond adding a common (i.e., anycast) IP address and a VTEP.
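The inference can be illustrated with a short sketch. It assumes the switch can list the prefixes its MLAG peer currently advertises; the function name peer_advertises_local_vtep and both parameters are hypothetical, and a real implementation would hook into the routing table rather than take a list of strings.

```python
import ipaddress


def peer_advertises_local_vtep(local_vtep_ip, peer_advertised_prefixes):
    """Return True if the MLAG peer advertises our VTEP address as a host route.

    local_vtep_ip: address of the local VTEP, e.g. "192.0.2.1"
    peer_advertised_prefixes: prefixes learned from the MLAG peer via the
        routing protocol (hypothetical input), e.g. ["192.0.2.1/32"]
    """
    # A bare address becomes a host prefix (/32 for IPv4, /128 for IPv6).
    host_route = ipaddress.ip_network(local_vtep_ip)
    return any(
        ipaddress.ip_network(prefix) == host_route
        for prefix in peer_advertised_prefixes
    )
```

As long as the function returns True, the switch may assume an Anycast VTEP with MLAG and treat the peer's VTEP as up. Once the peer withdraws the host route, the result turns False, which is how the routing protocol synchronizes the VTEP state.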

Of course, the MLAG peer switches might also be using other anycast functionality, e.g., an Anycast RP. This might affect the function of an as yet incompletely configured Anycast VTEP, if the same IP address is used for both functionalities.

Anycast VTEP as MLAG Port

Treating the Anycast VTEP as an MLAG port means that frames received via the MLAG peer link are not encapsulated and sent to remote VTEPs unless the MLAG peer's Anycast VTEP is down. With this treatment, the combination of MLAG with an Anycast VTEP can provide simple and thus robust dual-homing for VXLAN overlays. Of course, this MLAG port treatment should be complete, including MLAG port specialties such as reload delays and split-brain mitigations, among others.

Generalization to Other Layer 2 VPNs

The basic idea behind the Anycast VTEP is to have two devices that use the same address, both for receiving and sending tunneled data. Only one device sends a given tunnel packet, and only one device receives a given tunnel packet. If one device (or its tunnel endpoint) fails, the other sends and receives every tunnel packet. Some kind of multi-path mechanism can provide load sharing when both devices are active.
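The receive side of this idea can be sketched as follows. The example assumes that the transport network uses a flow hash to choose among the devices currently advertising the anycast address, which is one possible multi-path mechanism; the function name pick_anycast_device and the flow tuple layout are illustrative only.

```python
import zlib


def pick_anycast_device(flow, active_devices):
    """Pick the device that receives a tunnel packet of the given flow.

    flow: tuple identifying the flow, e.g. (source IP, destination IP,
        UDP source port)
    active_devices: names of the devices currently advertising the
        anycast address
    """
    if not active_devices:
        raise ValueError("no device advertises the anycast address")
    # Sort for a stable order, then hash the flow onto one device.
    devices = sorted(active_devices)
    index = zlib.crc32(repr(flow).encode()) % len(devices)
    return devices[index]
```

Every packet of a flow reaches exactly one device, different flows may be spread over both devices, and when only one device remains active it receives every tunnel packet.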

This general idea behind the Anycast VTEP is independent of VXLAN, and could be used for other Layer 2 VPNs as well. If the VPN is based on IP transport, this works just as with VXLAN. If the VPN is based on some other transport, e.g., MPLS, then the idea of anycast addressing might need to be extended to this transport. For MPLS this seems to just work, as described by Ivan Pepelnjak in his blog post "Anycast Works Just Fine with MPLS/LDP".

If the Layer 2 VPN needs to provide additional features, e.g., sequencing, this might require additional synchronization between the MLAG peers. (VXLAN does not provide sequencing of packets, and neither does VPLS, so this concerns only more featureful Layer 2 VPNs.)

At least one vendor uses the Anycast VTEP idea together with Shortest Path Bridging. I am not aware of any application of the Anycast VTEP idea to MPLS-based Layer 2 VPNs.
