Virtual Extensible Local Area Network (VXLAN): The Bridge to a Flexible and Robust Data Center Network

Traditional Layer 2 and Layer 3 networks, along with multicast technologies, play a crucial role in the data center. But what if we could extend those Layer 2 domains beyond the confines of a single location? In this blog, we'll explore how VXLAN (Virtual Extensible LAN) overcomes the limitations of VLANs and enables seamless Layer 2 connectivity across geographical boundaries. Discover how this technology unlocks data center cloud flexibility and revolutionizes network virtualization! Stay tuned to learn more.
History of VLANs
In the early days of Ethernet switching, all ports on a physical switch belonged to the same broadcast domain, meaning they were all part of a single Layer 2 network. If you had 100 ports on a switch, they all belonged to the same Local Area Network (LAN). There was no efficient way to break the network into segments for reasons such as security. For instance, if a company wanted to isolate its file servers from its web servers and place them in different LANs, it would need to purchase multiple switches, which was not only expensive but also difficult to manage. With the introduction of Virtual Local Area Networks (VLANs), switches gained the ability to divide a single physical LAN into multiple virtual LANs. This was actually one of the first implementations of network virtualization! VLANs improve security by isolating traffic and limiting the scope of broadcast traffic. This way, you could isolate different portions of the network, and if devices in different VLANs needed to communicate, the traffic had to be routed via the default gateway. In large networks with many switches, VLANs support trunking, enabling multiple switches to connect and share VLANs. For example, if you had 5 switches with 100 ports each, that is a total of 500 ports. You could configure 50 ports on every switch to be in one VLAN and the other 50 in another. Devices in the same VLAN but on different switches could communicate without routing, while devices in different VLANs, irrespective of the switch, needed to be routed to communicate.
Limitations of VLANs and Layer 3 Routing: Why VXLAN?
While Layer 2 switching within a VLAN is very fast, it has significant downsides. Chief among them is Spanning Tree Protocol for loop prevention, which blocks ports; although it can be optimized with various techniques, the overhead and complexity often aren't justifiable. Furthermore, as broadcast domains grow, broadcast traffic and its overhead grow with them, degrading network performance. Given these limitations, it has always been a best practice to minimize the L2 footprint and implement Layer 3 routing. Layer 3 routing is much more intelligent, and issues like loops are typically avoided through routing protocols. Additionally, Layer 3 communication is essential across geographically dispersed locations, a feat very difficult to achieve with Layer 2.
For a very long time, L2 was kept within the Local Area Network, and for communication between LANs, L3 routing was used. However, with the introduction of virtualization and cloud computing, where VMs need to be flexible across locations, managing VMs across locations became challenging because it would necessitate a change in subnet, which equates to a change in IP address information. To address this, VXLAN was invented. VXLAN is a method of extending a Local Area Network across geographical locations. Instead of having two subnets and routing between them, we could have one subnet but spread across different locations. This is accomplished by building a virtual network on top of the Layer 3 routing network. With VXLAN, we would be able to have VMs move from one location to a different location, maintaining IP address information as the subnet would remain consistent. Critically, VXLAN also allows engineers to build virtual network designs that are independent of the actual physical infrastructure. This decoupling means we can easily tear down and build new networks as the location of devices becomes inconsequential.
VXLAN as a Solution
To overcome the limitations of traditional VLANs, Cisco and other leading vendors proposed VXLAN (Virtual Extensible LAN) to the IETF, where it was eventually standardized as RFC 7348. VXLAN was designed to enhance data center flexibility and scalability, addressing the inherent constraint of the 4094-VLAN limit.
Defining VXLAN
In essence, VXLAN is a network virtualization technology that overlays Layer 2 Ethernet frames onto a Layer 3 network using UDP encapsulation. It wraps the original Layer 2 frames inside UDP packets, allowing them to traverse Layer 3 networks and extending Layer 2 domains across physical boundaries.
VXLAN leverages Layer 3 routing and ECMP for load balancing. Unlike VLANs, VXLAN utilizes a 24-bit VXLAN Network Identifier (VNI), supporting approximately 16 million logical network segments, significantly enhancing scalability. Moreover, VXLAN's ability to extend Layer 2 domains over Layer 3 allows for seamless virtual machine mobility and infrastructure independence.
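The scalability difference between the two identifier spaces is simple arithmetic, shown here as a quick Python sketch:

```python
# A VLAN ID is a 12-bit field; a VXLAN VNI is a 24-bit field.
vlan_ids = 2 ** 12 - 2   # 4094 usable VLANs (IDs 0 and 4095 are reserved)
vni_ids = 2 ** 24        # roughly 16.7 million VXLAN segments

print(vlan_ids)  # 4094
print(vni_ids)   # 16777216
```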
What is this "Virtual Network"?
As we've discussed, VXLAN enables us to construct a network on top of a traditional Layer 3 routed network. This virtual network is referred to as the Overlay Network, while the underlying network, which relies on the physical infrastructure, is called the Underlay Network. Crucially, the Overlay Network operates logically independently of the Underlay Network's specific topology, although it does rely on the Underlay's reachability. In essence, while the Overlay utilizes the Underlay for transport, it abstracts the complexities of the physical network, allowing for greater flexibility and agility in network design and deployment.
How Does VXLAN Preserve L2 Frames and Make L2 Communication Possible?
VXLAN uses encapsulation to preserve the Layer 2 data frame as it traverses regions and L3 networks. It makes use of MAC-in-UDP encapsulation, where the original L2 frame is wrapped. First, an 8-byte VXLAN header is attached, comprising a flags field, the 24-bit VNID (analogous to a VLAN ID but offering significantly greater scalability), and reserved bits. The VXLAN header and Ethernet frame together become the UDP payload; the destination UDP port for VXLAN is 4789. Finally, the UDP datagram is wrapped with an outer IP header and an outer MAC header before being shipped.
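To make the header layout concrete, here is a small Python sketch that packs the 8-byte VXLAN header as defined in RFC 7348 (8 flag bits with the "VNI valid" bit set, 24 reserved bits, the 24-bit VNI, and a final reserved byte):

```python
import struct

VXLAN_PORT = 4789  # IANA-assigned UDP destination port for VXLAN

def build_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header from RFC 7348.

    Layout: 8 flag bits (0x08 = I flag, 'VNI is valid'), 24 reserved bits,
    24-bit VNI, 8 reserved bits.
    """
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    flags_and_reserved = 0x08 << 24   # I flag set, reserved bits zero
    vni_and_reserved = vni << 8       # VNI occupies the upper 24 bits
    return struct.pack("!II", flags_and_reserved, vni_and_reserved)

header = build_vxlan_header(5000)
assert len(header) == 8
```

The full encapsulated packet is then: outer MAC header + outer IP header + UDP header (destination port 4789) + this VXLAN header + the original Ethernet frame.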
The VXLAN Tunnel Endpoint (VTEP) is responsible for the encapsulation process. The VTEP uses the underlay network's routing information to build the outer MAC and IP headers before sending the packet out. The VTEP is also responsible for learning the location of remote VTEPs and maintaining MAC-address-to-VTEP mappings. That means a device can send a frame destined to the MAC address of a device in another physical network but in the same VXLAN segment; the local VTEP recognizes that this MAC address belongs to its VXLAN but sits in a different physical region, knows which remote VTEP has access to that MAC address, and encapsulates the frame accordingly. The receiving VTEP then de-encapsulates the frame, removing the extra headers, and forwards the original Layer 2 frame to the intended recipient, which could be a switch in the VTEP's LAN. A VTEP must have at least two interfaces: one to the local LAN and one to the transport IP network. The default gateway usually plays this role.
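The VTEP's forwarding decision can be sketched as a simple table lookup. This is an illustrative Python model, not a real VTEP implementation; the addresses and function names are made up:

```python
# Hypothetical VTEP forwarding table: (VNI, destination MAC) -> remote VTEP IP.
mac_to_vtep = {
    (5000, "aa:bb:cc:00:00:01"): "10.1.1.1",
    (5000, "aa:bb:cc:00:00:02"): "10.2.2.2",
}

def lookup_remote_vtep(vni, dst_mac):
    """Return the remote VTEP IP to encapsulate toward, or None if unknown.

    An unknown destination triggers a learning mechanism instead
    (flooding or a control-plane lookup, depending on the design).
    """
    return mac_to_vtep.get((vni, dst_mac))

print(lookup_remote_vtep(5000, "aa:bb:cc:00:00:02"))  # 10.2.2.2
```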
How Do VTEPs Learn the Location of Other VTEPs?
This is actually a very interesting and detailed process. It is good to state that for VXLAN and VTEPs to be successful, the underlay network has to have a means of propagating network information, as this is the means the VTEPs would use to learn their different locations.
Firstly, during configuration, the VNID-to-multicast-group mappings are created. This is done mostly through manual configuration, but it can also be done through centralized controllers; what is important is that the mapping must be consistent across all VTEPs.
With this setup, let us keep in mind that the goal is for each VTEP to know which remote VTEP IP address to send traffic toward for a given destination MAC address, i.e., a mapping of VXLAN member MAC to VTEP IP.
So after the VNID-to-multicast-group mappings are created, there are four ways VTEPs can learn other VTEPs' destinations:
- Static Configuration
- Data Plane Learning
- Data Plane Learning with BIDIR-PIM
- Control Plane Learning with MP-BGP EVPN
The most efficient way for VTEPs to learn the reachability information of other VTEPs in modern deployments is through Control Plane Learning with MP-BGP EVPN.
Control Plane Learning with MP-BGP EVPN
In this method, the data plane, responsible for traffic forwarding, is separated from the control plane, which handles route learning. MP-BGP EVPN is used as the control plane protocol for learning routes, and L3 unicast routing is used for the actual forwarding of traffic. Therefore, it largely eliminates the need for multicast forwarding, unlike data plane learning.
EVPN stands for Ethernet Virtual Private Network. This concept involves creating a virtual Layer 2 overlay network on top of an IP underlay network. Multi-Protocol BGP is used to transmit MAC address and VTEP information between VTEPs. VTEPs use BGP to advertise MAC and IP address reachability information to all other VTEPs in the network. This reduces the need for flooding and multicast, as the devices already know their destinations and directly use the underlay IP unicast routing to send packets to the required VTEP.
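The control-plane idea can be modeled in a few lines of Python. This is a deliberately simplified sketch of the behavior, not a BGP implementation: each VTEP advertises its locally learned MACs (EVPN carries these as MAC/IP advertisement routes), and every peer installs them, so no flooding is needed:

```python
# Illustrative model: local MACs are pushed to every peer VTEP's table,
# so peers learn (VNI, MAC) -> VTEP-IP without any data-plane flooding.

def advertise_local_macs(local_vtep_ip, local_macs, peer_tables):
    """Install (vni, mac) -> vtep_ip routes into every peer's table."""
    for vni, mac in local_macs:
        for table in peer_tables:
            table[(vni, mac)] = local_vtep_ip

vtep_a_table, vtep_b_table = {}, {}
advertise_local_macs("10.1.1.1", [(5000, "aa:bb:cc:00:00:01")],
                     [vtep_b_table])
print(vtep_b_table)  # {(5000, 'aa:bb:cc:00:00:01'): '10.1.1.1'}
```

Once the route is installed, VTEP B can immediately unicast traffic for that MAC to 10.1.1.1 over the underlay.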
Older Methods with Limitations
- Static Configuration: This is when we manually configure VTEPs with the VTEP IP to MAC address mappings.
- Data Plane Learning: With this process, the VTEP relies on traffic flow to build its VTEP-IP-to-MAC-address mappings. When the VTEP receives a frame with an unknown destination MAC address, it floods the packet to find the owner. In this case, it makes use of the underlay network's multicast trees to perform the flooding, consulting the VNID-to-multicast-group mapping configured on the VTEP to determine which multicast group to flood the traffic to. This is a form of controlled flooding: instead of sending to unrelated VTEPs with different VNIDs, it sends only to VTEPs that have subscribed to the multicast group by virtue of serving the same VNID. When each VTEP receives the packet, it checks whether the destination MAC address belongs to a device in its LAN; if it doesn't, it drops the packet.
- Data Plane Learning with BIDIR-PIM: This method of learning is an optimization of data plane learning. For this, BIDIR-PIM is used to create shared bidirectional multicast trees for effective and more optimal flooding when learning the VTEP to MAC address mappings. This prevents the overlay network from relying on the existing multicast setup in the underlay network; instead, it builds its own optimal multicast trees. Except for this change, every other process with data plane learning remains the same.
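The flood-and-learn behavior described above can be sketched in Python. This is an illustrative model with made-up addresses: a VTEP floods an unknown-unicast frame to the VNI's multicast group, and every receiving VTEP learns the source-MAC-to-source-VTEP mapping from the flooded copy, so return traffic can be sent unicast:

```python
# VNID -> multicast group, configured consistently on all VTEPs.
vni_to_group = {5000: "239.1.1.1"}

# VTEPs subscribed to each group (i.e., serving that VNI).
subscriptions = {"239.1.1.1": ["10.2.2.2", "10.3.3.3"]}

# Each VTEP's learned (VNI, MAC) -> remote VTEP IP table.
tables = {"10.2.2.2": {}, "10.3.3.3": {}}

def flood_and_learn(src_vtep, vni, src_mac):
    """Flood an unknown-unicast frame; receivers learn the reverse mapping.

    Each receiving VTEP records src_mac -> src_vtep (taken from the outer
    source IP of the flooded packet) so its replies need no flooding.
    """
    group = vni_to_group[vni]
    for vtep in subscriptions[group]:
        tables[vtep][(vni, src_mac)] = src_vtep

flood_and_learn("10.1.1.1", 5000, "aa:bb:cc:00:00:01")
print(tables["10.2.2.2"])  # {(5000, 'aa:bb:cc:00:00:01'): '10.1.1.1'}
```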
Are VXLANs Backwards Compatible with Traditional VLANs?
Yes, VXLANs are designed to interoperate with traditional VLANs. In a pure VXLAN environment where Layer 2 networks are extended across physical locations, we use the Virtual Network Identifier (VNI) as the identifier for Layer 2 networks. This identifier is used on the overlay network to extend Layer 2 capability. To extend a traditional LAN using VLAN IDs to a VXLAN environment, VXLAN uses a Layer 2 Gateway. This function is carried out by the VTEP, and it maps the traditional VLAN ID to a VNI. Thus, the VTEP swaps the VLAN ID for the VNI when forwarding traffic through the overlay. In a pure VXLAN environment, end devices like servers only need the VNI configured. However, in an environment that has VLANs extending to the VXLAN, either a VLAN ID or a VNI can be configured, as the VTEP will have the mapping. Most importantly, consistency in configuration is necessary. It is good to note that every L2 Gateway is a VTEP, but not every VTEP is a Layer 2 Gateway.
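The Layer 2 Gateway's VLAN-to-VNI translation amounts to a bidirectional mapping, which a short Python sketch makes concrete (the ID values here are arbitrary examples):

```python
# Hypothetical L2 gateway mapping: the VTEP swaps a local VLAN ID for the
# overlay VNI on ingress, and restores the VLAN ID on egress.
vlan_to_vni = {10: 5000, 20: 6000}
vni_to_vlan = {vni: vlan for vlan, vni in vlan_to_vni.items()}

def to_overlay(vlan_id):
    """Ingress: tag traffic with the VNI carried in the VXLAN header."""
    return vlan_to_vni[vlan_id]

def to_local_lan(vni):
    """Egress: restore the traditional 802.1Q VLAN tag."""
    return vni_to_vlan[vni]

assert to_overlay(10) == 5000
assert to_local_lan(5000) == 10
```

Consistency matters here just as the text notes: every L2 Gateway carrying the same segment must agree on the VLAN-to-VNI mapping.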
How do VXLANs with Different VNIs Communicate?
Similar to how traditional gateways facilitate communication between different Layer 2 networks via Layer 3 routing, the same principle holds within the VXLAN environment. For different VNIs to communicate, traffic must pass through a VXLAN-capable Layer 3 gateway that routes it. In extended VXLAN deployments, an anycast gateway address can be configured on the routed interfaces of all VTEPs so that each acts as the default gateway. This provides an exit route out of the Layer 2 VNI at each physical location, preventing traffic from having to hairpin across the overlay to a remote gateway before being routed.
How Does Tenant Routed Multicast (TRM) Make VXLAN Environments Better?
TRM enhances a standard MP-BGP EVPN VXLAN network by incorporating multicast into both the control plane and the data plane. It adds multicast information, such as source and receiver details, to the MP-BGP control plane so this information can be distributed efficiently. The key benefit of multicast is its efficiency in replicating traffic to multiple receivers, especially across different VTEPs, and this matters for both Layer 2 and Layer 3 multicast traffic. In a traditional VXLAN network, sending a multicast message across VTEPs would require building a multicast table on the overlay, with its own distribution trees and RPs; with TRM, the existing multicast infrastructure in the underlay is used instead. TRM includes each multicast group's sender and receiver information from the underlay in the MP-BGP EVPN route advertisements, meaning the control plane shares not only MAC-to-VTEP-IP information but multicast information as well. With this information known across all VTEPs, TRM makes every VTEP a Rendezvous Point, implementing a distributed RP architecture that allows for efficient multicast routing. With all this in place, when an end device forwards multicast traffic to its VTEP, the VTEP consults its database of learned multicast information and uses the underlay multicast trees to forward the traffic to the intended receivers. Importantly, TRM is tenant-aware, operating within a VRF to provide isolation and control for multicast traffic.
Conclusion
In conclusion, we've explored the evolution of network virtualization from VLANs to VXLAN, highlighting the limitations of traditional approaches and the advantages of VXLAN in overcoming them. We've delved into the mechanisms of VXLAN, including encapsulation, VTEP communication, and the role of the underlay network. Furthermore, we've examined how TRM optimizes multicast traffic within VXLAN environments, enhancing efficiency and scalability. By understanding these technologies, network engineers can build more agile and robust data center and cloud networks that meet the demands of today's dynamic workloads.