My first experience with an overlay tunnel happened back in 2003 when I was working on a project to create a transparent proxy based on Squid and Cisco WCCP (Web Cache Communication Protocol). Part of the configuration between the Cisco router and the Squid proxy needed to use GRE (Generic Routing Encapsulation) tunnels for communication. Back then, I did not fully understand the need for tunnel protocols, but fast forward to the present and I know their importance: VLANs on their own do not scale to large multi-tenant clouds.
VLANs were developed to segment network traffic into smaller, less complex networks. The shortcoming is that the VLAN identifier is a fixed 12-bit field, which means you can have at most 4,096 VLANs (4,094 usable, since two values are reserved) in a network topology. Back in the late 1990s and early 2000s, this was more than enough segments to accommodate networking. However, with the dawning of the cloud age and multi-tenant environments, the need for many more individual network segments arose.
To address the issue, vendors have proposed several solutions: VXLAN (Virtual Extensible LAN), NVGRE (Network Virtualization using Generic Routing Encapsulation) and STT (Stateless Transport Tunneling). All three encapsulate tenant traffic behind a new header that carries a larger, fixed-size virtual network identifier. That identifier is 24 bits wide for VXLAN and NVGRE, the latter being used mostly by Microsoft, while STT uses a 64-bit identifier. None of these encapsulation methods requires any change to the hardware networking infrastructure, though some vendors offer hardware that can accelerate them. However, none of the solutions are compatible with each other.
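To put those field widths in perspective, here is a small Python sketch that computes how many distinct identifiers each fixed-width field can express. The dictionary and the function name `id_space` are illustrative only; the bit widths are the ones quoted above.

```python
# Width, in bits, of the segment identifier used by each technology.
ID_BITS = {"VLAN": 12, "VXLAN": 24, "NVGRE": 24, "STT": 64}


def id_space(bits: int) -> int:
    """Return the number of distinct values a field of `bits` bits can hold."""
    return 2 ** bits


if __name__ == "__main__":
    for name, bits in ID_BITS.items():
        print(f"{name:5} {bits:2}-bit field: {id_space(bits):,} identifiers")
```

A 24-bit identifier gives roughly 16.7 million segments, about 4,096 times the VLAN limit.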
All is not lost, however, in the current age of large multi-tenant clouds. A new network virtualization standard has emerged: GENEVE (Generic Network Virtualization Encapsulation), which promises to address the perceived limitations of the earlier specifications and support all of the capabilities of VXLAN, NVGRE and STT. Many believe GENEVE could eventually replace these earlier formats entirely.
The stated goal of GENEVE is to define an encapsulation data format only. Unlike the earlier formats, it does not include any information or specification for the control plane. The authors state:
"There is a clear advantage in settling on a data format: most of the protocols are only superficially different and there is little advantage in duplicating effort. However, the same cannot be said of control planes, which are diverse in very fundamental ways. The case for standardization is also less clear given the wide variety in requirements, goals and deployment scenarios."
To achieve these goals, the GENEVE authors agreed that the data format should be as flexible and extensible as possible. While the current tunnel identifier fields (24 bits in VXLAN and NVGRE, 64 bits in STT) are more than sufficient to specify all of the virtual networks that will be required, the authors expect that future developers will want to subdivide this field to carry information other than the virtual network identifier. They compare potential uses of this field to the system state information currently exchanged within virtualized servers or between line cards in a chassis switch. They observe that no fixed field size can be specified that will be sufficient for all possible future uses. Further, the authors of GENEVE take their cue from other protocols that have proven to be long-lived. Protocols like BGP (Border Gateway Protocol), LLDP (Link Layer Discovery Protocol), IS-IS (Intermediate System to Intermediate System) and many others have been around for decades and are still as popular as they have ever been.
And the reason is simple: they are extensible. They evolve over time with new capabilities, not by revising the base protocols, but by adding new optional capabilities.
GENEVE-encapsulated packets are designed to be transmitted via standard networking equipment. Packets are sent from one tunnel endpoint to one or more tunnel endpoints using either unicast or multicast addressing. The client application and the host on which it is executing are not modified in any way. Applications generate the same IP packets they would if they were communicating via hardware switches and routers. The destination IP address included in the packet is significant only within the cloud tenant's virtual network. The tunnel endpoint then encapsulates the end-user IP packet in the GENEVE header, adding the tunnel identifier specifying the tenant's virtual network followed by any options. The header consists of fields specifying that it is a GENEVE packet, the overall length of the options if any, the tunnel identifier and the series of options. The completed packet is then transmitted to the destination endpoint in a standard UDP packet, which can be carried over either IPv4 or IPv6. The receiving tunnel endpoint strips off the header, interprets any included options and directs the end-user packet to its destination within the virtual network indicated by the tunnel identifier.
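The encapsulate and strip steps described above can be sketched in a few lines of Python. This is a minimal illustration, not a working tunnel endpoint. The bit layout is not given in this article; the sketch assumes the 8-byte base header from the published GENEVE specification (RFC 8926): a 2-bit version, a 6-bit option length counted in 4-byte words, two flag bits, a 16-bit protocol type, a 24-bit virtual network identifier and reserved bits. The function names `encapsulate` and `decapsulate` are hypothetical.

```python
import struct

# Assumed from RFC 8926: EtherType-style code for an inner Ethernet frame.
PROTO_ETHERNET = 0x6558


def encapsulate(vni: int, inner: bytes, options: bytes = b"") -> bytes:
    """Prefix `inner` with a GENEVE base header and pre-encoded options."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("tunnel identifier must fit in 24 bits")
    if len(options) % 4 or len(options) > 63 * 4:
        raise ValueError("options must be a multiple of 4 bytes, 252 at most")
    ver_optlen = (0 << 6) | (len(options) // 4)  # version 0, length in words
    flags = 0                                    # control and critical bits clear
    # The 24-bit identifier sits in the top of the last 32-bit word;
    # the low 8 bits are reserved and left as zero.
    header = struct.pack("!BBHI", ver_optlen, flags, PROTO_ETHERNET, vni << 8)
    return header + options + inner


def decapsulate(packet: bytes):
    """Split a GENEVE payload into (vni, protocol, options, inner packet)."""
    ver_optlen, _flags, proto, vni_word = struct.unpack("!BBHI", packet[:8])
    opt_len = (ver_optlen & 0x3F) * 4            # words back to bytes
    return vni_word >> 8, proto, packet[8:8 + opt_len], packet[8 + opt_len:]
```

The bytes returned by `encapsulate` would form the payload of a UDP datagram sent to the remote tunnel endpoint; IANA has assigned UDP port 6081 to GENEVE. For example, `decapsulate(encapsulate(5001, frame))` returns the identifier 5001 together with the original `frame`.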
The GENEVE specification offers recommendations on ways to achieve efficient operation by avoiding fragmentation and taking advantage of ECMP (equal-cost multi-path) routing and NIC hardware offload facilities. The specification also offers options on how to support differentiated services and explicit congestion notification. The authors do not expect problems when GENEVE and one or more of the other encapsulation methods are in use on the same system. GENEVE tunnel endpoints will communicate only with each other, and packets are handled by the network infrastructure identically to any other UDP packet. The data format supports all of the capabilities of VXLAN, NVGRE and STT, so use of the three earlier formats may eventually decline. Since a control plane protocol isn’t specified, the authors expect GENEVE to work with any control plane protocol in use with the other encapsulation methods. One key benefit over other encapsulation methods is GENEVE's flexible option format and its use of IANA (Internet Assigned Numbers Authority) to designate Option Classes. Developers can include as many or as few options as they need without being limited to 24 bits or incurring the overhead of a 64-bit field when only a fraction of that size is needed.
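As a rough illustration of the flexible option format, the sketch below encodes and parses options as type-length-value records. The layout is an assumption taken from the published specification (RFC 8926), not from this article: a 16-bit Option Class, an 8-bit type whose top bit marks the option as critical, three reserved bits and a 5-bit length counted in 4-byte words. The helper names and the class value used in the usage note are hypothetical and do not correspond to any real IANA assignment.

```python
import struct


def build_option(opt_class: int, opt_type: int, data: bytes,
                 critical: bool = False) -> bytes:
    """Encode one option as a 4-byte option header followed by `data`."""
    if not 0 <= opt_class <= 0xFFFF:
        raise ValueError("option class must fit in 16 bits")
    if not 0 <= opt_type <= 0x7F:
        raise ValueError("option type must fit in 7 bits")
    if len(data) % 4 or len(data) > 31 * 4:
        raise ValueError("data must be a multiple of 4 bytes, 124 at most")
    type_byte = (0x80 if critical else 0x00) | opt_type
    length_words = len(data) // 4  # reserved bits stay zero
    return struct.pack("!HBB", opt_class, type_byte, length_words) + data


def parse_options(blob: bytes):
    """Return a list of (class, type, critical, data) tuples from `blob`."""
    parsed, offset = [], 0
    while offset < len(blob):
        opt_class, type_byte, len_byte = struct.unpack_from("!HBB", blob, offset)
        size = (len_byte & 0x1F) * 4
        data = blob[offset + 4:offset + 4 + size]
        parsed.append((opt_class, type_byte & 0x7F, bool(type_byte & 0x80), data))
        offset += 4 + size
    return parsed
```

For instance, `build_option(0xFFF0, 1, b"\x00\x00\x00\x2a")` yields an 8-byte option, and several such options can simply be concatenated. A packet that carries no options pays no overhead beyond the base header, which is the flexibility the paragraph above describes.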
The transition to GENEVE will not be immediate. The other encapsulation methods have been in use for some time, and multiple methods can operate within the same system. However, GENEVE is being adopted as the default tunnelling protocol for OVN (Open Virtual Network), which is built on OVS (Open vSwitch) and is being promoted as a networking option for future OpenStack releases.
Experience with large multi-tenant clouds continues to grow, and it is possible that no single encapsulation method will become the accepted standard. However, GENEVE, with its flexible option format and support for all of the capabilities of the other methods, will be a strong candidate for wide adoption.
Benjamin Schmaus is a Red Hat Cloud TAM in the NA Central region. He has been involved with Linux since 1998 and has supported business environments in a variety of industries: retail, defense, software, financial, higher education and lower education. Most recently, he has been focused on enabling our customers in deploying, operating and supporting Red Hat OpenStack Platform and Red Hat Ceph Storage.