With the release of Red Hat Enterprise Linux 8, the eXpress Data Path (XDP) functionality in the kernel is enabled as a tech preview, so you can experiment with it on a RHEL system. XDP integrates flexible, programmable packet processing directly into the Linux kernel data path, with very high performance. In this blog post, we will take a quick look at what this technology offers and how you can get started with it.

First a note of caution, though: The tech preview status reflects the fact that XDP is still fairly new technology, and still has a way to go before it becomes a turnkey solution for networking. As such, we are mainly targeting developers with this tech preview, so expect code examples ahead.

The anatomy of an XDP program

XDP works by executing a user-supplied program when a packet is received on a network interface. This happens directly in the device driver, before the kernel touches the packet data, which results in both high performance (because the program can make a decision about a given packet after only the minimum necessary processing), and lots of flexibility (because the program is able to make arbitrary modifications to the packet).

XDP programs are run using the eBPF virtual machine, which is also used for several other purposes in the kernel. While one could conceivably write a packet processing program directly in the eBPF byte code format, programs are more commonly written in C and compiled (using the LLVM compiler) to byte code before being loaded into the kernel. When the program is loaded, the kernel verifier checks that it is safe to run, and rejects it if it could harm the system by, e.g., performing out-of-bounds memory accesses.

An XDP program starts execution with a pointer to an execution context object, which in turn contains pointers to the packet data, and a bit of metadata. The program ends by setting a return code which is used to decide the final packet verdict. A minimal XDP program that simply drops all packets would look like this:

#include <linux/bpf.h>
#define SEC(NAME) __attribute__((section(NAME), used))

SEC("prog")
int xdp_prog_simple(struct xdp_md *ctx)
{
        return XDP_DROP;
}
}

char _license[] SEC("license") = "GPL";

The SEC() macro adds annotations that instruct the linker to place pieces of code or data in particular named sections of the ELF file produced by compiling and linking the source code. These section names can then be recognised by the loader that subsequently loads the program into the kernel. The program itself simply ignores its context and signals the kernel to always drop the packet, by returning the XDP_DROP return code.

But we are getting ahead of ourselves; before the program sample we've seen can actually be compiled and loaded, we will need to install a few dependencies and get set up for testing.

Setting up an environment for XDP

To be able to compile and run XDP programs we'll need a few programs and libraries. These are the clang compiler and the linker from the LLVM compiler suite, as well as the kernel header files. The bpftool utility can also be useful to list installed XDP programs, so we'll install that as well.

Install these as follows:

$ sudo yum install clang llvm kernel-headers bpftool

We will only cover a few examples here, but for those wanting to continue, we have an online tutorial that covers some additional basics of eBPF usage, as well as more advanced examples. I'd highly recommend going through the tutorial after reading this post, so let's clone the repository for that as well:

$ git clone --recurse-submodules https://github.com/xdp-project/xdp-tutorial.git

Finally, we will need a network interface with XDP support to actually load the XDP program onto. XDP support is driver-dependent. As of this writing, it is supported in the Broadcom bnxt driver, the Cavium thunderx driver, the Intel ixgbe, ixgbevf and i40e drivers, the Mellanox mlx4 and mlx5 drivers, the Netronome nfp driver, and the QLogic qede driver.

If you don't have any hardware using any of those drivers, you can also use a virtual Ethernet device (veth) to test on. The latter is the approach taken in the XDP tutorial; have a look at the test environment script from the tutorial if you want to use this. In the following examples, I'll just use $IFACE as a placeholder for whichever interface you want to install the XDP program on.

Compiling and installing the example XDP program

Once we have installed the necessary tools, we can compile the example XDP program shown previously; save it to the file xdp-example.c and run this command:

$ clang -O2 -Wall -target bpf -g -c xdp-example.c -o xdp-example.o

This will produce an ELF file in xdp-example.o, which we can verify with the file utility:

$ file xdp-example.o
xdp-example.o: ELF 64-bit LSB relocatable, *unknown arch 0xf7* version 1 (SYSV), with debug_info, not stripped

The unknown arch 0xf7 message is not an actual error; it appears because this version of the file utility doesn't know about the BPF format (0xf7 is the ELF machine number assigned to BPF). However, we can still use the llvm-objdump utility to inspect the contents:

$ llvm-objdump -S -no-show-raw-insn xdp-example.o
xdp-example.o: file format ELF64-BPF
Disassembly of section prog:
xdp_prog_simple:
; {
     0: r0 = 1
; return XDP_DROP;
     1: exit
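
The r0 = 1 line in the disassembly is the program's return value: BPF programs return their result in register r0, and XDP_DROP has the value 1 in the xdp_action enum from <linux/bpf.h>. As a small userspace sketch of how the verdict codes map to numbers (the helper name xdp_action_name is made up for this illustration and is not part of any kernel or library API):

```c
#include <linux/bpf.h> /* enum xdp_action: XDP_ABORTED, XDP_DROP, ... */

/* Hypothetical helper, for illustration only: returns the name of an
 * XDP verdict code as defined in enum xdp_action. */
static const char *xdp_action_name(unsigned int action)
{
        switch (action) {
        case XDP_ABORTED:  return "XDP_ABORTED";  /* 0 */
        case XDP_DROP:     return "XDP_DROP";     /* 1, the "r0 = 1" above */
        case XDP_PASS:     return "XDP_PASS";     /* 2 */
        case XDP_TX:       return "XDP_TX";       /* 3 */
        case XDP_REDIRECT: return "XDP_REDIRECT"; /* 4 */
        default:           return "unknown";
        }
}
```

This is ordinary C that compiles with any userspace compiler, since it only relies on the enum from the kernel header; it is not meant to be loaded as an XDP program.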

Finally, we can load the program onto an interface using the ip tool. When you run this command you will notice that nothing much happens, except that suddenly all packets received on the interface silently disappear. Warning: This includes your SSH connection if you are connected remotely to the machine, so be careful which interface you load the program on if you don't have another way to access the machine.

To load the program, run:

$ sudo ip link set dev $IFACE xdp obj xdp-example.o

Because the XDP program runs before the network stack even sees the packet, tools like tcpdump will not see the dropped packets either; they just disappear silently. The tutorial contains some pointers to what you can do instead if you want to keep track of what an XDP program is doing, but we won't cover that here.

To verify that the program is loaded, we can also use the ip tool:

$ ip link show dev $IFACE
17: $IFACE: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
  link/ether e2:80:b8:84:10:de brd ff:ff:ff:ff:ff:ff
  prog/xdp id 23 tag 57cd311f2e27366b jited

The interface number and XDP program ID will probably be different on your machine, but you should be able to see the xdp marker with the interface flags, and the XDP program ID. You can use the bpftool utility to look for the program ID:

$ sudo bpftool prog list
23: xdp  tag 57cd311f2e27366b  gpl
loaded_at 2019-05-03T21:22:48+0200  uid 0
xlated 16B  jited 64B memlock 4096B

Finally, you can also unload the program using the ip utility:

$ sudo ip link set dev $IFACE xdp off

Diving deeper: An example of a packet parsing program

If you want to dive deeper into the workings of XDP, the online tutorial mentioned above is a great way to do this. However, to give a taste of what a more complete program (one that actually parses the packet data) might look like, have a look at the example below. It won't compile as it is here because it is missing some helper functions, but you can find the full program in the packet02-rewriting lesson of the tutorial, although I'd recommend that you go through the previous lessons first.

This program will parse the Ethernet header, then look for either IPv6 or IPv4 headers, followed by their respective ICMP headers. If it finds an ICMP echo request packet (as sent by the regular ping utility), it will inspect the sequence number, and drop even-numbered pings. This results in ping output like this:

PING fc00:dead:cafe:1::1(fc00:dead:cafe:1::1) 56 data bytes
64 bytes from fc00:dead:cafe:1::1: icmp_seq=1 ttl=64 time=0.046 ms
64 bytes from fc00:dead:cafe:1::1: icmp_seq=3 ttl=64 time=0.086 ms
64 bytes from fc00:dead:cafe:1::1: icmp_seq=5 ttl=64 time=0.109 ms
^C
--- fc00:dead:cafe:1::1 ping statistics ---
5 packets transmitted, 3 received, 40% packet loss, time 67ms
rtt min/avg/max/mdev = 0.046/0.080/0.109/0.027 ms
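
The even/odd decision itself is simple, but the sequence number is stored in the packet in network byte order, so it has to be converted before doing arithmetic on it. Here is a minimal userspace sketch of that check, using ntohs() as a stand-in for the bpf_ntohs() helper used in the XDP program (the function name drop_even_sequence is made up for this example):

```c
#include <arpa/inet.h> /* htons()/ntohs(), userspace stand-ins for the bpf_ helpers */
#include <stdint.h>

/* Hypothetical helper: decide whether an echo request should be dropped,
 * given its sequence number exactly as it appears in the packet
 * (network byte order). Returns 1 for even sequence numbers, 0 otherwise. */
static int drop_even_sequence(uint16_t seq_net)
{
        /* Convert to host byte order first. On a little-endian machine the
         * raw value has its bytes swapped, so testing it directly would
         * look at a bit from the high-order byte instead. */
        return ntohs(seq_net) % 2 == 0;
}
```

For example, drop_even_sequence(htons(2)) evaluates to 1 and drop_even_sequence(htons(1)) evaluates to 0, matching the ping output above where only the odd-numbered replies come back.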

The XDP program that drops the even-numbered pings looks like this:

SEC("xdp_packet_parser")
int xdp_parser_func(struct xdp_md *ctx)
{
        void *data_end = (void *)(long)ctx->data_end;
        void *data = (void *)(long)ctx->data;

        /* Default action is XDP_PASS, meaning that everything we couldn't
         * parse, or that we don't want to deal with, we just pass up the
         * stack and let the kernel deal with it.
         */
        __u32 action = XDP_PASS; /* Default action */

        /* These keep track of the next header type and iterator pointer */
        struct hdr_cursor nh;
        int nh_type;
        nh.pos = data;

        struct ethhdr *eth;

        /* Packet parsing in steps: Get each header one at a time, aborting if
         * parsing fails. Each helper function does sanity checking (is the
         * header type in the packet correct?), and bounds checking.
         */
        nh_type = parse_ethhdr(&nh, data_end, &eth);

        if (nh_type == ETH_P_IPV6) {
                struct ipv6hdr *ip6h;
                struct icmp6hdr *icmp6h;

                nh_type = parse_ip6hdr(&nh, data_end, &ip6h);
                if (nh_type != IPPROTO_ICMPV6)
                        goto out;

                nh_type = parse_icmp6hdr(&nh, data_end, &icmp6h);
                if (nh_type != ICMPV6_ECHO_REQUEST)
                        goto out;

                /* Drop echo requests with an even sequence number */
                if (bpf_ntohs(icmp6h->icmp6_sequence) % 2 == 0)
                        action = XDP_DROP;

        } else if (nh_type == ETH_P_IP) {
                struct iphdr *iph;
                struct icmphdr *icmph;

                nh_type = parse_iphdr(&nh, data_end, &iph);
                if (nh_type != IPPROTO_ICMP)
                        goto out;

                nh_type = parse_icmphdr(&nh, data_end, &icmph);
                if (nh_type != ICMP_ECHO)
                        goto out;

                /* Drop echo requests with an even sequence number */
                if (bpf_ntohs(icmph->un.echo.sequence) % 2 == 0)
                        action = XDP_DROP;
        }
out:
        return xdp_stats_record_action(ctx, action);
}
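
To give an idea of what the missing helper functions do, here is a rough userspace sketch of an Ethernet parsing helper. This is a simplified approximation and not the tutorial's actual code: the layout of struct hdr_cursor and the choice to return the EtherType in host byte order are assumptions made to match how the helper is called above, and ntohs() stands in for bpf_ntohs().

```c
#include <arpa/inet.h>      /* ntohs(), userspace stand-in for bpf_ntohs() */
#include <linux/if_ether.h> /* struct ethhdr, ETH_P_IP, ETH_P_IPV6 */

/* Assumed layout: a cursor tracking how far into the packet we have parsed. */
struct hdr_cursor {
        void *pos;
};

/* Sketch of an Ethernet header parser. Returns the EtherType in host byte
 * order (an assumption), or -1 if the header does not fit in the packet. */
static int parse_ethhdr(struct hdr_cursor *nh, void *data_end,
                        struct ethhdr **ethhdr)
{
        struct ethhdr *eth = nh->pos;

        /* Bounds check: the whole header must lie before data_end
         * before any of its fields are read. */
        if ((unsigned char *)(eth + 1) > (unsigned char *)data_end)
                return -1;

        nh->pos = eth + 1; /* advance the cursor past the Ethernet header */
        *ethhdr = eth;
        return ntohs(eth->h_proto);
}
```

The explicit comparison against data_end is the important part: in a real XDP program, this is the kind of check the kernel verifier needs to see before it will accept a read from the packet data.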

Summary: Go try out XDP!

Hopefully the above primer has given you some idea of how XDP works, and what is possible with it. Since XDP is a relatively low-level piece of infrastructure, it affords many possibilities for new applications to be built on top of it, or for existing applications to be improved with XDP support. We have already seen some examples of this, mainly for load balancing and DDoS protection, but that is surely just the beginning. It will be exciting to see what others come up with!

So go forth, experiment with XDP for your use case, or invent entirely new ones for it! And provide feedback to the community so we can discover what is good, what we can still improve, and what is missing entirely. We are committed to continuing to evolve the technology in the future.

As you continue your exploration of XDP, you may find the following resources useful: