In this post we're going to take a look at how to capture and examine network traffic using xdpdump and Wireshark to troubleshoot eXpress Data Path (XDP) issues on Red Hat Enterprise Linux (RHEL).
What is the eXpress Data Path?
In short, this is a Linux kernel networking feature that allows an extended BPF (eBPF) program to run right before the network device hands it off to the kernel networking stack. This eBPF program can also make modifications to the packet and decide what happens next (for example, continue kernel processing, drop the packet, or redirect it to another interface). For more details on XDP and eBPF, take a look at the following posts:
Why do we need xdpdump?
What is wrong with using tcpdump for capturing network traffic in this environment? If you are interested in what the kernel sees, using tcpdump makes perfect sense. However, as explained above, an XDP program can modify the packet before passing it on to the kernel. In this case, the packet's content seen by the Linux kernel is not the same as what’s received on the wire. An XDP program can also decide to drop or redirect the packet, in which case the kernel—hence tcpdump—does not see the packet at all.
Here is where xdpdump fits in: it can be used for debugging XDP programs that are already loaded on an interface. Packets can be dumped and inspected before, on entry to XDP program, and after, at exit from an XDP program. Furthermore, at exit, the XDP action is also captured. This means that even packets that are dropped or redirected at the XDP layer can be captured via xdpdump.
How does xdpdump work?
The Linux kernel has a general framework that allows an eBPF program to be attached to any kernel function entry or exit point. Recently, this has been extended to allow attachment to another eBPF program. xdpdump uses this to gain access to the packet resources right before and after the XDP program is executed, which allows xdpdump to capture the original packet at ingress and the modifications the XDP program has made to the packet after execution. xdpdump will capture and store the following information by default:
- The receiving interface.
- The receiving interface queue.
- The original packet.
- The packet after the XDP program was executed.
- The XDP program's return code, i.e. XDP_DROP, XDP_PASS, etc.
- A unique packet identifier to match the original packet to the packet after execution.
How to use xdpdump
First of all, you need to get a copy of xdpdump, which is part of the xdp-tools project on GitHub. Some distributions like Fedora and Red Hat Enterprise Linux 8.3 release support xdp-tools as a package. Once installed, xdpdump has the following options:
$ ./xdpdump -h Usage: xdpdump [options] XDPDump tool to dump network traffic Options: --rx-capture <mode> Capture point for the rx direction (valid values: entry,exit) -D, --list-interfaces Print the list of available interfaces -i, --interface <ifname> Name of interface to capture on --perf-wakeup <events> Wake up xdpdump every <events> packets -p, --program-names <prog> Specific program to attach to -P, --promiscuous-mode Open interface in promiscuous mode -s, --snapshot-length <snaplen> Minimum bytes of packet to capture --use-pcap Use legacy pcap format for XDP traces -w, --write <file> Write raw packets to pcap file -x, --hex Print the full packet in hex -v, --verbose Enable verbose logging (-vv: more verbose) --version Display version information -h, --help Show this help
Here are some of the options that are not self-explanatory. For the full list, check out the man page:
Specify where the ingress packet gets captured. Either at the entry of the XDP program and/or exit of the XDP program. Valid options are entry, exit, or both entry,exit. The packet at exit can be modified by the XDP program. If you are interested in seeing both the original and modified packet, use the entry,exit option. With this, each packet is captured twice. The default value for this is entry.
Let the kernel wake up xdpdump once for every event posted in the perf ring buffer. The higher the number, the lesser the impact on the actual XDP program. The default value is 0, which automatically calculates the value based on the available CPUs/buffers. Use -v to see the actual used value.
The Linux API does not provide the full name of the attached eBPF entry function if it's longer than 15 characters. xdpdump will try to guess the correct function name from the available BTF debug information. However, if multiple functions exist with the same leading name, it cannot pick the correct one. It will dump the available functions, and you can choose the correct one and supply it with this option.
Use legacy pcap format for XDP traces. By default, it will use the PcapNG format so that it can store various metadata.
The below will load the xdp-filter program, which is also part of xdp-tools, on eno1, but it does not do any actual filtering:
# xdp-filter load --mode skb eth0 # xdpdump -D if_index if_name XDP program entry function -------- ---------------- -------------------------------------------------- 1 lo <No XDP program loaded!> 2 eno1 xdp_dispatcher() xdpfilt_alw_all()
Now we can try xdpdump:
# ./xdpdump -i eno1 -x --rx-capture entry,exit listening on eno1, ingress XDP program ID 132 func xdp_dispatcher, capture mode entry/exit, capture size 262144 bytes 1601027763.302161077: @entry: packet size 60 bytes, captured 60 bytes on if_index 2, rx queue 0, id 1 0x0000: ff ff ff ff ff ff 00 02 d1 19 47 bd 08 06 00 01 ..........G..... 0x0010: 08 00 06 04 00 01 00 02 d1 19 47 bd ac 10 01 cb ..........G..... 0x0020: ff ff ff ff ff ff ac 10 01 cb 00 00 00 00 00 00 ................ 0x0030: 00 00 e0 00 00 00 05 00 00 00 04 00 ............ 1601027763.302167163: @exit[PASS]: packet size 60 bytes, captured 60 bytes on if_index 2, rx queue 0, id 1 0x0000: ff ff ff ff ff ff 00 02 d1 19 47 bd 08 06 00 01 ..........G..... 0x0010: 08 00 06 04 00 01 00 02 d1 19 47 bd ac 10 01 cb ..........G..... 0x0020: ff ff ff ff ff ff ac 10 01 cb 00 00 00 00 00 00 ................ 0x0030: 00 00 e0 00 00 00 05 00 00 00 04 00 ............ ^C 2 packets captured 0 packets dropped by perf ring
Above, you see the same packet captured twice, at ingress and after the XDP program has been executed. You also see the return code after execution, @exit[PASS]. The id value, id 1, can be used to correlate the packets if they are not processed in order, for example, when multiple CPUs process packets from the same NIC but from different queues.
Below are two more examples redirecting the capture file to tcpdump or tshark, which allows real-time packet decoding:
# xdpdump -i eno1 -w - | tcpdump -r - -n listening on eno1, ingress XDP program xdpfilt_dny_all, capture mode entry, capture size 262144 bytes reading from file -, link-type EN10MB (Ethernet) 15:55:09.075887 IP 192.168.122.1.40928 > 192.168.122.100.ssh: Flags [P.], seq 3857553815:3857553851, ack 3306438882, win 501, options [nop,nop,TS val 1997449167 ecr 1075234328], length 36 15:55:09.077756 IP 192.168.122.1.40928 > 192.168.122.100.ssh: Flags [.], ack 37, win 501, options [nop,nop,TS val 1997449169 ecr 1075244363], length 0 15:55:09.750230 IP 192.168.122.1.40928 > 192.168.122.100.ssh: Flags [P.], seq 36:72, ack 37, win 501, options [nop,nop,TS val 1997449842 ecr 1075244363], length 36 # xdpdump -i eno1 -w - | tshark -r - -n listening on eno1, ingress XDP program xdpfilt_dny_all, capture mode entry, capture size 262144 bytes 1 0.000000 192.168.122.1 → 192.168.122.100 SSH 102 Client: Encrypted packet (len=36) 2 0.000646 192.168.122.1 → 192.168.122.100 TCP 66 40158 → 22 [ACK] Seq=37 Ack=37 Win=1467 Len=0 TSval=1997621571 TSecr=1075416765 3 12.218164 192.168.122.1 → 192.168.122.100 SSH 102 Client: Encrypted packet (len=36)
Analyze capture files with Wireshark
First, let's just capture a bunch of packets through xdpdump:
# xdpdump -i eno1 --rx-capture entry,exit -w mycapture.pcapng listening on eno1, ingress XDP program ID 132 func xdp_dispatcher, capture mode entry/exit, capture size 262144 bytes ^C 186 packets captured 0 packets dropped by perf ring
Secondly, let's see what metadata xdpdump adds to the PcapNG file:
capinfos ~/mycapture.pcapng File name: /home/echaudron/pcapng/mycapture.pcapng File type: Wireshark/... - pcapng File encapsulation: Ethernet File timestamp precision: nanoseconds (9) Packet size limit: file hdr: (not set) Number of packets: 186 File size: 52kB Data size: 38kB Capture duration: 16.842188367 seconds First packet time: 2020-09-25 13:53:26.819854558 Last packet time: 2020-09-25 13:53:43.662042925 Data byte rate: 2,266 bytes/s Data bit rate: 18kbps Average packet size: 205.24 bytes Average packet rate: 11 packets/s SHA256: e4a617b29ad6cb91c8fac39dbaf11410063c2329b43ae596d785479c8478ff22 RIPEMD160: c8ff9b9a6aa7729e722d00da1fe65c7d4e3097a0 SHA1: c13c0a19b1cc3d5a7a967135524ba21fffca3806 Strict time order: True Capture hardware: x86_64 Capture oper-sys: Linux ebuild 5.8.7-200.fc32.x86_64 #1 SMP Mon Sep 7 15:26:10 UTC 2020 Capture application: xdpdump v1.0.1 Number of interfaces in file: 2 Interface #0 info: Name = eno1@fentry Description = eno1:xdp_dispatcher()@fentry Encapsulation = Ethernet (1 - ether) Hardware = driver: "e1000e", version: "3.2.6-k", fw-version: "0.6-4", rom-version: "", bus-info: "0000:00:1f.6" Speed = 1000000000 Capture length = 0 Time precision = nanoseconds (9) Time ticks per second = 1000000000 Time resolution = 0x09 Number of stat entries = 0 Number of packets = 93 Interface #1 info: Name = eno1@fexit Description = eno1:xdp_dispatcher()@fexit Encapsulation = Ethernet (1 - ether) Hardware = driver: "e1000e", version: "3.2.6-k", fw-version: "0.6-4", rom-version: "", bus-info: "0000:00:1f.6" Speed = 1000000000 Capture length = 0 Time precision = nanoseconds (9) Time ticks per second = 1000000000 Time resolution = 0x09 Number of stat entries = 0 Number of packets = 93
As you can see, it will add some general information about the host where it was captured and the xdpdump version. But more of interest are the two interfaces. There is one for packets that arrive on the interface, i.e., before the XDP program is executed (eno1@fentry) and one after the program was executed (eno1@fexit). The description also holds details on the XDP function name, eno1:xdp_dispatcher()@fentry.
Now, open up Wireshark v3.4 or later and explore the packets:
In the picture above, we have selected the second packet, which is an @exit packet. If you take a good look, you can identify all the metadata fields:
- Interface Name, Interface description: the receiving interface (and the comment with the XDP program function).
- Interface queue: the receiving interface's queue.
- Packet id: the unique packet ID.
- Verdict => eBP XDP: the XDP return code.
This is not where the functionality ends. You can use these fields to build packet filters. For example, the picture below shows how to filter packets with unique packet id 40:
- All XDP_REDIRECT or XDP_TX packets: frame.verdict.ebpf_xdp == 4 || frame.verdict.ebpf_xdp == 3
- All packets from a specific interface: frame.interface_name== "eno1@fexit"
- All packets from a specific interface queue: frame.interface_queue == 2
As you can see, xdpdump is a valuable tool, especially when troubleshooting XDP related issues. For example, when packets are not received on the interface, are they dropped by XDP? Or, why does my packet look odd (or is even rejected by the kernel)? Is this because XDP corrupted it?
This blog post is a tour of the current features of xdpdump, but we would like to add more features like native capture filter support in the future. xdpdump does not currently have filtering support found in tcpdump, but this is a feature on the roadmap. Keep checking back on the roadmap for updates on potential features for xdpdump.