It must have been in late 2017 when Red Hat’s VPN servers were rebooted for a kernel upgrade. A few days later I was contacted by someone who knew that I work on Checkpoint/Restore in Userspace (CRIU) and who asked whether CRIU could be used to keep VPN connections from being terminated when the VPN server reboots. Always happy to hear about interesting use cases for CRIU, I thought that this sounded like a great idea to try out and answered that it should be theoretically possible.
For those not familiar with CRIU: its goal is to “freeze” an application or process, save its state to files, and later restore it from those files so that it continues running as it was when frozen. This enables things like migrating containers, taking application snapshots, or avoiding terminated VPN connections!
Over the next few months I thought about this use case from time to time, but never actually tried it out, until now. It took me one afternoon to set everything up, but now it actually works, and this recording of my terminal session shows the result:
Now that we know that it is actually possible to update the kernel on an OpenVPN server without terminating the connections between clients and server, I want to go into the details of what I did to set this up.
My setup consists of two virtual machines running Red Hat Enterprise Linux 7.5 with OpenVPN installed from EPEL. CRIU is available as a tech preview in Red Hat Enterprise Linux 7 as of 7.2. One VM is running locally on my computer (client) and the other VM is running on a system about 10 kilometers away (server). After configuring OpenVPN I started it from the command line with:
$ openvpn --config server.conf
Without much preparation I tried to checkpoint the OpenVPN process using CRIU:
$ criu dump -t `pidof openvpn` -D /tmp/1
This failed pretty fast with CRIU complaining:
Error (criu/tun.c:276): Net namespace is required to dump tun link
It would have been nice if it just worked without any additional setup, but this sounds solvable. So let’s start OpenVPN in a network namespace:
$ unshare -n sh -c 'setsid /usr/sbin/openvpn --config server.conf &> /dev/null < /dev/null'
I am using ‘unshare -n’ to run my command in a new network namespace. ‘setsid’ and the redirection of stdin, stdout and stderr (‘&> /dev/null < /dev/null’) are used to make it easier for CRIU to dump the process.
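To confirm that OpenVPN really ended up in its own network namespace, the namespace links under /proc can be compared. The following is a minimal sketch, not part of the original setup; the helper names `netns_of` and `same_netns` are made up for illustration.

```shell
# Print the network namespace identifier of a process, e.g. "net:[4026531992]".
netns_of() {
    readlink "/proc/$1/ns/net"
}

# Succeed if both PIDs are in the same network namespace.
# Returns 2 if one of the namespaces cannot be read (no such PID, no permission).
same_netns() {
    a=$(netns_of "$1") || return 2
    b=$(netns_of "$2") || return 2
    [ "$a" = "$b" ]
}

# Hypothetical usage, as root, with OpenVPN already started:
#   if same_netns 1 "$(pidof openvpn)"; then
#       echo "openvpn still shares the host network namespace"
#   fi
```

If the check reports that OpenVPN shares the namespace of PID 1, it is in the same situation as the first dump attempt above.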
Once OpenVPN was running in its network namespace, I had to configure that namespace so that the OpenVPN process inside it can be reached via the host system’s network:
$ ip link add veth0 type veth peer name veth1
$ ip link set veth1 netns `pidof openvpn`
$ nsenter -n -t `pidof openvpn` ip addr add 10.22.0.3/24 dev veth1
$ nsenter -n -t `pidof openvpn` ip route add default via 10.22.0.1
$ brctl addbr br0
$ ip addr add 10.22.0.1/24 dev br0
$ ip link set veth0 up
$ brctl addif br0 veth0
$ ip link set up br0
$ echo 1 > /proc/sys/net/ipv4/ip_forward
$ iptables -t nat -I PREROUTING -i eth0 -p udp --dport 1194 -j DNAT --to-destination 10.22.0.3:1194
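The same steps can be collected into one parameterised function, which makes them easier to repeat and to review before running them. This is only a sketch: the `run` helper and the `DRY_RUN` switch are illustrative additions, `sysctl -w` stands in for writing to /proc/sys directly, and the step that brings veth1 up inside the namespace is my assumption and does not appear in the command list above. A single variable holds the namespace-side address so that the address assigned to veth1 and the DNAT target cannot drift apart.

```shell
# Values taken from the commands in the text; adjust for your own setup.
BRIDGE=br0
HOST_VETH=veth0
NS_VETH=veth1
BRIDGE_ADDR=10.22.0.1
NS_ADDR=10.22.0.3
PREFIX=24
UPLINK=eth0
VPN_PORT=1194

# Illustrative helper: with DRY_RUN=1 the command is printed, otherwise executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 is the PID of the OpenVPN process running inside the new namespace.
setup_vpn_netns() {
    pid=$1
    run ip link add "$HOST_VETH" type veth peer name "$NS_VETH"
    run ip link set "$NS_VETH" netns "$pid"
    run nsenter -n -t "$pid" ip addr add "$NS_ADDR/$PREFIX" dev "$NS_VETH"
    # Assumption, not in the text: bring the interface up before adding the route.
    run nsenter -n -t "$pid" ip link set "$NS_VETH" up
    run nsenter -n -t "$pid" ip route add default via "$BRIDGE_ADDR"
    run brctl addbr "$BRIDGE"
    run ip addr add "$BRIDGE_ADDR/$PREFIX" dev "$BRIDGE"
    run ip link set "$HOST_VETH" up
    run brctl addif "$BRIDGE" "$HOST_VETH"
    run ip link set up "$BRIDGE"
    run sysctl -w net.ipv4.ip_forward=1
    run iptables -t nat -I PREROUTING -i "$UPLINK" -p udp --dport "$VPN_PORT" \
        -j DNAT --to-destination "$NS_ADDR:$VPN_PORT"
}

# Hypothetical usage: print the commands first, then run them as root.
#   DRY_RUN=1; setup_vpn_netns "$(pidof openvpn)"
#   DRY_RUN=0; setup_vpn_netns "$(pidof openvpn)"
```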
Now that the OpenVPN process is running in its network namespace and that this network namespace is configured to be reachable from the outside I can start the client:
$ openvpn --config client.conf
At the end of OpenVPN starting up I see the following output:
Tue Oct 16 07:06:29 2018 /sbin/ip addr add dev tun0 local 172.31.0.6 peer 172.31.0.5
Tue Oct 16 07:06:29 2018 /sbin/ip route add 172.31.0.0/24 via 172.31.0.5
Tue Oct 16 07:06:29 2018 Initialization Sequence Completed
And I can ping the OpenVPN server through the VPN tunnel using:
$ ping 172.31.0.1
PING 172.31.0.1 (172.31.0.1) 56(84) bytes of data.
64 bytes from 172.31.0.1: icmp_seq=1 ttl=64 time=15.6 ms
Up to this point I have configured a regular OpenVPN-based VPN tunnel, with the only exception that the server side of the connection is running in a network namespace. Now that all this works I can finally checkpoint and restore my OpenVPN server process.
$ mkdir /tmp/1
$ criu dump -t `pidof openvpn` -D /tmp/1 --ext-mount-map auto --enable-external-sharing --enable-external-masters
The CRIU parameters have the following meaning:
dump - this tells CRIU to dump (checkpoint) the specified process
-t `pidof openvpn` - this gives CRIU the process ID (PID) of the process that should be checkpointed
-D /tmp/1 - the directory CRIU should use to write the checkpoint images to
--ext-mount-map auto --enable-external-sharing --enable-external-masters - all these options help CRIU to correctly handle checkpointing a process in a network namespace. If CRIU is used as part of a container runtime to checkpoint and restore a container, this happens automatically.
CRIU will complain if something did not work, but if the checkpointing was successful it will write the checkpoint image to the specified directory. At this point the OpenVPN process will be gone and the ping command on the client side will stop receiving replies. The client side still thinks that the VPN tunnel is alive but the server process is gone.
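Because the server process is gone after a successful dump, it is worth checking that the images really exist before going any further. The sketch below wraps the dump command from above and then looks for `inventory.img`, which CRIU writes into the image directory; the function names are made up for illustration, and the check is only a rough sanity test, not proof that a restore will succeed.

```shell
# Succeed only if DIR looks like a completed CRIU image directory.
has_checkpoint() {
    [ -d "$1" ] && [ -f "$1/inventory.img" ]
}

# Checkpoint the running OpenVPN server into DIR (run as root).
# Assumes exactly one openvpn process, as in the setup described in the text.
checkpoint_openvpn() {
    dir=$1
    pid=$(pidof openvpn) || { echo "openvpn is not running" >&2; return 1; }
    mkdir -p "$dir" || return 1
    criu dump -t "$pid" -D "$dir" \
        --ext-mount-map auto --enable-external-sharing --enable-external-masters \
        || return 1
    has_checkpoint "$dir"
}

# Hypothetical usage:
#   checkpoint_openvpn /tmp/1 || echo "checkpoint failed, do not reboot" >&2
```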
The next step is to restore the checkpointed OpenVPN process:
$ criu restore -D /tmp/1 --external veth[veth1]:veth0@br0 -d
To restore the OpenVPN process, the following parameters are used:
restore - this tells CRIU to restore a process
-D /tmp/1 - the checkpoint image of the process to restore can be found in the directory /tmp/1
--external veth[veth1]:veth0@br0 - this option tells CRIU how to wire the network interface in the network namespace to the host’s network setup. This is basically the same as the manual setup of the network namespace when the VPN server process was initially started.
-d - this tells CRIU to run the restored process in the background
If CRIU succeeds in restoring the process, the OpenVPN server process will continue to run and the ping command on the client side should receive replies again.
Now the basic functionality, checkpointing and restoring the OpenVPN server process, is working. All that is left to do is to combine the checkpointing and restoring with a reboot. I am using kexec to reboot as it should be faster, which should reduce the downtime of the OpenVPN server process.
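The value passed to --external packs three names into one string: the interface inside the namespace, its peer on the host, and the bridge the peer is attached to. A tiny helper makes that structure explicit; `veth_external` is a made-up name, and the sketch only covers the veth form used in this article.

```shell
# Build the --external argument for a veth pair.
#   $1: interface name inside the namespace (veth1 in the text)
#   $2: peer interface name on the host     (veth0 in the text)
#   $3: host bridge the peer is attached to (br0 in the text)
veth_external() {
    printf 'veth[%s]:%s@%s\n' "$1" "$2" "$3"
}

# Hypothetical usage, as root:
#   criu restore -D /tmp/1 --external "$(veth_external veth1 veth0 br0)" -d
```

Quoting the argument is a small precaution, since the square brackets are glob characters to the shell and could be expanded if a matching file name exists in the current directory.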
This tells the kernel which kernel image and which ramdisk it should use for the kexec reboot:
$ kexec -l /boot/vmlinuz-3.10.0-862.14.4.el7.x86_64 --initrd=/boot/initramfs-3.10.0-862.14.4.el7.x86_64.img --command-line="root=UUID=8c1540fa-e2b4-407d-bcd1-59848a73e463 ro console=tty0 console=ttyS0,115200n8 crashkernel=auto console=ttyS0,115200 LANG=en_US.UTF-8"
The version of the kernel and the ramdisk (initramfs) depend on which kernel is installed and the value for --command-line= is copied from
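Since the kernel and initramfs paths follow the same pattern for every installed version, the kexec load step can be sketched as a function that takes the version and the kernel command line as arguments. The function name and the `BOOT_DIR` and `KEXEC` overrides are hypothetical and exist only to make the sketch easy to try out without loading a kernel; the command line is supplied by the caller.

```shell
# Load a kernel for a later kexec reboot.
#   $1: kernel version, e.g. 3.10.0-862.14.4.el7.x86_64
#   $2: kernel command line to boot with
# BOOT_DIR and KEXEC are illustrative overrides (defaults: /boot and kexec).
kexec_load() {
    version=$1
    cmdline=$2
    boot=${BOOT_DIR:-/boot}
    kernel=$boot/vmlinuz-$version
    initrd=$boot/initramfs-$version.img
    # Refuse to continue if either file is missing.
    [ -f "$kernel" ] || { echo "missing $kernel" >&2; return 1; }
    [ -f "$initrd" ] || { echo "missing $initrd" >&2; return 1; }
    ${KEXEC:-kexec} -l "$kernel" --initrd="$initrd" --command-line="$cmdline"
}

# Hypothetical usage, as root:
#   kexec_load 3.10.0-862.14.4.el7.x86_64 "root=UUID=... ro"
# Preview without loading anything:
#   KEXEC=echo kexec_load 3.10.0-862.14.4.el7.x86_64 "root=UUID=... ro"
```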
With all steps set up, the combination of everything, as seen in the recording above, is:
systemctl kexec - to reboot into the kexec kernel
criu restore - included in
With this proof of concept I am able to reboot the OpenVPN server into a new kernel with a downtime of fewer than 10 seconds and without losing the connection between client and server.
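To get a rough number for the downtime from the client side, the tunnel address can be probed in a loop while the server reboots. The following sketch counts the longest run of consecutive failed probes; the function names and the `INTERVAL` variable are made up for illustration. It reports failed probes, not exact seconds, because each failed probe also waits for the ping timeout.

```shell
# Succeed if TARGET answers a single ping within one second.
probe() {
    ping -c 1 -W 1 "$1" > /dev/null 2>&1
}

# Probe TARGET COUNT times and print the longest run of consecutive failures.
#   $1: address to probe, e.g. 172.31.0.1
#   $2: number of probes
# INTERVAL (illustrative) is the pause between probes in seconds, default 1.
longest_outage() {
    target=$1
    count=$2
    current=0
    longest=0
    i=0
    while [ "$i" -lt "$count" ]; do
        if probe "$target"; then
            current=0
        else
            current=$((current + 1))
            if [ "$current" -gt "$longest" ]; then
                longest=$current
            fi
        fi
        i=$((i + 1))
        sleep "${INTERVAL:-1}"
    done
    echo "$longest"
}

# Hypothetical usage on the client while the server is rebooting:
#   longest_outage 172.31.0.1 60
```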
So my initial assumption that it should be theoretically possible turned out to be correct, and with some manual configuration of a network namespace it was actually easy to get it running.