Podman checkpoint / restore and SQL Server 2019

September 2, 2020 | Adrian Reber | 4-minute read

One of Podman’s features is the ability to checkpoint and restore running containers. This means that Podman can "pause" a running container indefinitely and then restart it from where it left off, on the same system or on another one. In this post, we'll look at how you can do this with a container running Microsoft's SQL Server.

Podman uses CRIU (Checkpoint/Restore In Userspace) to do the actual checkpointing and restoring of the processes inside the container. This time I am using Podman 1.6.4 on Red Hat Enterprise Linux 8.2 with Microsoft’s SQL Server 2019. The documentation about how to set up SQL Server 2019 using Podman is on Microsoft's site.

To start a container with SQL Server 2019 I am using this command on my RHEL 8 system:

# podman run -e 'ACCEPT_EULA=Y' -e 'MSSQL_SA_PASSWORD=Pass0..0worD' --cap-add cap_net_bind_service -p 1433:1433 -d mcr.microsoft.com/mssql/rhel/server:2019-CU1-rhel-8

If you want your data to persist even after running podman rm, you need to mount a host directory as a data volume into your container, as described in the documentation:

-v <host directory>:/var/opt/mssql
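Combining the two commands above, a minimal sketch of starting the container with a persistent data volume could look like the following. The function name start_mssql_with_volume, the PODMAN override variable, and the example path /srv/mssql-data are assumptions made for illustration; everything else is taken from the podman run command shown earlier.

```shell
#!/bin/sh
# PODMAN can be overridden (for example PODMAN=echo) to preview the
# command instead of running it. This variable is an illustrative helper.
PODMAN="${PODMAN:-podman}"

# Start SQL Server 2019 with a host directory mounted as the data volume.
# $1: host directory to mount (hypothetical example: /srv/mssql-data)
start_mssql_with_volume() {
    data_dir="$1"
    "$PODMAN" run \
        -e 'ACCEPT_EULA=Y' \
        -e 'MSSQL_SA_PASSWORD=Pass0..0worD' \
        --cap-add cap_net_bind_service \
        -p 1433:1433 \
        -v "${data_dir}:/var/opt/mssql" \
        -d mcr.microsoft.com/mssql/rhel/server:2019-CU1-rhel-8
}
```

Calling start_mssql_with_volume /srv/mssql-data would then start the container with that directory mounted at /var/opt/mssql.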

To access the database running in the container, I install the sqlcmd tool by fetching Microsoft's repository file with curl and then installing the packages with yum:

# curl https://packages.microsoft.com/config/rhel/7/prod.repo > /etc/yum.repos.d/msprod.repo
# yum -y install mssql-tools unixODBC-devel

Once the container is running, I can connect to the server process using the newly installed sqlcmd:

# /opt/mssql-tools/bin/sqlcmd -S 127.0.0.1 -U SA -P Pass0..0worD
1> SELECT Name from sys.Databases
2> go
Name                                                                                                                         
----------------------------------------------------------------
master 
tempdb
model 
msdb 

(4 rows affected)
1>

To add some data to the database I created a few small SQL scripts:

# cat create-table.sql
CREATE TABLE things (id INT, name NVARCHAR(50))
GO
# for i in `seq 1 10000`; do echo "insert into things values($i, 'thing $i')"; done > insert.sql ; echo go >> insert.sql

Running those two SQL scripts (create-table.sql, insert.sql) enables me to query the database:

# /opt/mssql-tools/bin/sqlcmd -S 127.0.0.1 -U SA -P Pass0..0worD -i create-table.sql
# /opt/mssql-tools/bin/sqlcmd -S 127.0.0.1 -U SA -P Pass0..0worD -i insert.sql
# /opt/mssql-tools/bin/sqlcmd -S 127.0.0.1 -U SA -P Pass0..0worD
1> select count(*) from things
2> go
        
-----------
   10000

(1 rows affected)
1>

At this point SQL Server 2019 is running and has data to answer our queries. Now I would like to reboot the system to install a new kernel, and this is where Podman’s checkpoint/restore feature helps. First I checkpoint the container, then I reboot the system, and once the system is back up I restore the container from the checkpoint.

# podman container checkpoint -l --tcp-established
# reboot

Without telling Podman (and thus CRIU) to keep established TCP connections intact while checkpointing the container (--tcp-established), the checkpoint will fail.

The container has now been checkpointed with all its data. Once the system has been rebooted, I can restore the container and query the database again:

# podman container restore -l --tcp-established
# /opt/mssql-tools/bin/sqlcmd -S 127.0.0.1 -U SA -P Pass0..0worD
1> select count(*) from things
2> go
        
-----------
   10000

(1 rows affected)
1>

Restoring the container from the checkpoint after the reboot gives me back the database in the same state it was in before doing the checkpoint. All my data is still there even after having rebooted my system, which means I can reboot into a new kernel without losing the state of my database.
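The manual check above can also be scripted. The following is a minimal sketch, assuming the things table and the credentials used earlier in this post; the function names things_count and verify_restore and the SQLCMD override variable are made up for illustration. It uses sqlcmd's -Q option to run a single query and exit, -h -1 to suppress column headers, and -W to strip trailing spaces.

```shell
#!/bin/sh
# Path to sqlcmd as installed earlier; can be overridden for testing.
SQLCMD="${SQLCMD:-/opt/mssql-tools/bin/sqlcmd}"

# Print the number of rows in the things table as a bare integer.
things_count() {
    "$SQLCMD" -S 127.0.0.1 -U SA -P 'Pass0..0worD' -h -1 -W \
        -Q 'SET NOCOUNT ON; SELECT COUNT(*) FROM things' | tr -d '[:space:]'
}

# Compare the current row count with the value recorded before the checkpoint.
# $1: expected row count
verify_restore() {
    expected="$1"
    actual="$(things_count)"
    if [ "$actual" = "$expected" ]; then
        echo "OK: things has $actual rows"
    else
        echo "MISMATCH: expected $expected rows, got $actual" >&2
        return 1
    fi
}
```

One way to use it is to save the output of things_count to a file before the checkpoint and to pass that saved value to verify_restore after the restore.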

To make this work I had to actually change one of CRIU’s configuration files to handle open but deleted files correctly:

# cat /etc/criu/runc.conf
ghost-limit 40000000

In its default configuration, CRIU enforces a size limit on files of this type and stops with an error message when a file exceeds it, which is why this configuration change is necessary.

The value for ghost-limit does not depend on the database size. To identify the right value for ghost-limit, I ran the command podman container checkpoint -l, which led to the following error message:

# podman container checkpoint -l
ERRO[0000] container is not destroyed                
ERRO[0000] criu failed: type NOTIFY errno 0
log file: /var/lib/containers/storage/overlay-containers/<ID>/userdata/dump.log

Looking at the log file dump.log I saw:

Error (criu/files-reg.c:899): Can't dump ghost file /var/opt/mssql/.system/profiles/Temp/82ff05c0df15b61e96c09a878c06ed07 of 20975616 size, increase limit

Increasing the ghost-limit to 40000000 as mentioned above resolves this error message.
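To avoid reading dump.log by hand, the reported size can be extracted with a small helper. This is a sketch that assumes the error line has exactly the format shown above; the function name largest_ghost_size is hypothetical.

```shell
#!/bin/sh
# Print the largest ghost file size (in bytes) reported in a CRIU dump log.
# $1: path to dump.log
largest_ghost_size() {
    # Pull the number between " of " and " size" out of each matching line,
    # then keep the largest one in case several lines match.
    sed -n "s/.*Can't dump ghost file .* of \([0-9][0-9]*\) size.*/\1/p" "$1" \
        | sort -n | tail -n 1
}
```

For the log line shown above this prints 20975616, so a ghost-limit comfortably above that number, such as the 40000000 used in this post, is sufficient.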

Podman also offers the possibility to take multiple checkpoints of a container. Using the option --export Podman can be told to create external checkpoints:

# podman container checkpoint -l --leave-running --tcp-established --export=/tmp/checkpoint1.tar.gz

This tells Podman to create a checkpoint and store all the information about this checkpoint in the file /tmp/checkpoint1.tar.gz while the container keeps on running (--leave-running). By using a different file name with the --export option each time, I can create as many checkpoints as necessary at different points in time. An exported checkpoint can then be transferred to another system, and I can migrate the running database from one system to another using the podman container restore command like this:

# podman container restore --import=/tmp/checkpoint1.tar.gz --tcp-established

If I want to migrate the container to another system, and if the data is made to persist using a host directory as data volume (-v <host directory>:/var/opt/mssql), it is also necessary to transfer that host directory to the destination host of the container migration. If no host directory is mounted as a data volume the checkpoint archive created using the --export option contains all relevant information and data to migrate the container from one host to another.
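The migration steps described above could be sketched as follows. This is only an illustration under several assumptions: SSH access to the destination, the same paths on both hosts, rsync installed on both sides, Podman and CRIU available on the destination, and a container that is no longer writing to the data directory while it is copied. The host name root@host2, the path /srv/mssql-data, the DRY_RUN variable, and the function names are all hypothetical.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN is set to a non-empty value.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Copy a checkpoint archive (and optionally the data directory) to another
# host and restore the container there.
# $1: destination host (hypothetical example: root@host2)
# $2: checkpoint archive created with --export
# $3: optional host directory that was mounted at /var/opt/mssql
migrate_container() {
    dest="$1"
    archive="$2"
    data_dir="${3:-}"
    run scp "$archive" "${dest}:${archive}" || return 1
    if [ -n "$data_dir" ]; then
        run rsync -a "${data_dir}/" "${dest}:${data_dir}/" || return 1
    fi
    run ssh "$dest" podman container restore "--import=${archive}" --tcp-established
}
```

Setting DRY_RUN=1 before calling migrate_container root@host2 /tmp/checkpoint1.tar.gz /srv/mssql-data prints the three commands without executing them, which makes it easy to review them first.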

Podman’s checkpoint and restore features enable me to reboot my system without losing the state of my running database. One advantage of this technology is that data already cached in memory stays cached, so queries can be answered just as fast as before the reboot. It also lets me move my database to another host without losing its state.