Dealing with user namespaces and SELinux on rootless containers

June 25, 2020 | Dan Walsh | 3-minute read

In this scenario, the user wants to run a MariaDB database container out of their home directory, and they want to mount a volume from their home directory into the container. Let's discover how to manage security when mounting volumes in rootless containers.

Managing SELinux

I have talked several times about how SELinux is an excellent way to confine containers and how simple it is to work with when running a container. The container engine, Podman, launches each container with a unique process SELinux label (usually container_t) and labels all of the container content with a single label (usually container_file_t). We have rules that state that container_t can read and write all content labeled container_file_t. This simple idea has blocked major file system exploits.

Everything works perfectly until the user attempts a volume mount. The problem with volumes is that they are usually just bind mounts from the host. They bring in the labels from the host, which the SELinux policy does not allow the process label to interact with, and the container blows up. This is not a bug; it is a feature. Even if users explicitly mount volumes, SELinux will, by default, prevent any access, following the "security should never be opt-in" philosophy.

On the first attempt, if the user tries the following command:

$ podman run --rm -v $HOME/mysql-data:/var/lib/mysql/data -e MYSQL_USER=user -e MYSQL_PASSWORD=pass -e MYSQL_DATABASE=db -p 3306:3306 mariadb/server
Permission denied ...

It blows up with permission denied. The user reads the man page, and figures out the problem is SELinux. The user sees that they can add a :Z option to the volume mount, which tells Podman to relabel the volume's content to match the label inside the container. And the SELinux problem is solved.

$ podman run --rm -v $HOME/mysql-data:/var/lib/mysql/data:Z -e MYSQL_USER=user -e MYSQL_PASSWORD=pass -e MYSQL_DATABASE=db -p 3306:3306 mariadb/server
Permission denied …

Oops, sad trombone sound - SELinux is fixed, but now the user hits another issue.
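To confirm that the :Z relabel took effect, it can help to look at the SELinux type on the volume directory before and after the run. The following is a minimal sketch, assuming a recent GNU ls whose -Zd output is the context followed by the path. The helper name selinux_type is hypothetical, and the sample context string is illustrative rather than captured output. Before the relabel, a directory under $HOME typically carries a home-directory type such as user_home_t; afterward it should carry container_file_t.

```shell
# Hypothetical helper: pull the type field (third colon-separated field)
# out of an SELinux context of the form user:role:type:level.
selinux_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Illustrative context, not captured from a real system.
sample_context='system_u:object_r:container_file_t:s0:c123,c456'
selinux_type "$sample_context"

# Inspect the real volume directory if it exists on this host.
data_dir="$HOME/mysql-data"
if [ -d "$data_dir" ]; then
    # ls -Zd prints "<context> <path>"; keep only the context.
    context=$(ls -Zd "$data_dir" | awk '{ print $1 }')
    echo "type on $data_dir: $(selinux_type "$context")"
fi
```

If the type is still a home-directory type after the run, the :Z option was probably left off the volume argument.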

User namespace

This time, the problem is that the $HOME/mysql-data directory is owned by the user. In a previous blog, I talked about how --user works in rootless containers. I explained that the root user of a rootless container, by default, maps to the user's UID. That means files owned by the user on the host appear to be owned by root inside of the container. The issue here is that MariaDB needs to own the database directory, and it does not run as root inside of the container. Instead, it runs as the mysql user.

$ podman run -ti mariadb/server grep mysql /etc/passwd
mysql:x:999:999::/home/mysql:/bin/sh

After a little detective work, the user figures out that the MariaDB server runs as UID 999. Therefore, the user needs to chown the mysql-data directory to 999:999, so that MariaDB inside of the container can read/write the database.

Now, the user could attempt the following fix:

chown 999:999 -R $HOME/mysql-data

But the user is going to get permission denied. Furthermore, this is the wrong UID:GID pair. Remember that the UID:GID pair is relative to the user namespace that the user is going to run the container with. Now we have a big math problem. We must look at the user namespace the user is going to run the container with, take the beginning UID of its range, add 999, and subtract 1. And hope we got it right.

So, the user could try this:

sudo chown CONTAINER999:CONTAINER999 -R $HOME/mysql-data
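The CONTAINER999 placeholder stands for the host ID that container UID 999 maps to. Here is a minimal sketch of that arithmetic, assuming the default rootless mapping: container ID 0 is the user's own ID, and container IDs 1 and up come from a single range listed by user name in /etc/subuid. The helper names subid_start and host_id_for are hypothetical, and the range start of 100000 is an example value, not one taken from this article. The GID works the same way with /etc/subgid.

```shell
# Hypothetical helper: print the start of the first subordinate ID range
# for a user from a file in /etc/subuid format (name:start:count).
subid_start() {
    awk -F: -v user="$1" '$1 == user { print $2; exit }' "$2"
}

# Hypothetical helper: translate a container ID to a host ID.
#   $1 = ID inside the container
#   $2 = the user's own ID on the host
#   $3 = start of the user's subordinate range
host_id_for() {
    if [ "$1" -eq 0 ]; then
        echo "$2"                 # container root is the user itself
    else
        echo $(( $3 + $1 - 1 ))   # range start + container ID - 1
    fi
}

# Worked example with an assumed range start of 100000: prints 100998.
host_id_for 999 1000 100000

# With real data, when the current user has an entry in /etc/subuid.
if [ -r /etc/subuid ]; then
    start=$(subid_start "$(id -un)" /etc/subuid)
    if [ -n "$start" ]; then
        echo "container UID 999 -> host UID $(host_id_for 999 "$(id -u)" "$start")"
    fi
fi
```

Getting this wrong by one is easy, which is part of why doing the math by hand is fragile.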

An easier way to handle this situation would be to use podman unshare. The unshare command is a cool command that joins the user namespace without running any containers.

For example, the user could enter:

podman unshare chown 999:999 -R $HOME/mysql-data
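To check the result of that chown, the user can compare the ownership seen from the host with the ownership seen from inside the user namespace. This is a sketch that assumes GNU stat; owner_ids is a hypothetical helper name.

```shell
# Hypothetical helper: print the numeric "UID:GID" owner of a path (GNU stat).
owner_ids() {
    stat -c '%u:%g' -- "$1"
}

data_dir="$HOME/mysql-data"
if [ -d "$data_dir" ] && command -v podman >/dev/null 2>&1; then
    # Host view: IDs taken from the user's subordinate ranges.
    echo "host view:      $(owner_ids "$data_dir")"
    # Namespace view: the same directory as the container will see it;
    # after the chown above this is expected to be 999:999.
    echo "namespace view: $(podman unshare stat -c '%u:%g' -- "$data_dir")"
fi
```

The host view should show a large ID from the user's subordinate range, while the namespace view shows the IDs that MariaDB will see.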

Now the user is ready to run the rootless container with the following command:

$ podman run --rm -v $HOME/mysql-data:/var/lib/mysql/data:Z -e MYSQL_USER=user -e MYSQL_PASSWORD=pass -e MYSQL_DATABASE=db -p 3306:3306 mariadb/server

And Eureka! It works.

Conclusion

Running containers in a rootless environment is very secure, and most containers will work out of the box. But when you start adding volumes with --volume, you can have issues with some of the security mechanisms protecting your host from the container. Understanding what is going on will save you a lot of time and aggravation.


About the author

Dan Walsh

Senior Distinguished Engineer

Daniel Walsh has worked in the computer security field for over 30 years. Dan is a Senior Distinguished Engineer at Red Hat. He joined Red Hat in August 2001. Dan has led the Red Hat Container Engineering team since August 2013, but has been working on container technology for several years.

Dan helped develop sVirt (Secure Virtualization) as well as the SELinux Sandbox back in RHEL6, an early desktop container tool. Previously, Dan worked at Netect/Bindview on Vulnerability Assessment Products and at Digital Equipment Corporation on the Athena Project and AltaVista Firewall/Tunnel (VPN) Products. Dan has a BA in Mathematics from the College of the Holy Cross and an MS in Computer Science from Worcester Polytechnic Institute.
