Do rootless containers sound interesting? What exactly are rootless containers? It sounds so mysterious, right? Rootless containers are Linux containers that run as non-root, unprivileged users. In this post, we’ll go into what rootless containers are, as well as how you can test them on Red Hat Enterprise Linux (RHEL) 7.6.

But let’s ask the question another way: why should you need root to run containers? The whole point of a container is to limit a process to only the capabilities it needs, right? And that’s exactly why rootless containers are so interesting.

Why are they called rootless containers? Well, sometimes a name just sticks, and that’s what happened with rootless containers.

The Problem

You might be saying to yourself, “but I have rootless containers right now with Docker - I run my docker commands as a regular user and it works all the time.”  Even though you are executing the docker command line tool without root, the docker daemon is executing those requests as root on your behalf, like this:

Docker Client (TCP/Unix Socket) -> Docker Daemon (Parent/Child Processes) -> Container

When your client connects to the daemon, you literally have root access on the system.

If a user breaks something by mistake, or worse, on purpose, it’s almost impossible to figure out who did it (or when). That’s not rootless.

To demonstrate the problem, try the following example. First make a note of the numeric user ID (UID) for your regular login (I’ll explain why later):

[fatherlinux@rhel7 ~]$ cat /proc/self/loginuid
1000

Now let’s run a container. Containers are supposed to be isolated, so this should be safe, right? Notice that we are mounting the root file system / as /host in the container, then setting up a chroot into it:

[fatherlinux@rhel7 ~]$ docker run -ti --privileged -v /:/host fedora chroot /host /bin/sh

Now let’s try some commands that would require root access:

sh-4.2# touch /etc/passwd
sh-4.2# cat /etc/shadow

Those commands succeeded, and if you were paying attention, you might notice that you are touching and reading the host’s files, not files in an isolated container. By creating this container, the fatherlinux user has just become root on the host system (cue evil laugh).

OK, let’s see if auditing could at least track what this user is doing in this container.  On Red Hat systems (and most Linux systems), /proc/self/loginuid is recorded when you first log into a system, and can’t be changed no matter what you do. The idea is that even when you run sudo or su commands, your login UID can be tracked to prove who ran commands as root.  This enables system auditing to track who is doing something, even when they’ve switched user IDs.

Let’s check whether the operations in the container will still be logged under the same numeric UID. Run the following command, still in the container:

sh-4.2# cat /proc/self/loginuid 
4294967295

That’s not the same numeric user ID you noted earlier. (The astute reader might notice that this isn’t an arbitrary number: it’s 2^32 - 1, or 0xFFFFFFFF. Interpreted as a signed 32-bit value, that’s -1, which means the login UID was never set.)
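
The arithmetic behind that number can be checked in the shell. This is a minimal sketch, assuming a shell with 64-bit arithmetic (the norm on 64-bit Linux); to_signed32 is a helper name made up for this illustration and is not part of any tool:

```shell
#!/bin/sh
# Reinterpret an unsigned 32-bit value as a signed 32-bit value.
# to_signed32 is a hypothetical helper used only for this illustration.
to_signed32() {
    value=$1
    if [ "$value" -ge $((1 << 31)) ]; then
        # Top bit set: subtract 2^32 to get the negative signed value.
        echo $((value - (1 << 32)))
    else
        echo "$value"
    fi
}

loginuid=4294967295            # the value read inside the container

echo $(( (1 << 32) - 1 ))      # 2^32 - 1
printf '0x%X\n' "$loginuid"    # the same value in hexadecimal
to_signed32 "$loginuid"        # the same bits read as a signed integer
```

On such a shell, the three commands should print 4294967295, 0xFFFFFFFF, and -1.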

So how did this happen if the loginuid can’t be changed?

When you fire up a container with the docker client, you are talking to an already running server, which fires off some subprocess which eventually fires up your container. When processes are called with a parent->child mechanism, the loginuid is preserved. On the other hand, when a client talks to a server, over a Unix or TCP socket (docker client talks to the daemon), the parent->child connection to the user who ran the command does not exist. This prevents the loginuid from working correctly.

Your containerized processes inherit the login UID of the docker server, not that of your user. The docker server is running as root and was most likely started by systemd.
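
The parent->child half of this can be observed without Docker. The sketch below assumes a Linux system that exposes /proc/self/loginuid (it prints "unavailable" otherwise); read_loginuid is a helper name invented for this example:

```shell
#!/bin/sh
# Read the login UID of the calling process; print "unavailable" when the
# kernel does not expose /proc/self/loginuid (for example, auditing disabled).
# read_loginuid is a hypothetical helper used only for this illustration.
read_loginuid() {
    cat /proc/self/loginuid 2>/dev/null || echo "unavailable"
}

parent_uid=$(read_loginuid)

# A child started from this shell inherits the parent's login UID.
child_uid=$(sh -c 'cat /proc/self/loginuid 2>/dev/null || echo "unavailable"')

echo "parent shell: $parent_uid"
echo "child shell:  $child_uid"
```

Both lines should show the same value, because the child descends from your login session. A process started by a daemon descends from the daemon instead, so it would report the daemon's value.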

So, what does all of this mean? It means your docker users have root, and nobody can track them. That’s bad.

The Solution

The solution to this problem is to use a container engine that truly runs as your user, not as root. Coincidentally, Podman has a cool new feature to let you do this. You can test it upstream in Fedora, but it’s coming to RHEL too. As of RHEL 7.6, you can test it as Developer Preview with plans for it to be fully supported in RHEL 7.7.

First, make sure you’ve got an up-to-date installation of RHEL 7.6 Server that has been registered so packages can be downloaded from Red Hat. Note: You can download RHEL 7.6 server and get a no-cost subscription through the Red Hat Developer program, if you don’t already have a subscription through your organization.

Install Container Tools

If you can’t find podman, make sure you enable the Extras repo first:

subscription-manager repos --enable=rhel-7-server-extras-rpms

Now, install Podman (and Buildah, and Skopeo while we are at it):

yum install -y podman skopeo buildah

Test Podman as Root

The first step is to do some simple testing:

podman pull rhel7
podman run -it rhel7 bash
cat /etc/redhat-release
uname -a

OK now, if that looks good, let’s get crazy…

Rootless Podman

Running regular containers with Podman is cool, but let’s go rootless. First, as root, let’s do some hacking. Just a warning, we are entering unsupported territory, so your mileage may vary. Do not run these commands on a production system.

We will install a set of development packages from the Copr build service which we use in the Fedora community (see them here). These packages were built by an engineer on our team named Vincent Batts. I asked Vincent to build these packages for this preview with RHEL 7.6.

Now, let’s use these packages and make a few modifications on a development system. First install the development packages:

curl -o /etc/yum.repos.d/rhel7.6-rootless-preview.repo https://copr.fedorainfracloud.org/coprs/vbatts/shadow-utils-newxidmap/repo/epel-7/vbatts-shadow-utils-newxidmap-epel-7.repo

yum install -y shadow-utils46-newxidmap slirp4netns 

Enable user namespaces by raising the limit on how many can be created. User namespaces are what map root in the container to a regular user outside the container:

echo 10000 > /proc/sys/user/max_user_namespaces

Add a new user, and set the password:

useradd fatherlinux

passwd fatherlinux

Manually add some entries in /etc/subuid and /etc/subgid (the useradd command provided in shadow-utils 4.6 will handle this at GA in RHEL 7.7). These are the entries that give a regular user a range of UIDs and GIDs to use in containers:

echo "fatherlinux:100000:65536" >> /etc/subuid
echo "fatherlinux:100000:65536" >> /etc/subgid
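
Each entry has the form name:start:count. As a rough sketch of what the entry above grants, the following computes the host UID range and shows how container UIDs would land on host UIDs. It assumes the usual rootless layout, in which container UID 0 maps to the user's own UID and container UIDs from 1 upward map onto the subordinate range. The helper names and the own UID of 1000 are illustrative assumptions, not taken from any tool:

```shell
#!/bin/sh
# Print the first and last host UID covered by a "name:start:count" entry.
# subuid_range is a hypothetical helper used only for this illustration.
subuid_range() {
    entry=$1
    start=$(echo "$entry" | cut -d: -f2)
    count=$(echo "$entry" | cut -d: -f3)
    echo "$start $((start + count - 1))"
}

# Map a container UID to a host UID, ASSUMING the common rootless layout:
# container UID 0 -> the user's own UID, container UIDs 1.. -> subuid range.
container_to_host_uid() {
    entry=$1
    own_uid=$2
    container_uid=$3
    start=$(echo "$entry" | cut -d: -f2)
    if [ "$container_uid" -eq 0 ]; then
        echo "$own_uid"
    else
        echo $((start + container_uid - 1))
    fi
}

entry="fatherlinux:100000:65536"
subuid_range "$entry"                    # first and last host UID in the range
container_to_host_uid "$entry" 1000 0    # root in the container
container_to_host_uid "$entry" 1000 1    # first non-root container UID
```

Under those assumptions the range is 100000 through 165535, root in the container is UID 1000 on the host, and container UID 1 is host UID 100000. In other words, nothing in the container runs with a host UID that carries more privilege than the user already has.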

With the above steps completed, you will be able to run containers as this new user. As of today, you have to SSH in to get all of the right environment variables (su - fatherlinux won’t work):

ssh fatherlinux@10.10.181.178

Now, pull an image:

podman pull rhel7

Since you are performing these operations as a regular user, podman can’t write to the system-wide image cache (/var/lib/containers). Container images are stored in your home directory instead:

~/.local/share/containers/storage/

Verify that the image was pulled locally:

podman images

Finally, let’s run a container. Fingers crossed:

podman run -it rhel7 bash
cat /etc/redhat-release

Output:

Red Hat Enterprise Linux Server release 7.6 (Maipo)

You just ran a container as a regular user, congratulations!

Conclusion

Rootless containers take advantage of the user namespace support in RHEL to let users run containers without any additional privileges, all while preserving auditing on your systems. This improves both the security and the manageability of containers in RHEL. You can test rootless containers today in RHEL 7.6 and 8.0 Beta, depending on your needs.

The work we are doing in Podman and the User Namespace separated containers is also the foundation for the work we are doing on CRI-O in OpenShift 4.X.  You have to admit, that’s kinda cool. Stay tuned for more to come with Podman, Buildah, Skopeo, CRI-O, and crictl. There is a ton of work going on in this space.


About the author

At Red Hat, Scott McCarty is Senior Principal Product Manager for RHEL Server, arguably the largest open source software business in the world. Focus areas include cloud, containers, workload expansion, and automation. Working closely with customers, partners, engineering teams, sales, marketing, other product teams, and even in the community, he combines personal experience with customer and partner feedback to enhance and tailor strategic capabilities in Red Hat Enterprise Linux.

McCarty is a social media start-up veteran, an e-commerce old timer, and a weathered government research technologist, with experience across a variety of companies and organizations, from seven person startups to 20,000 employee technology companies. This has culminated in a unique perspective on open source software development, delivery, and maintenance.
