In a recent GitHub issue on libpod, a Podman user suggested that rootless containers eliminate the need for the --user option when running containers. They assumed that the --user option had originated in Docker as a way to run a container as a different user and that, since rootless Podman already runs without root privileges, the option was no longer needed.
They were mistaken.
Investigating UIDs
Rootless and rootful Podman each support running with multiple users. Both, by default, run the initial process as the root of the user namespace they are launched in. When running rootless containers, Podman launches the first process as the root of the user namespace you are using. In a previous blog post, Understanding root inside and outside of the container, I dug deeper into what is happening here.
If you looked at the process from outside of the container, you would see that it is running as your UID.
$ podman run fedora cat /proc/self/uid_map
0 3267 1
1 100000 65536
My UID is 3267, and you can see that the user namespace maps UID 0 in the container to 3267 on the host for a range of one UID. The second line maps container UIDs 1 through 65536 to host UIDs starting at 100000. Another way to examine this is to use podman top to display the user inside of the container and the host user.
$ podman run -d fedora sleep 100
41ad82a732526673299d6785105e1b4a0ef4397ed7ceb8b13b9218e2f3a77003
$ podman top -l user huser
USER HUSER
root 3267
If you want to run with a different user within the container, then use -u to select the user. Even in rootless mode, root inside the container is more powerful than a non-root user inside the container, so it is still advisable to run as non-root in a rootless container.
Even in rootless containers, the root of the container has user namespace capabilities. These capabilities are a subset of the power of root, and they apply only within the user namespace.
$ podman top -l capeff
EFFECTIVE CAPS
AUDIT_WRITE,CHOWN,DAC_OVERRIDE,FOWNER,FSETID,KILL,MKNOD,NET_BIND_SERVICE,NET_RAW,SETFCAP,SETGID,SETPCAP,SETUID,SYS_CHROOT
As you can see, the rootless process has a bunch of capabilities. You can even run --privileged and get all of the capabilities. But these capabilities are not the same as the capabilities you get as root; they are user namespace capabilities. They have full control over the namespaces mapped into the container, but no power on other parts of the operating system.
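Outside of podman top, the kernel reports a process's effective capabilities as a hexadecimal bitmask on the CapEff line of /proc/<pid>/status. The sketch below tests a single bit of such a mask. The helper name has_cap is hypothetical, and the mask a80425fb is a hand encoding of the fourteen capabilities listed above using the bit numbers from linux/capability.h; it is not output captured from the container in this article.

```shell
# has_cap MASK BIT
# Succeeds when bit number BIT is set in the hexadecimal capability MASK
# (the format used by the CapEff line in /proc/<pid>/status).
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Bit numbers from linux/capability.h.
CAP_SETUID=7
CAP_SYS_ADMIN=21

# Assumed mask: the 14 capabilities listed above, encoded by hand.
mask=a80425fb

if has_cap "$mask" "$CAP_SETUID"; then
    echo "CAP_SETUID: yes"
else
    echo "CAP_SETUID: no"
fi

if has_cap "$mask" "$CAP_SYS_ADMIN"; then
    echo "CAP_SYS_ADMIN: yes"
else
    echo "CAP_SYS_ADMIN: no"
fi
```

With this mask the script reports CAP_SETUID as present and CAP_SYS_ADMIN as absent. A set bit only says the capability is held inside the user namespace; it says nothing about power over the host.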
For example, the container has CAP_SETUID. This allows the root process to change its UID to any other UID inside the container. In the case above, it can change the process UID to any host UID from 100000 to 165535, as well as back to 3267. The root process is not allowed to setuid to the host's UID 0 or to any other host UID that is not mapped into the container. Some capabilities, like CAP_SYS_ADMIN, are limited in what they can do. CAP_NET_ADMIN is only able to manipulate the container's network namespace, but not the host's.
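The 100000 to 165535 range follows from the second uid_map line shown earlier: a mapping of count UIDs that starts at host UID start ends at start + count - 1. Here is a small sketch of that arithmetic. The helper name host_uid_ranges is hypothetical; it reads uid_map-formatted lines on standard input.

```shell
# host_uid_ranges
# Reads lines in /proc/<pid>/uid_map format (inside-start outside-start count)
# on stdin and prints the inclusive container and host UID range of each line.
host_uid_ranges() {
    while read -r inside outside count; do
        [ -n "$count" ] || continue   # skip blank or malformed lines
        echo "container $inside-$(( inside + count - 1 )) -> host $outside-$(( outside + count - 1 ))"
    done
}

# The mapping printed earlier in this article.
printf '0 3267 1\n1 100000 65536\n' | host_uid_ranges
# container 0-0 -> host 3267-3267
# container 1-65536 -> host 100000-165535
```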
Now let’s look at what happens when I run the container with the --user flag.
$ podman run --user 1000 -d fedora sleep 10
976d7f3f034d38657cfba60aef406e7f65eae9eef735619ca7c13f8a946a0122
$ podman top -l user huser
USER HUSER
1000 100999
You can see that the container process is running as UID 1000 inside of the container, but it is actually running as UID 100999 on the host.
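The host UID 100999 is not arbitrary. Container UID 1000 falls in the second uid_map range, which starts at container UID 1 and host UID 100000, so the host UID is 100000 + (1000 - 1) = 100999. The following sketch performs that lookup. The helper name container_to_host_uid is hypothetical, and the map is the one printed earlier in this article.

```shell
# container_to_host_uid UID
# Reads uid_map-formatted lines (inside-start outside-start count) on stdin
# and prints the host UID that container UID maps to.
# Returns non-zero when no range covers the UID.
container_to_host_uid() {
    uid=$1
    while read -r inside outside count; do
        [ -n "$count" ] || continue   # skip blank or malformed lines
        if [ "$uid" -ge "$inside" ] && [ "$uid" -lt $(( inside + count )) ]; then
            echo $(( outside + uid - inside ))
            return 0
        fi
    done
    return 1
}

uid_map='0 3267 1
1 100000 65536'

echo "$uid_map" | container_to_host_uid 0      # prints 3267
echo "$uid_map" | container_to_host_uid 1000   # prints 100999
echo "$uid_map" | container_to_host_uid 70000 || echo "70000 is unmapped"
```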
Now, we see that the container has no capabilities and is locked down.
$ podman top -l capeff
EFFECTIVE CAPS
none
As long as your container does not need root, I always recommend using the --user option to improve security further.
Using the --userns=keep-id flag
Just as an addendum, rootless Podman has another cool option: --userns=keep-id.
The keep-id option tells Podman to create a user namespace where the current rootless user's UID:GID maps to the same values in the container. When the container is launched, it is running as your UID inside the container and on the host. Many HPC (High-Performance Computing) environments are using this flag and running the entire container with a single non-root UID.
$ podman run --userns keep-id -d fedora sleep 100
319813af33af1f54d2a6a4c00eeb1100dec36e8ba9d4bef76846d0e0dd54a6b8
$ podman top -l user huser
USER HUSER
3267 103266
Unfortunately, writing this blog revealed a bug in podman top, which displays the wrong host user (i.e., HUSER). If I use the ps command, I see that the sleep process is actually running on the host as my UID. I have opened an issue to fix the bug. Thanks to Giuseppe Scrivano, the bug is fixed in the next release of Podman.
$ ps -ef | grep sleep
dwalsh 198080 198069 1 10:57 ? 00:00:00 sleep 100
Since the main process of the container runs as my UID rather than as root, it no longer has root capabilities.
$ podman top -l capeff
EFFECTIVE CAPS
none
Depending on how the container is configured, processes in the container can use setuid and setfcap tools like su and sudo to gain additional capabilities, just like a normal login session. Fedora Toolbox uses Podman with the keep-id option under the covers to give users access to different OS environments.
One potential issue that we have seen users hit is specifying a large UID on rootless containers. Remember that the rootless Podman user is only allocated a limited number of UIDs, as defined in the /etc/subuid file. Usually, you can only use 65536 UIDs. This means that if you attempt to launch a rootless container with a UID greater than 65536, the container will fail. If you have to launch with a larger UID, then you need to modify /etc/subuid so that your allocation is large enough to cover the UID you want to use.
$ podman run --user 70000 fedora id -u
Error: container_linux.go:346: starting container process caused "setup user: invalid argument": OCI runtime error
$ podman run --user 65536 fedora id -u
65536
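The limit can be checked before launching a container. With the default rootless mapping shown earlier, container UID 0 is your own UID and container UIDs 1 through N come from your /etc/subuid allocation, where N is the number of subordinate UIDs. The sketch below sums the allocation for one user and tests a UID against it. The helper names subuid_count and uid_fits are hypothetical, the sample entry is an assumption chosen to match the mapping above rather than a copy of a real file, and the check assumes the default mapping, not --userns=keep-id.

```shell
# subuid_count USER
# Reads /etc/subuid-formatted lines (name:start:count) on stdin and prints
# the total number of subordinate UIDs allocated to USER (0 if none).
subuid_count() {
    awk -F: -v user="$1" '$1 == user { total += $3 } END { print total + 0 }'
}

# uid_fits UID COUNT
# Succeeds when container UID lies in 0..COUNT, the range covered by the
# default rootless mapping for an allocation of COUNT subordinate UIDs.
uid_fits() {
    [ "$1" -ge 0 ] && [ "$1" -le "$2" ]
}

# Assumed sample entry, not taken from a real /etc/subuid.
count=$(echo 'dwalsh:100000:65536' | subuid_count dwalsh)

for uid in 65536 70000; do
    if uid_fits "$uid" "$count"; then
        echo "--user $uid fits in the mapped range"
    else
        echo "--user $uid is outside the mapped range"
    fi
done
```

On a real system, the count could be read with subuid_count "$(id -un)" < /etc/subuid, assuming the entries are keyed by user name.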
Conclusion
The --user option is still necessary and adds a lot of security, even with rootless Podman. Use it whenever your container does not need root.
About the author
Daniel Walsh has worked in the computer security field for over 30 years. Dan is a Senior Distinguished Engineer at Red Hat. He joined Red Hat in August 2001. Dan has led the Red Hat Container Engineering team since August 2013, but has been working on container technology for several years.
Dan helped develop sVirt (Secure Virtualization) as well as the SELinux Sandbox back in RHEL6, an early desktop container tool. Previously, Dan worked at Netect/Bindview on Vulnerability Assessment Products and at Digital Equipment Corporation on the Athena Project and AltaVista Firewall/Tunnel (VPN) Products. Dan has a BA in Mathematics from the College of the Holy Cross and an MS in Computer Science from Worcester Polytechnic Institute.