How to use Podman inside of Kubernetes
Podman in Kubernetes/OpenShift
In part one, the focus was on Podman in Podman scenarios. We saw some of the different rootful and rootless Podman combinations. We also discussed the ramifications of the --privileged flag.
But what about Podman and Kubernetes? There are plenty of options for combining these two tools as well.
For part two of the series, I am using a Kubernetes cluster running with CRI-O as the runtime.
Rootful Podman with the privileged flag set
Here we're running a privileged container with the root user so that Podman will run as root inside the container.
Here is the YAML file, rootful-priv.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: podman-priv
spec:
  containers:
    - name: priv
      image: quay.io/podman/stable
      args:
        - sleep
        - "1000000"
      securityContext:
        privileged: true
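The manifest still has to be submitted to the cluster before you can exec into the pod. As a minimal sketch, assuming kubectl is configured for your cluster and the manifest is saved as rootful-priv.yaml, the helper below prints the two commands involved. The run and deploy_pod helpers and the DRY_RUN variable are hypothetical names used for illustration, and the 120-second timeout is an arbitrary choice.

```shell
# DRY_RUN defaults to 1, so commands are only printed. Set DRY_RUN=0 on a
# machine with kubectl and a reachable cluster to actually run them.
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# deploy_pod MANIFEST POD_NAME: create the pod, then wait until it is Ready.
deploy_pod() {
    run kubectl apply -f "$1" &&
    run kubectl wait --for=condition=Ready "pod/$2" --timeout=120s
}

# File and pod names are the ones used in the article's example.
deploy_pod rootful-priv.yaml podman-priv
```

The same helper would apply to the other manifests in this article by swapping in their file and pod names.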
➜ kubectl exec -it podman-priv -- sh
sh-5.0# id
uid=0(root) gid=0(root) groups=0(root)
sh-5.0# podman run ubi8 echo hello
Resolved "ubi8" as an alias (/etc/containers/registries.conf.d/shortnames.conf)
Trying to pull registry.access.redhat.com/ubi8:latest...
Getting image source signatures
Copying blob fdb393d8227c done
Copying blob 6b536614e8f8 done
Copying config 4199acc83c done
Writing manifest to image destination
Storing signatures
hello
We can also successfully build images inside the privileged container with rootful Podman. Let's build an image where we install BusyBox on Fedora.
sh-5.0# cat Containerfile
FROM fedora
RUN dnf install -y busybox
ENV foo=bar
sh-5.0# podman build -t myimage -f Containerfile .
STEP 1: FROM fedora
STEP 2: RUN dnf install -y busybox
Fedora 33 openh264 (From Cisco) - x86_64 3.0 kB/s | 2.5 kB 00:00
Fedora Modular 33 - x86_64 1.4 MB/s | 3.3 MB 00:02
Fedora Modular 33 - x86_64 - Updates 1.3 MB/s | 3.1 MB 00:02
Fedora 33 - x86_64 - Updates 1.6 MB/s | 27 MB 00:16
Fedora 33 - x86_64 3.6 MB/s | 72 MB 00:19
Dependencies resolved.
...
Running transaction
Preparing : 1/1
Installing : busybox-1:1.32.1-1.fc33.x86_64 1/1
Running scriptlet: busybox-1:1.32.1-1.fc33.x86_64 1/1
Verifying : busybox-1:1.32.1-1.fc33.x86_64 1/1
Installed:
busybox-1:1.32.1-1.fc33.x86_64
Complete!
--> 734a45854d1
STEP 3: ENV foo=bar
STEP 4: COMMIT myimage
--> 2326e34ac82
2326e34ac82173c849e0282b6644de5326f6b5bfba8431cf1c1115d846e440e9
sh-5.0# podman images
REPOSITORY TAG IMAGE ID CREATED SIZE
localhost/myimage latest 2326e34ac821 48 seconds ago 427 MB
registry.fedoraproject.org/fedora latest 9f2a56037643 3 months ago 182 MB
sh-5.0# podman run myimage busybox
BusyBox v1.32.1 (2021-03-22 18:56:41 UTC) multi-call binary.
BusyBox is copyrighted by many authors between 1998-2015.
Licensed under GPLv2. See source distribution for detailed
copyright notices.
Usage: busybox [function [arguments]...]
or: busybox --list[-full]
or: busybox --show SCRIPT
or: busybox --install [-s] [DIR]
or: function [arguments]...
...
Rootless Podman with the privileged flag set
Here we're running a privileged container with the podman(1000) user so that Podman runs as user 1000 inside the container.
Here is the YAML file, rootless-priv.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: podman-rootless
spec:
  containers:
    - name: rootless
      image: quay.io/podman/stable
      args:
        - sleep
        - "1000000"
      securityContext:
        privileged: true
        runAsUser: 1000
➜ kubectl exec -it podman-rootless -- sh
sh-5.0$ id
uid=1000(podman) gid=1000(podman) groups=1000(podman)
sh-5.0$ podman run ubi8 echo hello
Resolved "ubi8" as an alias (/etc/containers/registries.conf.d/shortnames.conf)
Trying to pull registry.access.redhat.com/ubi8:latest...
Getting image source signatures
Copying blob 6b536614e8f8 done
Copying blob fdb393d8227c done
Copying config 4199acc83c done
Writing manifest to image destination
Storing signatures
hello
We can also successfully build images inside the privileged container with rootless Podman. Let's build an image where we install BusyBox on Fedora.
sh-5.0$ cat Containerfile
FROM fedora
RUN dnf install -y busybox
ENV foo=bar
sh-5.0$ podman build -t myimage -f Containerfile .
STEP 1: FROM fedora
Resolved "fedora" as an alias (/etc/containers/registries.conf.d/000-shortnames.conf)
Getting image source signatures
Copying blob 157ab8011454 done
Copying config 9f2a560376 done
Writing manifest to image destination
Storing signatures
STEP 2: RUN dnf install -y busybox
Fedora 33 openh264 (From Cisco) - x86_64 4.8 kB/s | 2.5 kB 00:00
Fedora Modular 33 - x86_64 462 kB/s | 3.3 MB 00:07
Fedora Modular 33 - x86_64 - Updates 520 kB/s | 3.1 MB 00:06
Fedora 33 - x86_64 - Updates 7.5 MB/s | 27 MB 00:03
Fedora 33 - x86_64 522 kB/s | 72 MB 02:20
Dependencies resolved.
...
Installed:
busybox-1:1.32.1-1.fc33.x86_64
Complete!
--> 92087429448
STEP 3: ENV foo=bar
STEP 4: COMMIT myimage
--> 16dd65e3f57
16dd65e3f57a5808035b713a6ba3267146caf2a03dd4205097a5727f9d326de9
sh-5.0$ podman images
REPOSITORY TAG IMAGE ID CREATED SIZE
localhost/myimage latest 16dd65e3f57a About a minute ago 427 MB
registry.fedoraproject.org/fedora latest 9f2a56037643 3 months ago 182 MB
sh-5.0$ podman run myimage busybox
BusyBox v1.32.1 (2021-03-22 18:56:41 UTC) multi-call binary.
BusyBox is copyrighted by many authors between 1998-2015.
Licensed under GPLv2. See source distribution for detailed
copyright notices.
Usage: busybox [function [arguments]...]
or: busybox --list[-full]
or: busybox --show SCRIPT
or: busybox --install [-s] [DIR]
or: function [arguments]...
...
Rootless Podman without the privileged flag
To eliminate the privileged flag, we need to do the following:
- Devices: /dev/fuse is required to use fuse-overlayfs inside of the container, so the device has to be added to the container for containerized Podman to use it.
- Disable SELinux: SELinux does not allow containerized processes to mount all of the file systems required to run inside a container. So we need to disable SELinux on the host that is running the Kubernetes cluster.
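Both prerequisites can be checked before going further. Here is a rough sketch, assuming a Linux system: run the device check inside the pod and the SELinux check on the node. The device_status and selinux_mode helpers are hypothetical names. getenforce ships with the SELinux userland tools and may be absent on systems that don't use SELinux, in which case the sketch reports "unknown".

```shell
# device_status PATH: report whether PATH exists as a character device.
device_status() {
    if [ -c "$1" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

# selinux_mode: print the SELinux mode (Enforcing, Permissive, or Disabled),
# or "unknown" when the getenforce tool is not installed.
selinux_mode() {
    if command -v getenforce >/dev/null 2>&1; then
        getenforce
    else
        echo "unknown"
    fi
}

echo "/dev/fuse: $(device_status /dev/fuse)"
echo "SELinux: $(selinux_mode)"
```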
To be able to mount a device in Kubernetes, you first have to create a Device Plugin and then use that in the pod spec.
Here is an example of a Device Plugin for /dev/fuse: https://github.com/kuberenetes-learning-group/fuse-device-plugin/blob/main/fuse-device-plugin-k8s-1.16.yml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fuse-device-plugin-daemonset
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: fuse-device-plugin-ds
  template:
    metadata:
      labels:
        name: fuse-device-plugin-ds
    spec:
      hostNetwork: true
      containers:
        - image: soolaugust/fuse-device-plugin:v1.0
          name: fuse-device-plugin-ctr
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
          volumeMounts:
            - name: device-plugin
              mountPath: /var/lib/kubelet/device-plugins
      volumes:
        - name: device-plugin
          hostPath:
            path: /var/lib/kubelet/device-plugins
      imagePullSecrets:
        - name: registry-secret
Here is the YAML file, rootless-no-priv.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: no-priv
spec:
  containers:
    - name: no-priv
      image: quay.io/podman/stable
      args:
        - sleep
        - "1000000"
      securityContext:
        runAsUser: 1000
      resources:
        limits:
          github.com/fuse: 1
      volumeMounts:
        - mountPath: /home/podman/.local/share/containers
          name: podman-local
  volumes:
    - name: podman-local
      hostPath:
        path: /home/umohnani/.local/share/containers
✗ kubectl exec -it no-priv -- sh
sh-5.0$ id
uid=1000(podman) gid=1000(podman) groups=1000(podman)
sh-5.0$ podman run ubi8 echo hello
Resolved "ubi8" as an alias (/etc/containers/registries.conf.d/000-shortnames.conf)
Trying to pull registry.access.redhat.com/ubi8:latest...
Getting image source signatures
Copying blob 55eda7743468 done
Copying blob 4b21dcdd136d done
Copying config 613e5da7a9 done
Writing manifest to image destination
Storing signatures
hello
sh-5.1$ cat containerfile
FROM ubi8
RUN echo "hello"
ENV foo=bar
sh-5.1$ podman build --isolation chroot -t myimage -f containerfile .
STEP 1: FROM ubi8
STEP 2: RUN echo "hello"
hello
--> 096250be78f
STEP 3: ENV foo=bar
STEP 4: COMMIT myimage
--> ea849ac9875
ea849ac9875eb926d743362bce2e32e90d34fda7a88f28ebd6a1a546db99338f
sh-5.1$ podman images
REPOSITORY TAG IMAGE ID CREATED SIZE
localhost/myimage latest ea849ac9875e 41 seconds ago 245 MB
registry.access.redhat.com/ubi8 latest 0724f7c987a7 3 weeks ago 245 MB
Rootful Podman without the privileged flag
Create your device plugin as shown above.
You'll need to add the following capabilities for this:
- CAP_SYS_ADMIN is required for Podman running as root inside of the container to mount the required file systems.
- CAP_MKNOD is required for Podman running as root inside of the container to create the devices in /dev. (Note that Docker allows this by default.)
- CAP_SYS_CHROOT and CAP_SETFCAP are required because they are part of Podman's default list of capabilities. When you run a Podman command, it adds the capabilities it needs, so if you run your k8s pod without these capabilities, Podman fails.
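To confirm which of these capabilities a container actually has, you can decode the effective capability mask that the kernel reports in /proc/self/status. The sketch below is illustrative only: has_cap and report are hypothetical helpers, the bit numbers come from the kernel's linux/capability.h header, and the fallback mask is a made-up sample value rather than output from the cluster used in this article.

```shell
# Capability bit numbers as defined in <linux/capability.h>.
CAP_SYS_CHROOT=18
CAP_SYS_ADMIN=21
CAP_MKNOD=27
CAP_SETFCAP=31

# has_cap HEXMASK BIT: print "yes" if BIT is set in HEXMASK (no 0x prefix).
has_cap() {
    if [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]; then
        echo "yes"
    else
        echo "no"
    fi
}

# report HEXMASK: show the four capabilities discussed above.
report() {
    echo "SYS_ADMIN=$(has_cap "$1" "$CAP_SYS_ADMIN")"
    echo "MKNOD=$(has_cap "$1" "$CAP_MKNOD")"
    echo "SYS_CHROOT=$(has_cap "$1" "$CAP_SYS_CHROOT")"
    echo "SETFCAP=$(has_cap "$1" "$CAP_SETFCAP")"
}

# Use the live mask of the current process when /proc is readable; otherwise
# fall back to an illustrative sample value.
mask=""
if [ -r /proc/self/status ]; then
    mask="$(awk '/^CapEff:/ { print $2 }' /proc/self/status)"
fi
[ -n "$mask" ] || mask="00000000a82425fb"

report "$mask"
```

Run inside the pod, all four lines should read "yes" if the securityContext took effect; a "no" points at the capability to add.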
Here is the YAML file, rootful-no-priv.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: no-priv-rootful
spec:
  containers:
    - name: no-priv-rootful
      image: quay.io/podman/stable
      args:
        - sleep
        - "1000000"
      securityContext:
        capabilities:
          add:
            - "SYS_ADMIN"
            - "MKNOD"
            - "SYS_CHROOT"
            - "SETFCAP"
      resources:
        limits:
          github.com/fuse: 1
✗ kubectl exec -it no-priv-rootful -- sh
sh-5.0# id
uid=0(root) gid=0(root) groups=0(root)
sh-5.0# podman run ubi8 echo hello
Resolved "ubi8" as an alias (/etc/containers/registries.conf.d/000-shortnames.conf)
Trying to pull registry.access.redhat.com/ubi8:latest...
Getting image source signatures
Copying blob 55eda7743468 done
Copying blob 4b21dcdd136d done
Copying config 613e5da7a9 done
Writing manifest to image destination
Storing signatures
hello
Podman-remote in a Kubernetes pod with the Podman socket running on the host
You need to do the following to set up for this use case:
- Disable SELinux on the host.
- Follow this article to enable the Podman socket on your host.
Here is the YAML file, remote.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: podman-remote
spec:
  containers:
    - name: remote
      image: quay.io/podman/stable
      args:
        - sleep
        - "1000000"
      volumeMounts:
        - mountPath: /var/run/podman
          name: podman-sock
  volumes:
    - name: podman-sock
      hostPath:
        path: /var/run/podman
We're leaking the Podman socket that is running on the host into the pod by creating a volume mount for it.
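Before running podman --remote, it can help to confirm that the socket really made it into the pod. This is a small sketch under one assumption: the host's socket file is named podman.sock inside the mounted directory, which is the usual rootful default but is not stated in this article. socket_status and the SOCK variable are hypothetical names.

```shell
# socket_status PATH: classify PATH as "socket", "not-a-socket", or "missing".
socket_status() {
    if [ -S "$1" ]; then
        echo "socket"
    elif [ -e "$1" ]; then
        echo "not-a-socket"
    else
        echo "missing"
    fi
}

# Assumed socket location inside the pod; override SOCK if yours differs.
SOCK="${SOCK:-/var/run/podman/podman.sock}"
echo "$SOCK: $(socket_status "$SOCK")"
```

A result of "missing" usually means the Podman socket is not enabled on the host or the hostPath does not match where the host keeps it.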
✗ kubectl exec -it podman-remote -- sh
sh-5.0# id
uid=0(root) gid=0(root) groups=0(root)
sh-5.0# podman --remote run ubi8 echo hello
Resolved "ubi8" as an alias (/etc/containers/registries.conf.d/000-shortnames.conf)
Trying to pull registry.access.redhat.com/ubi8:latest...
Getting image source signatures
Copying blob sha256:55eda774346862e410811e3fa91cefe805bc11ff46fad425dd1b712709c05bbc
Copying blob sha256:4b21dcdd136d133a4df0840e656af2f488c226dd384a98b89ced79064a4081b4
Copying config sha256:613e5da7a934e1963e37ed935917e8be6b8dfd90cac73a724ddc224fbf16da20
Writing manifest to image destination
Storing signatures
hello
Builds with the Podman socket leaked into the container:
sh-5.0# cat /home/podman/Containerfile
FROM fedora
RUN dnf install -y busybox
ENV foo=bar
sh-5.0# podman --remote build -t myimage -f Containerfile .
STEP 1: FROM fedora
STEP 2: RUN dnf install -y busybox
Fedora 33 openh264 (From Cisco) - x86_64 4.7 kB/s | 2.5 kB 00:00
Fedora Modular 33 - x86_64 1.8 MB/s | 3.3 MB 00:01
Fedora Modular 33 - x86_64 - Updates 5.2 MB/s | 3.1 MB 00:00
Fedora 33 - x86_64 - Updates 4.3 MB/s | 27 MB 00:06
Fedora 33 - x86_64 1.0 MB/s | 72 MB 01:13
Dependencies resolved.
...
Installed:
busybox-1:1.32.1-1.fc33.x86_64
Complete!
--> 6ef78b975e1
STEP 3: ENV foo=bar
STEP 4: COMMIT myimage
--> 481c5a0e453
481c5a0e4534573a3872f7cc1ff6806a3ce143edce2ed39568d23efe6f65a292
sh-5.0# podman --remote images
REPOSITORY TAG IMAGE ID CREATED SIZE
localhost/myimage latest 481c5a0e4534 2 minutes ago 427 MB
registry.fedoraproject.org/fedora latest 9f2a56037643 3 months ago 182 MB
sh-5.0# podman --remote run myimage busybox
BusyBox v1.32.1 (2021-03-22 18:56:41 UTC) multi-call binary.
BusyBox is copyrighted by many authors between 1998-2015.
Licensed under GPLv2. See source distribution for detailed
copyright notices.
Usage: busybox [function [arguments]...]
or: busybox --list[-full]
or: busybox --show SCRIPT
or: busybox --install [-s] [DIR]
or: function [arguments]...
...
Podman in a locked-down container using user namespaces in Kubernetes
This only works if you are using CRI-O as your runtime engine for your Kubernetes cluster.
We need to add the userns annotation to the allowed_annotations list of the runtime (e.g., runc, crun, kata, etc.) you'll be using with CRI-O.
[crio.runtime.runtimes.runc]
runtime_path = ""
runtime_type = "oci"
runtime_root = "/run/runc"
allowed_annotations = [
"io.containers.trace-syscall",
"io.kubernetes.cri-o.userns-mode",
]
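A quick way to double-check the edit is to search the CRI-O configuration for the annotation. The sketch below is a plain text search rather than a TOML parser, so it would also match a commented-out line. annotation_allowed is a hypothetical helper, and the sample file only mirrors the snippet above. On a real node you would pass your CRI-O config files instead (typically /etc/crio/crio.conf and any drop-ins, though your paths may differ).

```shell
# annotation_allowed ANNOTATION FILE...: print "allowed" if ANNOTATION appears
# as a quoted string in any of the given files, otherwise "not allowed".
annotation_allowed() {
    annotation="$1"
    shift
    if grep -qsF "\"$annotation\"" "$@"; then
        echo "allowed"
    else
        echo "not allowed"
    fi
}

# Sample config mirroring the article's snippet, written to a temp file.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
[crio.runtime.runtimes.runc]
allowed_annotations = [
"io.containers.trace-syscall",
"io.kubernetes.cri-o.userns-mode",
]
EOF

echo "userns-mode: $(annotation_allowed io.kubernetes.cri-o.userns-mode "$conf")"
rm -f "$conf"
```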
Add the Podman UID/GID ranges to the subuid and subgid files on the host.
✗ cat /etc/subuid
umohnani:100000:65536
containers:200000:268435456
✗ cat /etc/subgid
umohnani:100000:65536
containers:200000:268435456
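Each line in these files has the form name:start:count. As a rough illustration of what the containers entry provides, the sketch below reads the range and works out how many user namespaces of 65536 IDs (the size requested later in the pod annotation) would fit into it. subid_range and userns_slots are hypothetical helpers, and the sample file simply repeats the listing above.

```shell
# subid_range USER FILE: print "START COUNT" for USER's entry in FILE.
subid_range() {
    awk -F: -v user="$1" '$1 == user { print $2, $3; exit }' "$2"
}

# userns_slots COUNT SIZE: number of SIZE-wide ID blocks that fit into COUNT.
userns_slots() {
    echo $(( $1 / $2 ))
}

# Sample data copied from the /etc/subuid listing shown above.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
umohnani:100000:65536
containers:200000:268435456
EOF

range="$(subid_range containers "$sample")"
start="${range% *}"
count="${range#* }"
echo "containers range starts at $start and holds $count IDs"
echo "room for $(userns_slots "$count" 65536) namespaces of 65536 IDs"
rm -f "$sample"
```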
Restart CRI-O after this and then start up your Kubernetes cluster:
✗ sudo systemctl restart cri-o
✗ ./local-cluster-up.sh
Since we're running this without the privileged flag, we need to mount /dev/fuse, as shown in the examples above. So, create your /dev/fuse Device Plugin to be used in the pod spec.
Here is the YAML file, userns.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: podman-userns
  annotations:
    io.kubernetes.cri-o.userns-mode: "auto:size=65536;keep-id=true"
spec:
  containers:
    - name: userns
      image: quay.io/podman/stable
      command: ["sleep", "10000"]
      securityContext:
        capabilities:
          add:
            - "SYS_ADMIN"
            - "MKNOD"
            - "SYS_CHROOT"
            - "SETFCAP"
      resources:
        limits:
          github.com/fuse: 1
We've added the userns annotation to the pod spec, specifying the range of UIDs/GIDs to use and what ID should be set in the container. It'll be set to the root user in this case.
✗ kubectl exec -it podman-userns -- sh
sh-5.0# id
uid=0(root) gid=0(root) groups=0(root)
sh-5.0# cat /proc/self/uid_map
0 265536 65536
sh-5.0# cat /proc/self/gid_map
0 265536 65536
sh-5.0# podman run ubi8 echo hello
Resolved "ubi8" as an alias (/etc/containers/registries.conf.d/000-shortnames.conf)
Trying to pull registry.access.redhat.com/ubi8:latest...
Getting image source signatures
Copying blob 4b21dcdd136d done
Copying blob 55eda7743468 done
Copying config 613e5da7a9 done
Writing manifest to image destination
Storing signatures
hello
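The uid_map and gid_map lines in the session above have three fields: the first ID inside the namespace, the first ID on the host, and the length of the range. So root in the pod is really host ID 265536, an unprivileged ID. The sketch below applies that arithmetic to a few IDs. host_id is a hypothetical helper, and the map values are copied from the output above.

```shell
# host_id ID INSIDE OUTSIDE LENGTH: translate an ID inside the user namespace
# to the host ID, given one mapping line split into its three fields.
host_id() {
    cid=$1
    inside=$2
    outside=$3
    length=$4
    if [ "$cid" -ge "$inside" ] && [ "$cid" -lt $(( inside + length )) ]; then
        echo $(( outside + cid - inside ))
    else
        echo "unmapped"
    fi
}

# Mapping from the article's /proc/self/uid_map output: "0 265536 65536".
for id in 0 1000 65535 65536; do
    echo "container ID $id -> host ID $(host_id "$id" 0 265536 65536)"
done
```

The starting value 265536 equals 200000 + 65536, which would be consistent with the second 65536-wide block of the containers range, although that reading is an inference and not something the output itself states.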
Builds with rootful Podman in a locked-down container with user namespaces:
sh-5.0# cat Containerfile
FROM fedora
RUN dnf install -y busybox
ENV foo=bar
sh-5.0# podman build -t myimage -f Containerfile .
STEP 1: FROM fedora
Resolved "fedora" as an alias (/etc/containers/registries.conf.d/000-shortnames.conf)
Getting image source signatures
Copying blob 157ab8011454 done
Copying config 9f2a560376 done
Writing manifest to image destination
Storing signatures
STEP 2: RUN dnf install -y busybox
Fedora 33 openh264 (From Cisco) - x86_64 764 B/s | 2.5 kB 00:03
Fedora Modular 33 - x86_64 348 kB/s | 3.3 MB 00:09
Fedora Modular 33 - x86_64 - Updates 2.2 MB/s | 3.1 MB 00:01
Fedora 33 - x86_64 - Updates 11 MB/s | 27 MB 00:02
Fedora 33 - x86_64 2.1 MB/s | 72 MB 00:34
Dependencies resolved.
...
Installed:
busybox-1:1.32.1-1.fc33.x86_64
Complete!
--> 1b0633e5309
STEP 3: ENV foo=bar
STEP 4: COMMIT myimage
--> 2212a101136
2212a1011369ee7e6a4a5d4c15a56fc531a5d43ac24f49d432730c620cec4378
sh-5.0# podman images
REPOSITORY TAG IMAGE ID CREATED SIZE
localhost/myimage latest 2212a1011369 About a minute ago 427 MB
registry.fedoraproject.org/fedora latest 9f2a56037643 3 months ago 182 MB
sh-5.0# podman run myimage busybox
BusyBox v1.32.1 (2021-03-22 18:56:41 UTC) multi-call binary.
BusyBox is copyrighted by many authors between 1998-2015.
Licensed under GPLv2. See source distribution for detailed
copyright notices.
Usage: busybox [function [arguments]...]
or: busybox --list[-full]
or: busybox --show SCRIPT
or: busybox --install [-s] [DIR]
or: function [arguments]...
...
Final thoughts
Here, in part two of the article series, I demonstrated various use cases related to Podman and Kubernetes interactions. Many of the choices are similar to those we saw in the part one article with Podman in Podman.
Series wrap up
It's common for the Podman team to field questions related to running Podman inside containers. There are many possible approaches to doing this, with various related security concerns.
One of the biggest differentiators is whether you run Podman in Podman or Podman within Kubernetes, along with how Docker plays into the discussion.
As you start to implement Podman in these scenarios, don't forget the privileges information discussed at the start of article one, and be sure to weigh the considerations regarding the --privileged flag. Contact the Podman team for more information.
Don't forget that Enable Sysadmin has lots of Podman content.
Urvashi Mohnani
Urvashi Mohnani is a senior software engineer at Red Hat on the Container Runtimes team. She has spent the past few years developing emerging open source container technologies such as CRI-O, Buildah, and Podman and presenting on the latest developments in the space.
Dan Walsh
Daniel Walsh has worked in the computer security field for over 30 years. Dan is a Consulting Engineer at Red Hat. He joined Red Hat in August 2001. Dan has led the Red Hat Container Engineering team since August 2013, but has been working on container technology for several years.