Editor's note: The original version of this tutorial used Fedora and Alpine as the example container images. The examples now use BusyBox, which provides everything needed to follow along, given the fast-moving pace of the Podman project.
The ability to use systemd services to run and manage containers has been requested by users for many years. There were several attempts in Docker’s early days to allow running Docker containers with systemd, but that functionality turned out to be harder than expected. Why? Systemd must be aware of and have control over the processes running inside the systemd service to properly manage it. That’s especially important so systemd can know if the main process is running, and if it’s in a healthy state.
The problem is that Docker’s client-server architecture complicates things. All Docker commands are sent to the Docker daemon, which makes it almost impossible for systemd to control container processes. Moreover, successful execution of the Docker client does not necessarily imply that the container is up and running. Multiple attempts to improve the situation have been rejected, leaving a lot of room for improvement.
The good news is that Podman is an excellent choice for running containers, and especially so for running them in systemd services. Let’s take a look at how this works.
systemd service file generation
Podman’s fork and exec architecture allows systemd to properly control and manage container processes. In fact, Podman makes putting a container into a systemd service as simple as calling podman generate systemd $container. Let’s generate a service for a container:
$ podman create -d --name foo busybox:latest top
54502f309f3092d32b4c496ef3d099b270b2af7b5464e7cb4887bc16a4d38597
$ podman generate systemd --name foo
# container-foo.service
# autogenerated by Podman 1.6.2
# Tue Nov 19 15:49:15 CET 2019
[Unit]
Description=Podman container-foo.service
Documentation=man:podman-generate-systemd(1)
[Service]
Restart=on-failure
ExecStart=/usr/bin/podman start foo
ExecStop=/usr/bin/podman stop -t 10 foo
KillMode=none
Type=forking
PIDFile=/run/user/1000/overlay-containers/54502f309f3092d32b4c496ef3d099b270b2af7b5464e7cb4887bc16a4d38597/userdata/conmon.pid
[Install]
WantedBy=multi-user.target
The generated systemd service file can now be used to manage the foo container via systemd. We can copy the file to ~/.config/systemd/user/container-foo.service and start a rootless container via systemctl --user start container-foo.service.
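The copy step can be sketched as a small shell helper. The function name install_user_unit is hypothetical, and the sketch assumes the default per-user unit directory ~/.config/systemd/user named above. The Podman and systemctl calls are left as comments so that the helper can be tried without a running systemd user session.

```shell
#!/bin/sh
# Hypothetical helper: copy a generated unit file into the per-user unit directory.
# Assumes the default location ~/.config/systemd/user used in this article.
install_user_unit() {
    unit_file=$1
    unit_dir="$HOME/.config/systemd/user"
    mkdir -p "$unit_dir" || return 1
    cp "$unit_file" "$unit_dir/" || return 1
    # Print the installed path so that callers can check it.
    printf '%s/%s\n' "$unit_dir" "$(basename "$unit_file")"
}

# Typical use (requires Podman and a systemd user session):
#   podman generate systemd --name foo > container-foo.service
#   install_user_unit container-foo.service
#   systemctl --user daemon-reload
#   systemctl --user start container-foo.service
```

Running systemctl --user daemon-reload after the copy makes systemd pick up the new unit file before it is started.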
Specific versus generic container services
The ability to generate systemd service files offers a lot of flexibility to users, and intentionally blurs the difference between a container and any other program or service on the host. Since Podman v1.6, we can also generate service files for pods that can conveniently be written to files via the --files flag. However, all of these generated files are specific to containers and pods that already exist. As shown in the example above, we first have to create a container or pod and can then generate specific service files. But what if we want to run a new container directly via the service? What if we want to share a service file with other users?
After gaining more experience in this domain and receiving feedback from the community, we looked at how to provide a generic service skeleton that stays backwards compatible with the versions of Podman already released in the wild. The good news is that we found such a backwards-compatible service file, which we will take a closer look at now:
[Unit]
Description=Podman in Systemd
[Service]
Restart=on-failure
ExecStartPre=/usr/bin/rm -f /%t/%n-pid /%t/%n-cid
ExecStart=/usr/bin/podman run --conmon-pidfile /%t/%n-pid --cidfile /%t/%n-cid -d busybox:latest top
ExecStop=/usr/bin/sh -c "/usr/bin/podman rm -f `cat /%t/%n-cid`"
KillMode=none
Type=forking
PIDFile=/%t/%n-pid
[Install]
WantedBy=multi-user.target
The service file above sets the restart policy to on-failure, which instructs systemd to restart the service when, among other things, the service cannot be started or stopped cleanly, or when the process exits non-zero. The ExecStart line describes how we start the container; the ExecStop line describes how we stop and remove it. In this example, we want to run a simple busybox:latest container in the background that runs top. But there are two more flags we should look at: --conmon-pidfile and --cidfile.
The --conmon-pidfile flag points to a path to store the process ID for the container’s conmon process. Conmon is a small monitoring tool that Podman uses to perform operations such as keeping ports and file descriptors open, streaming the container logs, and cleaning up once the container has finished. Conmon also exits with the container’s exit code, which is essential for the systemd service use case, as we can use the conmon-pidfile as the PIDFile for the same service. If the container exits non-zero, conmon will as well, and systemd can report the correct service status and restart it if needed:
[Service]
Restart=on-failure
ExecStartPre=/usr/bin/rm -f /%t/%n-pid /%t/%n-cid
ExecStart=/usr/bin/podman run --conmon-pidfile /%t/%n-pid --cidfile /%t/%n-cid -d busybox:latest top
...
PIDFile=/%t/%n-pid
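What systemd does with PIDFile can be approximated by hand: read the PID from the file and check that the process is still alive. The following is a minimal sketch only. The helper name pidfile_alive is made up for illustration, and kill -0 merely tests that a process with that PID exists, which is far less than systemd's own process tracking.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the PID stored in a pidfile names a live process.
pidfile_alive() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only checks that the process exists
    # and that we are allowed to signal it.
    kill -0 "$pid" 2>/dev/null
}

# Example with a service named container.service and user ID 1000 (both assumed):
#   pidfile_alive /run/user/1000/container.service-pid && echo "conmon is running"
```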
The --cidfile flag points to the path that stores the container ID. When running or creating a container, Podman writes the corresponding container ID to the specified path. Doing so allows us to write elegant and generic service files, because we can use the file for stopping or removing the container as well. In the previous example, the ExecStop line uses a shell trick (i.e., sh -c followed by a command string for the shell to interpret) to read the container ID from that file when removing the container. Starting with the upcoming release of Podman v1.7, podman stop and podman rm support the --cidfile flag as well, so we don’t need the shell trickery above anymore:
[Service]
Restart=on-failure
ExecStartPre=/usr/bin/rm -f /%t/%n-pid /%t/%n-cid
ExecStart=/usr/bin/podman run --conmon-pidfile /%t/%n-pid --cidfile /%t/%n-cid -d busybox:latest top
ExecStop=/usr/bin/podman rm -f --cidfile /%t/%n-cid
...
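Because --cidfile support in podman stop and podman rm only arrives with Podman v1.7, a script that writes unit files for mixed environments might pick the ExecStop line based on the installed version. The sketch below is an illustration under assumptions: the helper name exec_stop_line is hypothetical, and the version string is assumed to look like MAJOR.MINOR[.PATCH].

```shell
#!/bin/sh
# Hypothetical helper: print an ExecStop line suited to a given Podman version.
# Assumes a version string of the form MAJOR.MINOR[.PATCH], e.g. "1.6.2".
exec_stop_line() {
    version=$1
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%.*}
    if [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 7 ]; }; then
        # v1.7 and later: podman rm reads the container ID from the file itself.
        printf '%s\n' 'ExecStop=/usr/bin/podman rm -f --cidfile /%t/%n-cid'
    else
        # Older releases: fall back to reading the ID via the shell.
        printf '%s\n' 'ExecStop=/usr/bin/sh -c "/usr/bin/podman rm -f `cat /%t/%n-cid`"'
    fi
}

exec_stop_line 1.6.2
exec_stop_line 1.7.0
```

The shell-based form keeps working on newer releases too, so it remains the safer choice for a service file that has to be shared across versions.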
Now, let’s look at the specified paths to the conmon-pidfile and the cidfile, /%t/%n-pid and /%t/%n-cid, which deserve some explanation as well. In these statements, %t is the path to the runtime directory’s root (i.e., /run/user/$UserID). This is where Podman also stores most of its runtime data. The %n portion is the full name of the service. Systemd guarantees uniqueness for service names, so we don’t need to worry about potential file name conflicts.
Assuming our service is named foo and is run by the user with ID 1000, the corresponding conmon-pidfile is placed in /run/user/1000/foo.service-pid, while the cidfile is placed in /run/user/1000/foo.service-cid.
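The expansion can be mimicked in a few lines of shell. This is only an illustration of the substitution: the helper name expand_path is hypothetical, and the runtime directory is assumed to be /run/user/<uid>, as for a user service.

```shell
#!/bin/sh
# Hypothetical helper: mimic how "/%t/%n-<suffix>" expands for a user service.
# %t -> runtime directory (assumed to be /run/user/<uid>), %n -> full unit name.
expand_path() {
    uid=$1
    unit=$2
    suffix=$3
    runtime_dir="/run/user/$uid"
    printf '/%s/%s-%s\n' "$runtime_dir" "$unit" "$suffix"
}

expand_path 1000 foo.service pid
expand_path 1000 foo.service cid
```

The result starts with a doubled slash (//run/user/1000/...), because %t already begins with a slash and the unit file puts another one in front of it. Linux resolves this to the same location as the single-slash path.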
Note: It’s important to set the kill mode to none. Otherwise, systemd will start competing with Podman to stop and kill the container processes, which can lead to various undesired side effects and invalid states.
A walk-through example
So much for theory—let’s have a look. First, make sure that the file is accessible to our non-root user.
$ cat ~/.config/systemd/user/container.service
[Unit]
Description=Podman in Systemd
[Service]
Restart=on-failure
ExecStartPre=/usr/bin/rm -f /%t/%n-pid /%t/%n-cid
ExecStart=/usr/bin/podman run --conmon-pidfile /%t/%n-pid --cidfile /%t/%n-cid -d busybox:latest top
ExecStop=/usr/bin/sh -c "/usr/bin/podman rm -f `cat /%t/%n-cid`"
KillMode=none
Type=forking
PIDFile=/%t/%n-pid
[Install]
WantedBy=multi-user.target
Now, we can load and start the service:
$ systemctl --user daemon-reload
$ systemctl --user start container.service
$ systemctl --user status container.service
● container.service - Podman in Systemd
Loaded: loaded (/home/valentin/.config/systemd/user/container.service; disabled; vendor preset: enabled)
Active: active (running) since Mon 2019-11-18 15:32:56 CET; 1min 5s ago
Process: 189705 ExecStartPre=/usr/bin/rm -f //run/user/1000/container.service-pid //run/user/1000/container.service-cid (code=exited, status=0/SUCCESS)
Process: 189706 ExecStart=/usr/bin/podman run --conmon-pidfile //run/user/1000/container.service-pid --cidfile //run/user/1000/container.service-cid -d busybox:latest top (code=exited, status=0/SUCCESS)
Main PID: 189731 (conmon)
CGroup: /user.slice/user-1000.slice/user@1000.service/container.service
├─189724 /usr/bin/fuse-overlayfs [...]
├─189726 /usr/bin/slirp4netns [...]
├─189731 /usr/bin/conmon [...]
└─189737 top
$ podman ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f20988d59920 docker.io/library/busybox:latest top 12 seconds ago Up 11 seconds ago funny_zhukovsky
Great! Systemd started the service successfully, and Podman reports the container as running as well. Note that I trimmed parts of the output above for brevity. An important part is the Main PID, which points to the correct conmon process. Without explicitly pointing systemd to the correct process via the PIDFile option, systemd might wrongly choose another process in this cgroup as the main process. There are a few other processes listed (i.e., fuse-overlayfs, slirp4netns, and top), and they all run in the same cgroup. Fuse-overlayfs is an implementation of the overlay filesystem in user space via FUSE, and slirp4netns allows unprivileged networking. Both of these tools are essential for running rootless containers with Podman.
Before properly stopping the service via systemctl --user stop container.service, let’s test the restart policy, which is set to on-failure. We can cause such a failure by killing the top process (i.e., 189737):
$ kill -9 189737
$ systemctl --user status container.service
● container.service - Podman in Systemd
Loaded: loaded (/home/valentin/.config/systemd/user/container.service; disabled; vendor preset: enabled)
Active: active (running) since Mon 2019-11-18 16:09:38 CET; 1min 3s ago [...]
Main PID: 191263 (conmon)
We can see that the Main PID has changed from 189731 to 191263. That’s an expected outcome, as we killed the container process, which hence exited non-zero. Conmon exited with the same exit code and systemd correctly restarted the service. Note that whether the service is also restarted when we manually stop a container via podman stop $container depends on the image. The top binary in some images exits with 143 when stopped with SIGTERM, which systemd treats as a failure, so the service is restarted. The top binary from other images (e.g., BusyBox) exits with 0 after SIGTERM, so systemd would not restart the service. Such behavioral differences are extremely important to consider when writing systemd services, so we need to be careful when setting the restart policy.
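The value 143 follows the common shell convention of reporting a process that died from signal N as 128 + N, and SIGTERM is signal 15. The sketch below reproduces the number in a plain shell without Podman. How a container runtime maps a signal to the container's exit code is not shown here.

```shell
#!/bin/sh
# Start a child shell that sends SIGTERM to itself, then inspect its status.
status=0
sh -c 'kill -TERM $$' || status=$?
echo "exit status: $status"

# Decode the signal number from the status (128 + signal number).
if [ "$status" -gt 128 ]; then
    echo "terminated by signal $((status - 128))"
fi
```

Checking the exit status of the containerized program in the same way, before choosing a restart policy, helps avoid a service that restarts every time it is stopped on purpose.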
Back to work
The nice thing about the generic systemd service file presented in this article is that it is backwards compatible with versions of Podman running in the wild. Whether on Red Hat Enterprise Linux, Ubuntu, or another distribution, users can immediately follow the suggested format. Nonetheless, the Podman team is continuing to improve the support and user experience when running containers in systemd services. Try it out!