Balancing size and features is a universal challenge when building software. So, it's unsurprising that this holds true when building container images. If you don’t include enough packages in your base image, you end up with images that are difficult to troubleshoot, are missing something you need, or cause different development teams to add the exact same package to layered images (causing duplication). If you build it too big, people complain because it takes too long to download - especially for quick and dirty projects or demos. This is where Buildah comes in.
In the current ecosystem of build tools, there are two main kinds:
- Ones which build container images from scratch.
- Those that build layered images.
Buildah is unique in that it elegantly blurs the line between the two, and it has a rich set of capabilities for each. One of those capabilities is multi-stage builds.
At Red Hat Summit 2018 in San Francisco, Scott McCarty and I boiled the practice of building production ready containers down into five key tenets - standardize, minimize, delegate, process, and iterate (video & presentation).
Two tenets in particular are often at odds - standardize and minimize. It makes sense to standardize on a rich base image, while at the same time minimizing the content in layered builds. Balancing both is tricky, but when done right, it reaps the benefits of OCI image layers at scale (lots of applications) and improves registry storage efficiency.
A particularly powerful example of how to achieve this balance is the concept of multi-stage builds. Since build dependencies like compilers and package managers are rarely required at runtime, we can exclude them from the final build by breaking it into two parts. We can do the heavy lifting in the first part, then use the build artifacts (think Go binaries or jars) in the second. We will then use the container image from the second build in production.
This methodology leverages the power of rich base images while producing a significantly smaller container image. The resulting image doesn't carry additional dependencies that aren't used at runtime. The multi-stage build concept became popular last year with the release of Docker v17.05, and OpenShift has long had a similar capability with the concept of chaining builds.
OK, multi-stage builds are great, you get it, but to make this work right, the two builds need to be able to copy data between them. Before we tackle this, let's start with some background.
Buildah was a complete rethink of how container image builds could and should work. It follows the Unix philosophy of small, flexible tools. Multi-stage builds were part of the original design and have been possible since its inception. With the release of Buildah 1.0, users can now take advantage of the simplicity of using multi-stage builds with the Dockerfile format. All of this comes with a smaller tool, no daemon, and tons of flexibility during builds (e.g., build-time volumes).
Below we’ll take a look at how to use Buildah to accomplish multi-stage builds with a Dockerfile and also explore a simpler, yet more sophisticated way to tackle them.
$ buildah bud -t [image:tag] .
...and that’s it! Assuming your Dockerfile is written for multi-stage builds and is in the directory where the command is executed, everything will just work. So if this is all you’re looking for, know that it’s now trivial to accomplish this with Buildah in Red Hat Enterprise Linux 7.5.
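For reference, a multi-stage Dockerfile for this kind of build might look like the sketch below. The stage contents mirror the Go example used later in this post. The helper name write_multistage_dockerfile, the /go/src/app working directory, and the href-counter:latest tag are illustrative assumptions, not taken from the original project.

```shell
#!/bin/sh
# Sketch: write an illustrative two-stage Dockerfile into a directory.
# The function name and the paths inside the Dockerfile are assumptions.
write_multistage_dockerfile() {
    dir=$1
    mkdir -p "$dir"
    cat > "$dir/Dockerfile" <<'EOF'
# Stage 0: full Go toolchain, used only to compile the binary
FROM golang:1.7.3
WORKDIR /go/src/app
RUN go get -d -v golang.org/x/net/html
COPY app.go .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

# Stage 1: small runtime image that receives only the compiled binary
FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=0 /go/src/app/app .
CMD ["./app"]
EOF
}

# Usage (requires buildah and an app.go next to the Dockerfile):
#   write_multistage_dockerfile .
#   buildah bud -t href-counter:latest .
```

Only the second stage ends up in the tagged image, so the Go toolchain from the first stage never reaches production.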
Now, let’s dig a little deeper and take a look at using Buildah’s native commands to achieve the same outcome and some reasons why this can be a powerful alternative for certain use cases.
For clarity, we’ll start by using Alex Ellis’s blog post that demonstrates the benefits of performing multi-stage builds. This example is used simply to compare and contrast the Dockerfile version with Buildah’s native capabilities. It's not an endorsement of any underlying technologies such as Alpine Linux or APK. These examples could all be done in Fedora, but that would make the comparison less clear.
Using Buildah Commands
Using his href-counter repository (https://github.com/alexellis/href-counter), we can convert the included Dockerfile.multi file to a simple script like this:
# build container
buildcntr1=$(buildah from golang:1.7.3)
buildmnt1=$(buildah mount $buildcntr1)
Using simple variables like these is not required, but they make the later commands clearer to read, so it’s recommended. Think of buildcntr1 as a handle that represents the container build, while buildmnt1 holds the directory on the host where the container's filesystem is mounted.
buildah run $buildcntr1 go get -d -v golang.org/x/net/html
This is the first command verbatim in the original Dockerfile. All that’s needed is to change RUN to run and point Buildah to the container we want to execute the command in. Once the command completes, the Go dependency has been downloaded inside the build container. Next, the application source needs to go into that container. Buildah has a native command for copying content into a container build:
buildah copy $buildcntr1 app.go .
Alternatively, we can use the system command to do the same thing by referencing the mount point:
cp app.go $buildmnt1/go
For this example, both of these lines accomplish the same thing. We can use buildah’s copy command the same way the COPY command works in a Dockerfile, or we can simply use the host’s cp command to copy the file into the container through its mount point. In the rest of this tutorial, we’ll rely on the host’s cp command.
Now, let's build the code:
buildah run $buildcntr1 /bin/sh -c "CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app ."
The same applies to this command. We’re changing RUN to run and executing the command in the same container.
# runtime container
buildcntr2=$(buildah from alpine:latest)
buildmnt2=$(buildah mount $buildcntr2)
Now let’s define a separate runtime image that we’ll use to run our application in production.
buildah run $buildcntr2 apk --no-cache add ca-certificates
The same tweaks apply to this RUN command.
#buildah copy $buildcntr2 $buildmnt1/go/app .
cp $buildmnt1/go/app $buildmnt2
Here we have the same options as above. To bring the compiled application into the second build, we can use either buildah's copy command or the host's cp command.
Now, add the default command to the production image.
buildah config --cmd ./app $buildcntr2
Finally, we unmount and commit the image, and optionally clean up the environment:
#unmount & commit the image
buildah unmount $buildcntr2
buildah commit $buildcntr2 multi-stage:latest
#clean up build
buildah rm $buildcntr1 $buildcntr2
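The steps above can be collected into a single function. This is a minimal sketch rather than a definitive script: build_multistage is a hypothetical name, it assumes app.go is in the current directory, and it stops at the first failing step instead of continuing with a half-built container.

```shell
#!/bin/bash
# Sketch: the walkthrough's steps collected into one function.
# build_multistage is a hypothetical name; it expects app.go in the
# current directory and must run with the privileges buildah needs.
build_multistage() {
    local image=${1:-multi-stage:latest}
    local build runtime buildmnt runtimemnt

    # Build stage: full Go toolchain
    build=$(buildah from golang:1.7.3) || return 1
    buildmnt=$(buildah mount "$build") || return 1
    buildah run "$build" go get -d -v golang.org/x/net/html || return 1
    cp app.go "$buildmnt/go" || return 1
    buildah run "$build" /bin/sh -c \
        "CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app ." || return 1

    # Runtime stage: minimal image plus the compiled binary
    runtime=$(buildah from alpine:latest) || return 1
    runtimemnt=$(buildah mount "$runtime") || return 1
    buildah run "$runtime" apk --no-cache add ca-certificates || return 1
    cp "$buildmnt/go/app" "$runtimemnt" || return 1
    buildah config --cmd ./app "$runtime" || return 1

    # Unmount, commit, and clean up both working containers
    buildah unmount "$runtime" || return 1
    buildah commit "$runtime" "$image" || return 1
    buildah rm "$build" "$runtime"
}

# Usage:
#   build_multistage multi-stage:latest
```

Quoting the variables guards against mount paths that contain spaces, and returning early keeps a failed compile from being committed as an image.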
Don’t forget that Buildah can also push the image to your desired registry using buildah push.
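As a small sketch of that step, the helper below wraps buildah push with a docker:// destination. The function name push_image and the registry host registry.example.com/demo are placeholders, and pushing to a real registry will usually require credentials that are not shown here.

```shell
#!/bin/sh
# Sketch: push a committed image to a registry.
# push_image and registry.example.com/demo are placeholders.
push_image() {
    image=$1   # local image name, e.g. multi-stage:latest
    dest=$2    # registry path without the transport prefix
    buildah push "$image" "docker://$dest"
}

# Usage:
#   push_image multi-stage:latest registry.example.com/demo/multi-stage:latest
```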
The beauty of Buildah is that we can continue to leverage the simplicity of the Dockerfile format, but we’re no longer bound by the limitations of it. People do some nasty, nasty things in a Dockerfile to hack everything onto a single line. This can make them hard to read, difficult to maintain, and it's inelegant.
When you combine the power of being able to manipulate images with native Linux tooling from the build host, you are now free to go beyond the Dockerfile commands! This opens up a ton of new possibilities for the content of container images, the security model involved, and the process for building.
A great example of this was explored in one of Tom Sweeney’s blog posts on creating minimal containers. Tom’s example of leveraging the build host’s package manager is a great one, and means we no longer require something like “yum” to be available in the final image.
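A minimal sketch of that idea, assuming a Red Hat Enterprise Linux 7 host with yum available, is shown here. The function name build_minimal, the package list, and the release version are assumptions chosen for illustration and are not taken from Tom's post.

```shell
#!/bin/sh
# Sketch: build a minimal image from scratch using the host's yum.
# build_minimal, the package list, and the release version are assumptions.
build_minimal() {
    image=${1:-minimal:latest}
    ctr=$(buildah from scratch) || return 1
    mnt=$(buildah mount "$ctr") || return 1
    # Install packages into the container's root filesystem from the host,
    # so the image itself never needs to contain a package manager.
    yum install -y --installroot="$mnt" --releasever=7 bash coreutils || return 1
    buildah unmount "$ctr" || return 1
    buildah commit "$ctr" "$image" || return 1
    buildah rm "$ctr"
}

# Usage (run with the privileges buildah and yum need):
#   build_minimal minimal:latest
```

Because the package manager runs on the build host and only writes into the mounted root, the committed image contains just the requested packages and their dependencies.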
On the security side, we no longer require access to the Docker socket, which is a win for performing builds from Kubernetes/OpenShift. In fairness, Buildah currently requires escalated privileges on the host, but soon this will no longer be the case. Finally, on the process side, we can leverage Buildah to augment any existing build process, be it a CI/CD pipeline or building from a Kubernetes cluster, to create simple and production-ready images.
Buildah provides all of the primitives needed to take advantage of the simplicity of Dockerfiles combined with the power of native Linux tooling, and is also paving the way to more secure container builds in OpenShift. If you are running Red Hat Enterprise Linux, or possibly an alternative Linux distribution, I highly recommend taking a look at Buildah and optimizing your container build process for production.