피드 구독

The concept of distroless is a popular idea in the world of containers. The idea is to package applications in container images while at the same time removing as much of the operating system as possible (package managers, libraries, shells, etc). This does provide some security benefits, but these benefits are often blown out of proportion because of a naive understanding of what an operating system is, how it works, and in particular what a Linux distribution is and how they work.

This article will try to give a clearer understanding of the actual benefits of Distroless while at the same time tempering the over-hyped marketing of Distroless. Let’s explore some fallacies.

 

2023 Gartner® Magic Quadrant™에서 리더로 선정된 Red Hat

Red Hat은 Gartner 2023 Magic Quadrant 컨테이너 관리 부문의 실행 능력 및 비전의 완성도에서 최고점을 획득했습니다.

Fallacy #1: Size is The Most Important to Attack Surface

Pratyusa K. Manadhata from Carnegie Mellon has an elegantly simple definition. In his paper An Attack Surface Metric, he states that “a system’s attack surface is the set of ways in which an adversary can enter the system and potentially cause damage.”  But, how does this translate to containers and container images?

Often, the attack surface of a container image is measured by the number of files in it, or how many megabytes of space it uses on disk. These measurements are naive proxies for the actual attack surface. To truly understand attack surface, a security analyst must understand several things:

  1. Not all files in a container image contribute to attack surface equally

  2. Files which are directly in the execution path (web servers, C libraries, etc) are more likely to expand the attack surface than files that don’t (shells, config files, etc.)

  3. The quality of software and configuration(aka files) in the direct execution path contribute to attack surface more than the size of the container image or the number of files contained in the image.

  4. The software in the direct execution path which is used in many different container images (C libraries, web servers, encryption libraries, etc) contributes a larger share to the attack surface.

  5. Standardizing on the exact same versions (Linux distribution and version) of this widely used software in the direct execution path (C libraries, web servers, encryption libraries, etc) reduces attack surface, and can make compliance and remediation easier.

Stated another way, standardization and quality of the software in your direct execution path lowers your attack surface more than distroless does. If everything can’t be run with the exact same distroless images, you will not benefit much from distroless.

Fallacy #2 You Can Actually Remove the Operating System from a Container Image

Much like cloud, there is no such thing as distroless, just somebody else’s Linux distro. So-called "distroless" container images are typically very slimmed down user space environments without package managers, shells or other apps you might find in a typical distribution. That's it.

An operating system, and more specifically, a Linux distro is made up of two main components: a kernel, and a user space. A kernel is fairly easy to understand, it’s a special program that runs on the hardware or virtual machine. The user space is a bit harder to understand. The user space includes everything you can see in a container image, for example, things like the C library (glibc, or muslc), web servers, encryption libraries, timezone data, locale data (language and cultural), etc.

The user space can’t truly be  removed from a distroless container image, and a distroless image must always run on a kernel.

That’s right, even Google’s distroless project relies on Debian for user space requirements. (See this request to the distroless project asking for them to rebase to the latest Debian Bullseye.) 

Even in a distroless set of container images, things like the Java virtual machine, Python, and Node.js are compiled against a C library which gives these user space programs access to low level functions in the Linux kernel (network sockets, storage volumes, files, etc). 

Linux distros exist because we just want to write applications, we don’t want to patch and maintain widely used libraries and infrastructure like timezone data, C libraries, etc. It’s completely logical for projects like distroless to rely on a Linux distro to provide this infrastructure. 

Don’t trick yourself into thinking that you’re reducing your attack surface with distroless alone. It’s a wash. There's still a distro in "distroless" containers, just less of one. See also: Do Linux distributions still matter with containers?

Fallacy #3: Hackers Can Break into Your Containers Using All of These Files Just Sitting There

This one makes me laugh. Most of the excess size in container images comes from spurious files that do nothing. Things like man pages, timezone data, locale data, debug binary data, etc. 

Ask yourself, how would a hacker use excess timezone data, provided by a Linux distro, to break into your container (not to be confused with a hacker sneaking bad time zone data in and confusing glibc. Notice, glibc)? While I’m not advocating for larger container images, it’s still a waste of space. However it’s difficult to argue that most of the bloat into container images is actually usable by attackers. I'm more concerned that a piece of Ruby (or whatever, no offense to Ruby) code copied/pasted into an application will create a CVE than a piece of timezone data.

Contrary to what we all think about bloat (it’s annoying), an attacker will likely just bring the tools they need with them. It’s quite common for attackers to use stack overflows and other exploits to write their own shells into memory. Once they have a shell, they can literally bring a toolkit with them. They probably don’t want to use the tools on the system, they want to use their own like a plumber who brings their own toolbox instead of using your tools in the garage.

Instead, think about the bloat across the fleet of cloud servers, network devices, routers, switches, as well as all of the different web servers (httpd, nginx, golang’s built-in one, as well as many scripting languages which have modules which implement http directly in the language). Think about this entire set of software across the entire environment. From encryption modules to TCP stacks, to implementations of the http protocol (aka web servers). 

This is the attack surface for your applications and infrastructure. All of this code is functional and in the direct execution path from the outside of your organization when a web request is served.

When you think about attack surface, the right way, the bloat in a container image seems like the least of your worries. I’m not saying it doesn’t matter, it does, but each different piece of software deployed in an environment is a new permutation. It’s the number of permutations that you should worry about first.  

Red Hat Universal Base Image Micro (UBI Micro)

You might be asking yourself, why did Red Hat release UBI Micro if distroless isn’t that big of a deal? Well, once you’ve standardized on a single glibc, a single OpenSSL library, and a single nginx or Apache web server version, the next optimization is indeed trimming the size of the individual services down. 

Once you’ve standardized on UBI as a whole, which are RHEL packages, now it’s time to trim the individual container images down. UBI Micro helps with that. UBI Micro is built from the same glibc as RHEL. 

When you add OpenSSL to UBI Micro, it’s the same package from RHEL. When you add httpd to UBI Micro, it’s the same Apache from RHEL. This should form a pretty clear picture. When you use UBI Micro, you’re not pulling yet another Linux distro into your environment, and thereby increasing your attack surface, you really are reducing it.

For more information on UBI micro, check out: Introduction to Red Hat's UBI Micro.


저자 소개

At Red Hat, Scott McCarty is Senior Principal Product Manager for RHEL Server, arguably the largest open source software business in the world. Focus areas include cloud, containers, workload expansion, and automation. Working closely with customers, partners, engineering teams, sales, marketing, other product teams, and even in the community, he combines personal experience with customer and partner feedback to enhance and tailor strategic capabilities in Red Hat Enterprise Linux.

McCarty is a social media start-up veteran, an e-commerce old timer, and a weathered government research technologist, with experience across a variety of companies and organizations, from seven person startups to 20,000 employee technology companies. This has culminated in a unique perspective on open source software development, delivery, and maintenance.

Read full bio
UI_Icon-Red_Hat-Close-A-Black-RGB

채널별 검색

automation icon

오토메이션

기술, 팀, 인프라를 위한 IT 자동화 최신 동향

AI icon

인공지능

고객이 어디서나 AI 워크로드를 실행할 수 있도록 지원하는 플랫폼 업데이트

open hybrid cloud icon

오픈 하이브리드 클라우드

하이브리드 클라우드로 더욱 유연한 미래를 구축하는 방법을 알아보세요

security icon

보안

환경과 기술 전반에 걸쳐 리스크를 감소하는 방법에 대한 최신 정보

edge icon

엣지 컴퓨팅

엣지에서의 운영을 단순화하는 플랫폼 업데이트

Infrastructure icon

인프라

세계적으로 인정받은 기업용 Linux 플랫폼에 대한 최신 정보

application development icon

애플리케이션

복잡한 애플리케이션에 대한 솔루션 더 보기

Original series icon

오리지널 쇼

엔터프라이즈 기술 분야의 제작자와 리더가 전하는 흥미로운 스토리