I have an interest in application deployment systems, having created three different application bunding system (1,2, 3). These were all experiments in the area of desktop applications, but recently there has been some interesting motions in related areas, namely Docker.
Docker is a server application/container deployment system, which nicely sidesteps a lot of the complexity with desktop apps (not having to integrate deeply with the desktop) which makes it a lot easier to deploy. Additionally, docker is more than a deployment system, it also has some interesting ideas about how to create and distribute applications.
Every docker container is a copy-on-write clone of a specific parent Image, which means instantiation of docker containers is very fast and cheap. It also gives some very interesting properties because you can track the changes in a container (compared to t its parent image) and “commit” this to a new image. This creates a git-like hierarchy of images where every commit is a filesystem layer that applies to a previous layer, up to some base image. The git-like workflow is really nice to work with when creating images, and the final result is very easy to share and deploy, and at the same time automatically shares as much as possible with common base images.
Unfortunately Docker relies on AUFS, a union filesystem that is not in the upstream kernel, nor is it likely to ever be there. Also, while AUFS is in the current Ubuntu kernel it is deprecated there and will eventually be removed. This means Docker doesn’t run on Fedora which has a primarily-upstream approach to packaging.
So, the last month or so I’ve been working on making Docker work in Fedora (and thus eventually in RHEL, which is the nr 1 requested Docker feature). Of course, this work will benefit other distributions that don’t have AUFS too.
I started looking at possible replacements for the copy-on-write support, and there are a few possibilities availible:
-
overlayfs
-
btrfs
-
lvm snapshots
-
lvm thin provisioning
Overlayfs is a different union filesystem implementation than AUFS, and the one that seems most likely to land upstream. But that is happening slowly, if at all. Long-term I think this is the best option, but right now it is out of the question.
Btrfs has copy-on-write both using filesystem snapshots and one a file basis using reflink. However, btrfs is not currently used much in production as its not considered stable enough. It would also be a very heavy dependency for Docker, as may users would have to reformat their disks to use it.
Lvm snapshots are useful for doing e.g. backup of a snapshot, but regress badly in performance when you start having many snapshots of the same device.
This leaves us with only lvm thin provisioning. This is a fairly recent, but relatively stable technology that allows you to create copy-on-write block devices that are “thinly” provisioned, meaning they don’t use real space until the device is in use. This is not ideal for Docker as it really wants copy-on-write at the file level, but with some work it is possible to work around this.
Rather than interacting with lvm which is a very generic volume manager I chose to use the lower level device-mapper kernel APIs directly (via libdevmapper). This allows us greater ease of access to the devices programmatically, as well as avoiding confusion with possible system use of LVM. Also it avoids some LVM performance issues with very many devices.
So, we set up a single large block device on which we create a device-mapper “thinp” pool. On this we then creates a single “base” block device formated with ext4. Every image and container are then created as snapshot (in multiple steps) from this base device. So, say you’re starting a container based on an image “apache” which itself is based on a “fedora” image, we would:
-
Create a snapshot of the base device.
-
Mount it and apply the changes in the fedora image.
-
Create a snapshot based on the fedora device.
-
Mount it and apply the changes in the apache image.
-
Create a snapshot based on the apache device.
-
Mount it and use as the root in the new container.
And of course, these devices will be reused (with corresponding steps skipped) as needed by other images/containers.
The devicemapper pool need to be set up on a large block device that fits all the images and containers that you will be used which would be painful for most people. Docker handles this by automatically creating the a large sparse file, using it as a loopback device for the devicemapper work. Additionally we ensure that DISCARD support is enabled in the filesystem so that any files removed in the conttainer filters down to the loopback file making it sparse again.
This means that there is no need for setup, and space for images and containers will only be used as needed. Of course, there are still issues, like the max size of the loopback mount (100G by default, but this should be easy to grow) and the max size of the base extt4 image (10G by default, resizing is harder after initial construction, but should be possible).
We’re currently in the process of landing this in Docker, and hope to have a 0.7 release out based on my device-mapper work pretty soon. Then I will continue working on making docker a first-class citizen on Fedora.
Sobre o autor
Alexander has worked at Red Hat since 2001, doing development work on desktop and containers, including the creation of flatpak and lots foundational work on Gnome.
Navegue por canal
Automação
Últimas novidades em automação de TI para empresas de tecnologia, equipes e ambientes
Inteligência artificial
Descubra as atualizações nas plataformas que proporcionam aos clientes executar suas cargas de trabalho de IA em qualquer ambiente
Nuvem híbrida aberta
Veja como construímos um futuro mais flexível com a nuvem híbrida
Segurança
Veja as últimas novidades sobre como reduzimos riscos em ambientes e tecnologias
Edge computing
Saiba quais são as atualizações nas plataformas que simplificam as operações na borda
Infraestrutura
Saiba o que há de mais recente na plataforma Linux empresarial líder mundial
Aplicações
Conheça nossas soluções desenvolvidas para ajudar você a superar os desafios mais complexos de aplicações
Programas originais
Veja as histórias divertidas de criadores e líderes em tecnologia empresarial
Produtos
- Red Hat Enterprise Linux
- Red Hat OpenShift
- Red Hat Ansible Automation Platform
- Red Hat Cloud Services
- Veja todos os produtos
Ferramentas
- Treinamento e certificação
- Minha conta
- Suporte ao cliente
- Recursos para desenvolvedores
- Encontre um parceiro
- Red Hat Ecosystem Catalog
- Calculadora de valor Red Hat
- Documentação
Experimente, compre, venda
Comunicação
- Contate o setor de vendas
- Fale com o Atendimento ao Cliente
- Contate o setor de treinamento
- Redes sociais
Sobre a Red Hat
A Red Hat é a líder mundial em soluções empresariais open source como Linux, nuvem, containers e Kubernetes. Fornecemos soluções robustas que facilitam o trabalho em diversas plataformas e ambientes, do datacenter principal até a borda da rede.
Selecione um idioma
Red Hat legal and privacy links
- Sobre a Red Hat
- Oportunidades de emprego
- Eventos
- Escritórios
- Fale com a Red Hat
- Blog da Red Hat
- Diversidade, equidade e inclusão
- Cool Stuff Store
- Red Hat Summit