Determine and Mitigate Impact of Docker Hub Pull Request Limits starting Nov 2nd
If you are using Docker Hub to distribute your containerized software project, you have by now likely received at least two emails about the new image pull consumption tiers. While the initially planned image retention policy (stale images would be deleted after six months) has been postponed to mid-2021, pull request limits will be enforced starting November 2nd.
How your users are going to be impacted
What this means is that, if you are using the free tier of Docker Hub, all your images are subject to a limit of 100 pulls per six hours, enforced per client IP for anonymous clients. Anonymous clients are all users who do not have a Docker Hub account or who do not log in via docker login before pulling an image. Anonymous pulls are also very common in CI/CD systems that build software from popular, public base images.
Pulls from authenticated users on the free tier of Docker Hub are limited to 200 per six hours.
What counts as a pull?
The new limits are enforced on a per-manifest basis. While in the early days of containers one image corresponded to one manifest, in today’s world of multi-arch images a container image is actually a manifest list, with one manifest/image per supported system architecture (e.g. x86_64, aarch64, etc.).
Starting November 2nd, each request for a single manifest counts as one pull. In the case of multi-arch images, however, most clients only download the one manifest that matches the system they are running on, so pulling a multi-arch image still counts as a single pull.
It is important to note, however, that a pull is also counted if the client system already has all the image layers present and nothing is actually downloaded. That means image caching does not reduce the number of pulls counted against the limit.
How to determine if you reached the pull request limit
From a user perspective, since the pull limits are enforced per client IP, it might be hard to predict if and when the limits will be reached. You can, however, simulate what happens when that is the case. There are two test repositories available that already have the limits enforced, one of which is permanently at the rate limit. Clients react differently to these.
$ docker pull docker.io/ratelimitalways/test:latest
Error response from daemon: toomanyrequests: Too Many Requests. Please see https://docs.docker.com/docker-hub/download-rate-limit/
This test repository has rate limiting enabled and always in effect. The pull request immediately aborts because the registry returns HTTP 429 (toomanyrequests).
If you are a podman user, the behavior is different:
$ podman pull docker.io/ratelimitalways/test:latest
Trying to pull docker.io/ratelimitalways/test:latest…
This command will initially appear to hang but eventually returns after about 15 minutes. With a more verbose log level we can see what is going on:
$ podman --log-level debug pull docker.io/ratelimitalways/test:latest
INFO[0000] podman filtering at log level debug
...
[some lines omitted]
...
DEBU[0000] GET https://registry-1.docker.io/v2/ratelimitalways/test/manifests/latest
DEBU[0001] Detected 'Retry-After' header "60"
DEBU[0001] Too many requests to https://registry-1.docker.io/v2/ratelimitalways/test/manifests/latest: sleeping for 60.000000 seconds before next attempt
As you can see, the registry not only returns the “toomanyrequests” HTTP code but also specifies a desired retry interval of 60 seconds via the “Retry-After” response header. By default, podman retries 5 times on HTTP 429, respecting the pause duration specified in the “Retry-After” header; after 5 retries it backs off and considers the attempt failed. On top of that, podman by default retries failed pulls 3 times, hence the overall duration of roughly 15 minutes.
It eventually fails like the docker client:
DEBU[0977] Error pulling image ref //ratelimitalways/test:latest: Error initializing source docker://ratelimitalways/test:latest: Error reading manifest latest in docker.io/ratelimitalways/test: toomanyrequests: Too Many Requests. Please see https://docs.docker.com/docker-hub/download-rate-limit/
As of the time of writing, there is also the ratelimitpreview/test repository available, which has request counting enabled and limits that supposedly kick in after the announced thresholds. However, the author could not yet trigger an enforced rate limit there.
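You can also inspect your current limit directly: according to Docker’s rate-limit documentation, the registry reports it via ratelimit-limit and ratelimit-remaining response headers on manifest requests. A minimal sketch of such a check, using the ratelimitpreview/test repository (the header handling is illustrative; a HEAD request on the manifest should not itself count as a pull):

```shell
# Sketch of a helper that queries Docker Hub's rate-limit headers.
check_docker_ratelimit() {
  # Fetch an anonymous pull token for the test repository
  token=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" \
    | sed -n 's/.*"token":"\([^"]*\)".*/\1/p')
  # A HEAD request on the manifest returns the rate-limit headers
  # (ratelimit-limit / ratelimit-remaining) without counting as a pull
  curl -sI -H "Authorization: Bearer $token" \
    "https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest" \
    | grep -i '^ratelimit'
}
```

Running check_docker_ratelimit should print headers along the lines of ratelimit-limit: 100;w=21600, i.e. 100 pulls per 21600-second (six-hour) window.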
Impact
Assessing the impact will be challenging. Anonymous pulls from Docker Hub are widely used in the FOSS community, especially in CI/CD systems. Almost everybody has references to public images on Docker Hub in their container platforms, and many software build pipelines create containerized software from base images in public repositories.
Container platforms like Kubernetes and OpenShift might run into these limits when trying to scale or re-schedule a deployment from such an image, even when the nodes have the image cached. These events occur constantly in any container orchestration environment and are very likely to rapidly exhaust the quota of 100/200 pulls in six hours, which might cause a service outage. CI/CD pipelines might then fail to build and roll out your software, and those are usually the recovery tool of choice in such outages.
Mitigation strategies
For an enterprise DevOps practice, relying on such a critical service via a free-tier offering is usually not acceptable. Especially for on-premise environments, the ongoing dependency on an online service is not considered a long-term solution.
For these environments, enterprise users can leverage Red Hat Quay to provide a scalable and secure container registry platform on top of any supported on- and off-premise infrastructure. It provides massive performance in container image distribution, combined with the ability to scan container image contents for security vulnerabilities, while providing strict multi-tenancy.
Such a deployment is not limited to a single data center or cloud region but can be scaled across the globe using geo-replication. On top of that, content can be copied into a Red Hat Quay instance on a continuous basis from any other container registry via repository mirroring, so you can provide a fast, local cache of public image repositories. For the future we are also planning to have Red Hat Quay run as a transparent proxy cache.
Example of a repository mirroring configuration in Red Hat Quay
On the other end of the spectrum there are customers that do not need their own registry service. And then there are the thousands of volunteers maintaining open source projects and containerized software.
For these audiences there is the online version of Red Hat Quay available at Quay.io. This is a public container registry service that shares the same code base as Red Hat Quay and has had a proven track record in the open source community for more than six years. In August of this year, the platform served over 6 billion container image pulls with 100% uptime.
Quay.io not only hosts your container images and serves them to any OCI-compatible client (docker, podman, etc.), but it can also build your software. It connects to a source code management system of your choice (e.g. GitHub or GitLab) and builds images from your Dockerfile on every commit. At the same time it provides image content scanning, so you are made aware when your published images contain any known security vulnerabilities. This scanning covers a variety of OS package managers (apt, apk, yum, dnf) and language package managers (Python pip) used inside container images.
Overview of the security vulnerabilities found in the official PostgreSQL container images by Red Hat Quay
Another alternative for CI/CD systems is to use a base image from a different registry, like the Universal Base Image, which contains a basic Red Hat Enterprise Linux environment and is free to use.
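Switching a build to UBI is usually just a matter of changing the FROM line. A minimal sketch (the file name and original base image are made up for illustration):

```shell
# Create an example Dockerfile that builds from a Docker Hub base image
printf 'FROM centos:8\nRUN echo hello\n' > Dockerfile.example

# Point it at the Universal Base Image instead; pulls from
# registry.access.redhat.com are not subject to Docker Hub's limits
sed -i 's|^FROM centos:8|FROM registry.access.redhat.com/ubi8/ubi:latest|' Dockerfile.example

head -n1 Dockerfile.example
```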
Migrating images with skopeo
In case you want to migrate your existing images to another registry like Quay.io, you can leverage skopeo. Like podman and buildah, it is part of a toolchain that enables working with containers and images without a running docker daemon and without requiring elevated privileges or root access on your OS.
skopeo can be used to easily copy your container images from one registry to another, like so:
$ skopeo login docker.io
Username: dmesser
Password:
Login Succeeded!
$ skopeo login quay.io
Username: dmesser
Password:
Login Succeeded!
$ skopeo sync --src docker --dest docker docker.io/dmesser/nginx quay.io/dmesser/
INFO[0000] Tag presence check imagename=docker.io/dmesser/nginx tagged=false
INFO[0000] Getting tags image=docker.io/dmesser/nginx
INFO[0001] Copying image tag 1/1 from="docker://dmesser/nginx:latest" to="docker://quay.io/dmesser/nginx:latest"
Getting image source signatures
Copying blob bc51dd8edc1b done
Copying blob 66ba67045f57 done
Copying blob bf317aa10aa5 done
Copying config 2073e0bcb6 done
Writing manifest to image destination
Storing signatures
INFO[0012] Synced 1 images from 1 sources
This is all it takes to sync an entire repository called nginx, including all tags, from Docker Hub to Quay.io.
$ podman pull quay.io/dmesser/nginx
Trying to pull quay.io/dmesser/nginx...
Getting image source signatures
Copying blob bf317aa10aa5 done
Copying blob 66ba67045f57 done
Copying blob bc51dd8edc1b done
Copying config 2073e0bcb6 done
Writing manifest to image destination
Storing signatures
2073e0bcb60ee98548d313ead5eacbfe16d9054f8800a32bedd859922a99a6e1
For mass migration of entire repositories, skopeo has great facilities for automation; check out the skopeo-sync documentation. This is suitable for one-off migrations as well as regular synchronization of incremental changes as part of a simple cron job.
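As a sketch of such a cron job, reusing the repository names from the example above (the script path in the comment is hypothetical):

```shell
# nightly-sync.sh: mirror a Docker Hub repository into Quay.io.
# Assumes credentials were stored beforehand via `skopeo login`.
nightly_sync() {
  skopeo sync --src docker --dest docker \
    docker.io/dmesser/nginx quay.io/dmesser/
}

# Schedule it from cron, e.g. nightly at 02:00:
# 0 2 * * * /usr/local/bin/nightly-sync.sh
```

Because skopeo sync only copies tags that changed, re-running it regularly keeps the mirror current without re-transferring unchanged layers.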
Notice that by default, Quay.io repositories are private after creation. You can make them public in the settings menu of the repository. This is a default we plan to make configurable in the future.
Quay.io comes with a free tier that does not incur any cost and allows unlimited public container images. Subscription models are available, ranging from plans for developers who need private repositories all the way to offerings suitable for entire organizations or companies; check out the available plans.