Standards are powerful forces in the software industry. They can drive technology forward by bringing together the combined efforts of multiple developers, different communities and even competing vendors. Over the past year there has been no greater example of this than the evolution of Linux containers technology driven by the Docker open source project. In OpenShift, which is built on containers technology, we recognized early on the power of this emerging standard and today the weight of Red Hat is squarely behind it.
Driving a Standard for Linux Containers
Container technology is far from new, with a heritage that goes back almost a decade in Linux and longer still in Unix. However, usage of Linux containers was long limited to select users and organizations that understood the power of the technology and had the sophistication to harness it. The advent of Platform-as-a-Service (PaaS) saw multiple PaaS vendors adopt containers, including OpenShift, Google App Engine, Heroku, and more. In fact, it was from the PaaS space that the Docker technology first emerged.
Today, hundreds of developers contribute to the Docker community project, as do multiple vendors, and their combined efforts have brought Linux containers into the mainstream. The latest evidence of this momentum came from Microsoft, who recently announced that they too would join the cause in an effort to bring containerization to Windows.
While a few PaaS vendors like Pivotal are still pushing their own platform-specific container management solutions, we at Red Hat were early proponents of Docker and quickly became one of the leading contributors to the community project. This enables us to standardize Linux containers across our own solutions, including Red Hat Enterprise Linux, OpenShift, Red Hat Enterprise Linux Atomic Host and more, even as we help drive a standard for Linux containers in the industry.
Kubernetes as a Standard for Container Orchestration & Management
As I discussed in my prior blog post, however, applications in OpenShift typically span multiple containers, and those containers need to be deployed across multiple hosts. One of the key requirements for OpenShift is a system for orchestrating and managing these containers at very large scale. With more than two million containerized applications deployed on our own OpenShift Online service since inception, we’ve gained quite a bit of experience managing containers at scale. We tapped into that experience when we set out to build our next generation container orchestration capabilities for OpenShift v3 and launched efforts like the GearD project.
Google also knows a little something about web scale container orchestration, as containers power most of their services. When Google notified us of their intent to launch the Kubernetes project for container orchestration and management, we saw the opportunity to collaborate and drive another standard to propel containers technology forward.
Today you will find multiple Red Hat developers among the leading contributors to Kubernetes, just as you will in Docker. We’ve taken our initial work on container orchestration in GearD and our experience from running OpenShift over the past four years and are using that to help drive capabilities in Kubernetes, together with Google and other contributors.
Red Hat’s Clayton Coleman, who initiated the GearD project, is now one of the leading contributors to Kubernetes and has worked with other OpenShift developers to integrate our initial GearD orchestration efforts upstream. Other contributors have shown up in great numbers to support the project, as have vendors like IBM, Microsoft, and others.
Kubernetes relies on Docker to package, instantiate, and run application containers. The power of Kubernetes is in the declarative model it implements for managing containerized application deployments. Rather than specifying which containers to deploy where (as in an “imperative model”), a user declares the desired end state that should be maintained. With this declared state established, Kubernetes can then employ automated self-healing mechanisms such as automatically restarting containers, rescheduling containers on different hosts, or replicating containers for use cases such as auto-scaling. For example, if your application server cluster should have four server instances, each running in its own container, then that state can be declared and Kubernetes will deploy and maintain that state over time, starting or restarting containers as needed.
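To make the declarative idea concrete, here is a minimal sketch in Python of a reconciliation loop for the four-instance example above. It is not Kubernetes code: the `Cluster` class, the `reconcile` function, and the `app-server-N` container names are all hypothetical, and the "cluster" is just an in-memory set.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Cluster:
    """Toy stand-in for a cluster: a set of running container IDs (hypothetical)."""
    running: set = field(default_factory=set)
    _ids: count = field(default_factory=count, repr=False)

    def start_container(self) -> str:
        # Give each new container a unique, increasing ID.
        container_id = f"app-server-{next(self._ids)}"
        self.running.add(container_id)
        return container_id

    def stop_container(self, container_id: str) -> None:
        self.running.discard(container_id)


def reconcile(cluster: Cluster, desired_replicas: int) -> None:
    """Drive the observed state toward the declared state."""
    # Too few containers: start more until the declared count is reached.
    while len(cluster.running) < desired_replicas:
        cluster.start_container()
    # Too many containers: stop extras until the declared count is reached.
    while len(cluster.running) > desired_replicas:
        cluster.stop_container(sorted(cluster.running)[-1])


cluster = Cluster()
reconcile(cluster, desired_replicas=4)   # initial deployment
cluster.stop_container("app-server-1")   # simulate a crashed container
reconcile(cluster, desired_replicas=4)   # self-healing pass replaces it
print(len(cluster.running))
```

The caller never says which container to start or where; it only states that four should exist, and each pass of `reconcile` closes whatever gap it finds.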
How Kubernetes Works
In Kubernetes, Linux host instances referred to as “masters” are used to manage and schedule container deployments as well as manage state. This has historically been the function of the broker tier in OpenShift. The Kubernetes master uses the etcd repository for storing state, although it may be possible to use alternative repositories in the future. The master also contains the scheduler, which handles placement of pods onto selected hosts. This scheduler is pluggable, so it can be replaced with third-party schedulers or cluster managers. Finally, the master provides different controllers, such as the replication controller, and an authenticated API server to interface with clients.
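As a rough illustration of what a pluggable scheduler means, the sketch below treats a scheduler as any function that picks a host for a pod. The `Scheduler` signature, the `least_loaded` policy, and the host and pod names are assumptions made for this example; they do not reflect the actual Kubernetes scheduler interface or its placement algorithm.

```python
from typing import Callable, Dict, List

# Hypothetical plug-in point: a scheduler is any callable that, given a pod
# name and the current host -> pods mapping, returns the chosen host.
Scheduler = Callable[[str, Dict[str, List[str]]], str]


def least_loaded(pod: str, hosts: Dict[str, List[str]]) -> str:
    """Pick the host running the fewest pods (ties broken by host name)."""
    return min(sorted(hosts), key=lambda host: len(hosts[host]))


def place(pods: List[str], hosts: Dict[str, List[str]], scheduler: Scheduler) -> None:
    """Assign each pod to whichever host the supplied scheduler selects."""
    for pod in pods:
        hosts[scheduler(pod, hosts)].append(pod)


hosts: Dict[str, List[str]] = {"node-a": [], "node-b": []}
place(["web-1", "web-2", "db-1"], hosts, least_loaded)
print(hosts)
```

Swapping in a different policy only requires passing another function with the same shape to `place`.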
The Kubernetes node instances (previously referred to as minions) are where the containers actually run. A node agent, or kubelet, manages the containers on each host to maintain the desired state described by the master. Kubernetes deploys containers in pods, with each pod including one or more related containers. The containers in each pod share an IP address and data volumes and run on the same host. A pod may contain only a single container, but an example of a multi-container pod would be a database (e.g. PostgreSQL) plus an admin tool (pgadmin), or an application server plus a management agent.
Labels are used to identify and group related pods across hosts. A service in Kubernetes is a logical set of pods (identified by a service label) which can be accessed as a unit, such as a database service that may consist of multiple database instances (each in their own pod). This facilitates the deployment of microservices that span multiple containers. A service proxy is used to proxy requests across those containers.
We also see the Kubernetes project driving innovation across adjacent communities, like Apache Mesos and Apache Hadoop YARN in the scheduler and cluster management space. Both of these communities are integrating their solutions with Kubernetes, bringing large scale cluster management capabilities to application services. This was highlighted recently in a blog post by Hortonworks, which discussed the work they are doing to integrate YARN with Kubernetes and OpenShift v3.
While multiple solutions for container orchestration continue to sprout in the Docker community, and others are developing their own platform-specific orchestration and scheduling solutions (e.g. Cloud Foundry Diego), we believe that over time the power of this emerging standard will win out.
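One way to picture a pod is as a small record that groups containers with the resources they share. The sketch below models the database-plus-admin-tool example; the `Pod` and `Container` classes, the image names, the IP address, and the volume name are all made up for illustration and are not the Kubernetes pod schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Container:
    name: str
    image: str  # hypothetical image name


@dataclass
class Pod:
    name: str
    host: str   # every container in the pod runs on this one host
    ip: str     # a single IP address shared by all containers in the pod
    volumes: List[str] = field(default_factory=list)       # shared data volumes
    containers: List[Container] = field(default_factory=list)


# A multi-container pod: a database plus its admin tool.
db_pod = Pod(
    name="db",
    host="node-a",
    ip="10.0.0.5",
    volumes=["pgdata"],
    containers=[
        Container("postgresql", "example/postgresql"),
        Container("pgadmin", "example/pgadmin"),
    ],
)
print([c.name for c in db_pod.containers])
```

Because the host, IP address, and volumes live on the pod rather than on each container, the two containers are always placed and addressed together.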
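The relationship between labels, services, and the service proxy can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the `select` and `service_proxy` functions, the `service` label key, the pod names, and the round-robin policy are all hypothetical choices for this example rather than the behavior of the real Kubernetes proxy.

```python
from itertools import cycle
from typing import Dict, Iterator, List

# Hypothetical pods spread across two hosts, each tagged with a service label.
pods: List[Dict] = [
    {"name": "db-1", "host": "node-a", "labels": {"service": "database"}},
    {"name": "db-2", "host": "node-b", "labels": {"service": "database"}},
    {"name": "web-1", "host": "node-a", "labels": {"service": "frontend"}},
]


def select(pods: List[Dict], selector: Dict[str, str]) -> List[Dict]:
    """Return the pods whose labels match every key/value in the selector."""
    return [
        pod for pod in pods
        if all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


def service_proxy(pods: List[Dict], selector: Dict[str, str]) -> Iterator[str]:
    """Yield backend pod names in round-robin order for the selected service."""
    return cycle(pod["name"] for pod in select(pods, selector))


database = service_proxy(pods, {"service": "database"})
targets = [next(database) for _ in range(3)]
print(targets)
```

Clients address the service as a unit; which pod answers, and which host it runs on, is decided by the selector and the proxy.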
Kubernetes is the key component for managing and ensuring the reliability of application deployments in OpenShift v3. It not only orchestrates and schedules container deployments across a cluster of server hosts, but it also automates container health management. Leveraging a declarative model and automated controllers, Kubernetes brings the power of web scale container orchestration and management to OpenShift users.
If you want to play with Kubernetes in our latest upstream Origin builds, check out our recent OpenShift v3 deep dive blog. In my next blog I will discuss how we are extending Kubernetes to provide additional functionality in OpenShift v3.