This post was inspired by a recent article from CircleCI and a number of conversations I've had since beginning to contribute to Red Hat's OpenShift 3 product. Both in that article and in my conversations, there is often confusion about the motivations for and direction of recent changes in web development. I aim to describe at a high level the how and the why of building, deploying, and maintaining a web application using some emerging concepts like containerization and tools like OpenShift 3 that make the process more efficient, the product more agile and resilient, and the developer happier and more successful.
To understand why choosing OpenShift to develop a web application leads to these benefits, I'll explain why developers are changing the way they deploy applications to make use of containers, why web applications are adopting microservices-based architectures, and why containers natively enable the use of a microservices-based architecture. Finally, I'll show why OpenShift makes it easy for a developer to apply all of this innovation to their product.
Motivations for Change
The architecture of a web application has been rapidly changing since the time when fully static pages without CSS were common. The development of web application architecture has played a monumental role in the advent of virtualization, clustering and server distribution, and cloud computing. As these concepts have gained clout and been integrated into the application deployment process, the web application has changed tremendously for the better: applications became interactive and more dynamic, allowing for a better user experience; they grew more agile and easier to develop and maintain; and they grew less monolithic, more segmented and modular, and therefore more resistant to service disruptions from server issues.
The latest paradigm-shifting change in development architecture is the microservices-based architecture, which aims to decentralize web applications in order to make them scalable, language- and framework-agnostic, and ultimately more suitable for deployment on the web. Microservices are the result of factoring code into small, cohesive modules that each have a narrow scope and dedicated logic. Microservices follow a "smart endpoints, dumb pipes" model where each service can adeptly consume and transform input, while all services expose language-agnostic APIs so that communication can happen without any knowledge of another service's internal state or logic. Containers are quickly becoming a mature platform on which microservices can be deployed.
The latest paradigm-shifting change in deployment architecture is containerization, which aims to improve the application architecture by allowing an even more efficient method for process isolation. Containerization is the creation of lean sandboxes that can contain a single process and coexist with many other containers on a single host operating system, thereby reducing overhead as compared with common virtualization setups.
Like the other advances in architecture, containerization is aimed at solving a simple problem: the more strongly-coupled an application (be it a web application, an intranet service, etc) is to the physicalities of the world we develop it in, the more difficult it is for this application to meet the ever-changing needs of the environment it is built to serve. Virtualization began to outline a process for isolating software from hardware by abstracting away the concept of a machine and allowing many operating systems to live in concert on a common server. Containerization takes this one step further and enables users to isolate processes from the operating system and each other. Cutting-edge web architecture makes extensive use of containers, so it is vital to understand what they are and how they are implemented in order to fulfill their promises.
The three major strategies that are used to ensure that containers do contain their processes are: the use of Linux concepts like namespaces and control groups for isolating containers from each other and the host, the restriction of container capabilities or permissions to restrict behavior, and integration with pre-existing Linux security features like SELinux or AppArmor.
Container Sandboxing Using Built-In Linux Concepts
Containers are built on Linux machines and can encompass almost any valid Linux process. Containers are therefore able to leverage two Linux kernel features: namespaces to give each container a virtual copy of system resources and control groups (cgroups) to restrict each container to only use some quota of resources from the host. Furthermore, containers can utilize a practice often used by system administrators (chroot) to isolate container file systems from each other.
A set of unique kernel namespaces allows the most straightforward and effective isolation of containers from each other and the host system. Kernel namespaces are an abstraction similar to virtual memory, allowing each process in the namespace to see system resources as existing solely for it. Each container furthermore gets its own network stack, so it has no privileged access to the sockets or interfaces of other containers. Containers can still interact with each other through their network interfaces just as they would with remote hosts, if their host system is configured to allow it.
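To make namespaces a little more concrete, here is a minimal sketch of how one might inspect them from Python. It assumes a Linux host with /proc mounted, where each entry under /proc/<pid>/ns is a symbolic link naming the namespace a process belongs to. The helper names namespace_ids and shares_namespace are my own, not part of any container tooling.

```python
import os


def namespace_ids(pid="self"):
    """Return {namespace_name: identifier} for a process, or {} if unavailable.

    Assumes a Linux host with /proc mounted; on other systems the directory
    does not exist and an empty dict is returned.
    """
    ns_dir = f"/proc/{pid}/ns"
    ids = {}
    try:
        names = os.listdir(ns_dir)
    except OSError:
        return ids
    for name in sorted(names):
        try:
            # Each entry is a symlink whose target looks like "net:[4026531992]".
            ids[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Some entries may be unreadable without extra privileges.
            continue
    return ids


def shares_namespace(pid_a, pid_b, name):
    """True if both processes report the same identifier for a namespace."""
    a = namespace_ids(pid_a).get(name)
    b = namespace_ids(pid_b).get(name)
    return a is not None and a == b
```

Two processes in the same container should report the same identifiers, while a process in another container would typically report different ones for the namespaces that were unshared.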
Membership in a control group allows for the protection of other containers and the host from a given container as well, but in an entirely different way. Linux control groups allow for the accounting and restriction of resource usage for member processes in a group. By limiting resource usage, a container is not directly limited from affecting the data of another container or the host, but it is directly stopped from being a vector for a denial of service (DoS) attack. Control groups are therefore vital for multi-tenant environments where one host can have many tenant containers.
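The accounting idea can be sketched with a toy model. This is not how the kernel implements cgroups; the ControlGroup class, its methods, and the numbers below are hypothetical, and only illustrate that a charge is refused once a group's quota would be exceeded, leaving other groups untouched.

```python
class QuotaExceeded(Exception):
    """Raised when a charge would push a group past its limit."""


class ControlGroup:
    """Toy model of cgroup-style accounting for a single resource."""

    def __init__(self, name, limit):
        self.name = name
        self.limit = limit
        self.usage = {}  # member id -> amount currently charged

    def total(self):
        """Total amount charged to all members of this group."""
        return sum(self.usage.values())

    def charge(self, member, amount):
        """Account `amount` to `member`, refusing if the limit would be exceeded."""
        if amount < 0:
            raise ValueError("amount must be non-negative")
        if self.total() + amount > self.limit:
            raise QuotaExceeded(
                f"{self.name}: {self.total()} + {amount} exceeds {self.limit}"
            )
        self.usage[member] = self.usage.get(member, 0) + amount


if __name__ == "__main__":
    tenant_a = ControlGroup("tenant-a", limit=100)
    tenant_b = ControlGroup("tenant-b", limit=100)
    tenant_a.charge("worker-1", 80)
    try:
        tenant_a.charge("worker-2", 30)  # would exceed tenant-a's quota
    except QuotaExceeded as err:
        print("refused:", err)
    tenant_b.charge("worker-1", 100)  # tenant-b is unaffected
```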
Just Enough Capabilities to Be Sufficient
Traditional UNIX implementations of permission checking allowed for two levels of privilege: all or nothing. The privileged or superuser processes were allowed to bypass every single kernel permission check, while unprivileged processes were subject to all checks. Modern Linux allows for the concept of capabilities: toggle-able settings for each superuser privilege. For instance, a process can be given the right to bind to a privileged port (below 1024) without having the right to load a new kernel module. This granularity allows for precise control of access for processes, which makes it imperative to set capabilities thoughtfully in order to maximize safety. Containers are spawned with a very limited set of capabilities, and are only granted further powers if the administrator deems it necessary for a container to function. Containers are not allowed to set their own level of privilege, so container creators can give them just the right level of capability to make them useful for their process but useless for an attacker.
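Capabilities behave like a set of independent flags, which a short sketch can model. The bit positions below follow the Linux capability numbering (for example, CAP_NET_BIND_SERVICE is 10 and CAP_SYS_MODULE is 16), but the DEFAULT_CAPS set and the helper functions are assumptions for illustration, not the defaults of any particular container runtime.

```python
import enum


class Cap(enum.IntFlag):
    """A handful of Linux capabilities, keyed by their kernel bit positions."""

    CHOWN = 1 << 0
    KILL = 1 << 5
    SETGID = 1 << 6
    SETUID = 1 << 7
    NET_BIND_SERVICE = 1 << 10  # bind to ports below 1024
    NET_ADMIN = 1 << 12
    SYS_MODULE = 1 << 16  # load and unload kernel modules
    SYS_ADMIN = 1 << 21


# Hypothetical default set for illustration; real runtimes choose their own.
DEFAULT_CAPS = Cap.CHOWN | Cap.KILL | Cap.SETGID | Cap.SETUID | Cap.NET_BIND_SERVICE


def effective_caps(add=Cap(0), drop=Cap(0), base=DEFAULT_CAPS):
    """Start from `base`, grant `add`, then remove `drop` (drop wins)."""
    return Cap((int(base) | int(add)) & ~int(drop))


def permitted(caps, needed):
    """True only if every capability in `needed` is present in `caps`."""
    return (int(caps) & int(needed)) == int(needed)
```

With these definitions, permitted(effective_caps(), Cap.NET_BIND_SERVICE) holds while permitted(effective_caps(), Cap.SYS_MODULE) does not, unless an administrator explicitly passes add=Cap.SYS_MODULE.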
Containers are Not Special in the Eyes of Security Policies
Modern Linux distributions are typically shipped with either Security-Enhanced Linux (SELinux) on RPM-based distributions or AppArmor on Debian-based distributions. These security modules add mandatory access control to subsystems of the kernel, allowing for the confinement of processes and a reduction in the possible damage any given process can do to the system. SELinux is an integral component of the Linux security story, and the widespread adoption of Linux for servers handling sensitive information owes much to the inclusion of SELinux in the kernel. It is therefore vital that containers are not seen as special in the eyes of these security policies, so that all of the security that comes with using a Linux machine is present with their use. This security is added on top of the special security rules baked into containers by design, further maturing containers as a responsible choice for application deployment.
Containerization Enables Microservices
Microservices as an architecture pattern are gaining traction very quickly for all web applications, although large-scale applications were adopting the pattern as early as 2006.
In a microservices-based architecture, modules of logic are functionally decomposed into a set of collaborating services. Each service is responsible for a narrow set of functions and exposes a language-agnostic synchronous HTTP/RESTful interface or asynchronous interface in order to interact with other services in the application. The fulfillment of a request or user interaction involves the propagation of requests from one service to another.
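As a rough sketch of what such a synchronous HTTP interface could look like, the example below uses only the Python standard library. The "pricing" service, its /price/<item> route, and its data are entirely made up; the point is that a collaborating service only needs the URL and the JSON shape, not the implementation behind them.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class PriceHandler(BaseHTTPRequestHandler):
    """Hypothetical pricing service: GET /price/<item> returns JSON."""

    PRICES = {"widget": 250, "gadget": 1200}  # made-up data, in cents

    def do_GET(self):
        item = self.path.rsplit("/", 1)[-1]
        if self.path.startswith("/price/") and item in self.PRICES:
            self._send(200, {"item": item, "cents": self.PRICES[item]})
        else:
            self._send(404, {"error": "not found"})

    def _send(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def start_service(handler=PriceHandler):
    """Start the service on an ephemeral localhost port; return (server, base_url)."""
    server = HTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, f"http://127.0.0.1:{server.server_address[1]}"


# An opener that ignores any proxy settings, so localhost calls stay local.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch_price(base_url, item):
    """What a collaborating service would do: call the API and parse the JSON."""
    with _opener.open(f"{base_url}/price/{item}", timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A caller would run server, url = start_service(), then fetch_price(url, "widget"), and finally server.shutdown(). The client could just as easily be written in another language, which is the appeal of the pattern.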
Architecture and Benefits
The microservices-based architecture is in stark contrast to monolithic applications. A monolithic application may be split into three canonical parts - the client-facing interface, the server-side application, and a relational database or other persistent storage module. In contrast, a microservices-based application may be composed of hundreds of modules built around related business capabilities and fully independent, tied loosely through a language-agnostic communication API.
There are many benefits to the microservices-based architecture. Microservices are small, and therefore easier for a developer to understand. The microservices are also independent of each other - as long as a service's RESTful API remains unchanged, its implementation is irrelevant to other services and can be changed without causing side-effects, enabling fast development and more agility for the development team to react to external pressures. The loose coupling of these services through common interfaces furthermore allows them to be easily scalable and therefore well suited for web-scale deployment. This loose coupling also allows for resilience to faults in the codebase as well as faults in the physical servers running the application - when request fulfillment is agnostic to which replica of a service is used, server downtime or unhealthy code state in one service replica can be remedied by deploying another. As the API used for communication between services is not dependent on any specific technology stack, this implementation also allows for agility in choosing frameworks.
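The resilience argument rests on callers not caring which replica answers. A minimal sketch of that idea, with a hypothetical ReplicaPool class and made-up replica names, is a round-robin picker that skips replicas marked unhealthy and accepts replacements:

```python
class ReplicaPool:
    """Round-robin over replicas of one service, skipping unhealthy ones."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        if not self.replicas:
            raise ValueError("at least one replica is required")
        self.healthy = set(self.replicas)
        self._next = 0

    def mark_unhealthy(self, replica):
        """Stop routing requests to a replica that failed a health check."""
        self.healthy.discard(replica)

    def replace(self, old, new):
        """Swap a failed replica for a freshly deployed one."""
        index = self.replicas.index(old)
        self.replicas[index] = new
        self.healthy.discard(old)
        self.healthy.add(new)

    def pick(self):
        """Return the next healthy replica, trying each at most once."""
        for _ in range(len(self.replicas)):
            candidate = self.replicas[self._next]
            self._next = (self._next + 1) % len(self.replicas)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy replicas available")
```

Because every replica exposes the same API, the caller's code does not change when one replica is marked unhealthy or swapped out.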
Implementation Using Containers
The parallels between a large, loosely-coupled group of services that interact through HTTP requests and a conglomerate of containers which are decoupled by definition and can only refer to each other as remote hosts are obvious. Unfortunately, implementing a cohort of containers to fulfill a product as microservices does take work and is not as easy as downloading Docker and writing some configuration files. Docker's own support for multi-container applications is only moving from alpha releases to beta releases as this post is written. Therefore, other tools must be used to orchestrate containers. While tools like Amazon Web Services Elastic Beanstalk support these types of environments, they do not begin to define what it means to be a service, how containers should be built and deployed in order to fulfill a service, etc. Red Hat's OpenShift 3 aims to be the tool that makes this process streamlined and painless.
Many of the downsides to building a microservices-based application can be greatly alleviated with the use of containerization. For instance, if your application is deployed in a cloud environment, small services may not need enough resources to warrant getting their own node on the cloud, but will need one nevertheless to ensure segregation between services, so much of each node's capacity goes unused. If the services were containerized, however, multiple services could be co-located on the same host without reducing the degree to which they are isolated from each other.
Furthermore, as different languages and frameworks have distinct advantages in specific situations, different languages may be used for the implementation of different services for a web application. Configuring a server to run the specific frameworks needed for a service, while configuring other servers differently can quickly become a logistical nightmare of dependencies and book-keeping. Containerizing services allows for each container to keep track of necessary requirements and package them together with the application code, allowing for immeasurably easier deployments.
Using OpenShift 3
Red Hat's OpenShift 3 platform-as-a-service aims to be a platform on which developers can build web application source code into Docker images that represent their underlying services and then deploy these images onto a server or cloud provider to create a highly available and resilient application. By building on Kubernetes, the open source container orchestration project started at Google, OpenShift is able to harness the power of a sophisticated container orchestration platform. The concept of a service is well formed: at a high level, a service wraps around any number of replicas of a container running the business logic to fulfill the service requests. The service is governed by a replication controller which ensures that a specific number of pods, which embed Docker containers, are kept running at all times - if service outages happen, the replication controller ensures that other pods are created to replace those that become unhealthy. The service sits behind an external load balancer (e.g. HAProxy) and an auto-scaler that informs the replication controller when the number of containers fulfilling the given service needs to change to adapt to network traffic.
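The job of the replication controller can be pictured as a reconcile loop that compares a desired count with what is actually running. The sketch below is a toy model and not the Kubernetes implementation; the Pod and ReplicationController classes and the pod naming scheme are assumptions made for illustration.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    healthy: bool = True


@dataclass
class ReplicationController:
    """Toy reconcile loop: keep `desired` healthy pods running."""

    desired: int
    pods: list = field(default_factory=list)
    _ids: itertools.count = field(
        default_factory=lambda: itertools.count(1), repr=False
    )

    def reconcile(self):
        """Bring the set of pods in line with `desired`; return their names."""
        # Drop pods that failed their health check.
        self.pods = [pod for pod in self.pods if pod.healthy]
        # Create replacements until the desired count is met.
        while len(self.pods) < self.desired:
            self.pods.append(Pod(name=f"pod-{next(self._ids)}"))
        # Remove surplus pods if the desired count was lowered.
        del self.pods[self.desired:]
        return [pod.name for pod in self.pods]
```

In this model, an auto-scaler would simply change desired and let the next reconcile call add or remove pods.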
What Kubernetes does not support out-of-the-box, however, are tools aimed at developers to ensure that workflow is as streamlined as possible. OpenShift exposes a friendly interface to developers, defining basic objects and allowing them to be composed in complex ways in order to give developers maximum flexibility with the path from development to production. The basic pipeline starts with source code, changes to which trigger builds to images, which feed into a deployment which results in containers fulfilling the services necessary to run the application and expose it to the web.
Pre-configured pipelines exist as well, allowing developers to traverse the entire path from source code to Docker image to deployed Kubernetes service without focusing much on any one step, giving them more time to focus on their codebase. OpenShift exposes the ability to define deployments, the strategies that fulfill them, and the events that can trigger them. OpenShift furthermore supports both rollbacks and rolling deployments in order to facilitate a dynamic application that needs to have high availability without impacting development agility. With any number of complex custom configurations, OpenShift can automate the entire road to a production application - from source code to quality assurance steps to staging and finally to production builds and deployments.
A distinguishing feature for OpenShift when compared to other Docker container orchestration schemes or platforms-as-a-service is the ability to do a fully automated build from source code as well as builds from Dockerfiles. The source-to-image (S2I) build in OpenShift can be linked to any GitHub repository and will build a Docker image from the code, with triggers on code changes as configured by the OpenShift user.
Source-to-image has a number of advantages. S2I builds can layer application code onto almost any existing Docker image, allowing maximum reuse of images from the current ecosystem and promoting collaboration while reducing workloads. S2I is also more efficient and more secure because it restricts the actions allowed during a build: the arbitrary commands that Dockerfiles permit (with root access to the building machine) are not allowed, securing the host from malicious or poorly-written commands.
OpenShift also allows for a custom build configuration, where as part of the build a pod is created and whatever custom code is necessary is run on that pod from the image provided for it. This allows a build to issue external triggers for other systems such as continuous integration pipelines or other code review tasks or even internal messaging or logging systems. The image for this builder pod will also provide the logic necessary to build the resulting files, whether they be Docker images or other formats.
As OpenShift builds integrate the provided application source code with the necessary base images, whether they are provided locally or are an official library or runtime on a public image hub, changes in dependencies will trigger builds as well. In this way, OpenShift keeps your dependencies in check and allows for up-to-date and consistent image output.
OpenShift provides a clean interface with which developers can control how changes in the application image or deployment configuration translate to changes in the deployment of containers on the cluster that runs the application. The basic structure of a deployment in OpenShift begins with a deployment strategy - two that are provided are the recreate strategy and the rolling strategy.
The recreate strategy recreates the current deployment with the new images, then finds and disables any previous deployments that exist. The rolling deployment is similar, with one caveat - the new deployment is rolled out gradually, while the current deployments are scaled down at the same time, ensuring that the total amount of throughput to the application remains relatively constant while the new deployment replaces the old.
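The rolling strategy is easiest to see as a sequence of pod counts. The generator below is a simplified model, with a hypothetical rolling_steps function and a batch parameter of my own choosing; it only shows that the combined number of old and new pods stays constant while the new deployment takes over.

```python
def rolling_steps(replicas, batch=1):
    """Yield (old_count, new_count) pairs as a rolling deployment proceeds.

    Each step moves up to `batch` pods from the old deployment to the new
    one, so old + new equals `replicas` after every step.
    """
    if replicas < 0 or batch < 1:
        raise ValueError("replicas must be >= 0 and batch must be >= 1")
    old, new = replicas, 0
    yield old, new
    while old > 0:
        step = min(batch, old)
        new += step
        old -= step
        yield old, new


if __name__ == "__main__":
    for old, new in rolling_steps(3):
        print(f"old={old} new={new} total={old + new}")
```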
A custom deployment is also possible. This clean interface to the container orchestration tools of Kubernetes allows OpenShift users to navigate complicated pod changes quickly and without manual input. Custom code and other actions can be configured to happen before and after deployments for further support for development pipelines.
In order to support dynamic production environments with short development timeframes and rapid pushes to production images, OpenShift supports canary deployments, wherein the first pod to be deployed from a new image is probed for health. Failure of that pod cancels the current deployment, ensuring that the application has maximum availability. If, however, a new deployment still functions but the decision is made that the build is not yet ready or must be reverted, OpenShift supports a deployment rollback to a valid previous state, allowing the application to take on a previous version cleanly.
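The canary check and the rollback can be sketched together as operations on a deployment history. This is a simplified model rather than OpenShift's implementation: the function names are hypothetical and the is_healthy callback stands in for probing the first pod of the new deployment.

```python
def deploy_with_canary(history, new_version, is_healthy):
    """Activate `new_version` only if its canary passes the health probe.

    `history` lists deployed versions, newest last. Returns the version that
    is active afterwards (None if nothing has ever been deployed).
    """
    if not is_healthy(new_version):
        # Canary failed: cancel and keep serving the current version.
        return history[-1] if history else None
    history.append(new_version)
    return new_version


def rollback(history):
    """Revert to the previous version and return it."""
    if len(history) < 2:
        raise ValueError("no previous deployment to roll back to")
    history.pop()
    return history[-1]


if __name__ == "__main__":
    history = ["v1"]
    print(deploy_with_canary(history, "v2", lambda version: True))
    print(deploy_with_canary(history, "v3", lambda version: False))
    print(rollback(history))
```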
Automating the Road to Production
The ability to define custom actions at almost every step in the OpenShift pipeline from source code to production deployment allows for even the most complicated development processes to be supported. For instance, a process that begins with multiple branches of a GitHub repository holding application code in the development, quality assurance, and production stages can be fed into separate pipelines for building. Each branch can be built individually, potentially with different steps taken for each branch, and the resulting images can trigger deployments with separate configurations for each environment. If one branch or code repository served as the source for the entire codebase, it could follow a build process and the resulting images could be tagged :prod, etc., allowing deployments to trigger on updates of any images with a specific tag. Furthermore, other Docker images could be the input to the entire process, if an automated Docker building step is already present in a pre-existing development environment.
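Tag-based triggering amounts to a lookup from an updated image reference to the deployments that watch it. The sketch below assumes a plain dictionary of triggers; the deployment names, the myapp image, and the :qa tag are invented for illustration and do not reflect OpenShift's configuration format.

```python
def deployments_to_trigger(triggers, updated_image):
    """Return the deployments whose trigger matches the updated image tag.

    `triggers` maps a deployment name to the "image:tag" reference it watches.
    """
    return sorted(
        name for name, watched in triggers.items() if watched == updated_image
    )


if __name__ == "__main__":
    # Hypothetical deployments watching tags of one image.
    triggers = {
        "web-qa": "myapp:qa",
        "web-prod": "myapp:prod",
        "worker-prod": "myapp:prod",
    }
    print(deployments_to_trigger(triggers, "myapp:prod"))
```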
Building a web application in 2015 may seem like a daunting task. Paradigms and frameworks that were commonplace and exciting only a few years ago are no longer as popular and change seems to be constant. Some transitions, however, are less ephemeral than others.
The change from a monolithic architecture to a microservice-based architecture has been adopted by most large web companies and continues to be battle-tested every single day. Restructuring an existing application to use this architecture, or beginning the development of a new web application with it, is unlikely to be a choice you regret.
Similarly, the prevalence of containers is on the rise as they give developers a more efficient path to deployment. Containerization builds on the success of virtualization and allows for better utilization of cloud environments; the degree to which containers can integrate with existing deployment strategies is a good indicator that they will be a good choice for future development tasks. Red Hat's OpenShift 3 platform offers a clear path to integrating containerization and a microservice architecture into your project: consider using OpenShift 3 to bring these innovations to your project so you can reap their benefits.