Containers have grabbed so much attention because they demonstrated a way to solve the software packaging problem that the IT industry had been poking and prodding at for a very long time. Linux package management, application virtualization (in all its myriad forms), and virtual machines had all taken cuts at making it easier to bundle and install software along with its dependencies. But it was the container image format and runtime, now standardized under the Open Container Initiative (OCI), that made real headway toward making applications portable across different systems and environments.

Containers have also both benefited from and helped reinforce the shift toward cloud-native application patterns such as microservices. However, because the purest approaches to cloud-native architecture de-emphasized stateful applications, the benefits of containers for storage portability haven't received as much attention. That's an oversight, because it turns out that the ability to persist storage and make it portable matters, especially in the hybrid cloud environments spanning public clouds, private clouds, and traditional IT that are increasingly the norm.

Data gravity

One important reason that data portability matters is "data gravity," a term coined by Dave McCrory. He's since fleshed it out in more detail, but the basic concept is pretty simple. Because of network bandwidth limits, latency, costs, and other considerations, data "wants" to be near the applications analyzing, transforming, or otherwise working on it. This is a familiar idea in computer science. Non-Uniform Memory Access (NUMA) architectures, which describe pretty much all computer systems today to a greater or lesser degree, have similarly had to manage the physical locality of memory relative to the processors accessing that memory.

Likewise, especially for applications that need fast access to data or that need to operate on large data sets, you need to think about where the data is sitting relative to the application using that data. And, if you decide to move an application from on-premise to a public cloud for rapid scalability or other reasons, you may find you need to move the data as well.

Software-defined storage

But moving data runs into some roadblocks. Networking limits and costs are one constraint, and they're a big part of why data gravity exists in the first place. However, traditional proprietary data storage imposes restrictions of its own. You can't just fire up a storage array at an arbitrary public cloud provider to match the one in your own datacenter.

Enter software-defined storage.

As the name implies, software-defined storage decouples storage software from hardware. It lets you abstract and pool storage capacity across on-premise and cloud environments to scale independently of specific hardware components. Fundamentally, traditional storage was built for applications developed in the 1970s and 1980s. Software-defined storage is geared to support the applications of today and tomorrow, applications that look and behave nothing like the applications of the past. Among their requirements is rapid scalability, especially for high-volume unstructured data.

However, with respect to data portability specifically, one of the biggest benefits of software-defined storage like Gluster is that the storage software itself runs on generic industry standard hardware and virtualized infrastructure. This means that you can spin up storage wherever it makes the most sense for reasons of cost, performance, or flexibility.

Containerizing the storage

What remains is to simplify the deployment of persistent software-defined storage. It turns out that containers are the answer to this as well. In effect, storage can be treated just like a containerized application within a Kubernetes cluster—Kubernetes being the orchestration tool that groups containerized application components into a complete application.

With this approach, storage containers are deployed alongside other containers within the Kubernetes nodes. Rather than simply accessing ephemeral storage from within the container, this model deploys storage in its own containers, alongside the containerized application. For example, storage containers can implement a Red Hat Gluster Storage brick to create a highly available GlusterFS volume that handles the storage resources present on each server node.
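As a rough sketch of what that looks like from the Kubernetes side (the names, addresses, and volume below are placeholders, not taken from any particular deployment), an existing GlusterFS volume can be surfaced to the cluster through an Endpoints object listing the Gluster nodes and a PersistentVolume that references them via the in-tree GlusterFS volume plugin. Later Kubernetes releases deprecate or remove that in-tree plugin in favor of CSI drivers, so treat this as illustrative rather than a recipe:

```yaml
# Hypothetical example: expose an existing GlusterFS volume to Kubernetes.
# The Endpoints object lists the Gluster server nodes; the PersistentVolume
# references those endpoints plus the GlusterFS volume name ("gv0" here).
apiVersion: v1
kind: Endpoints
metadata:
  name: glusterfs-cluster
subsets:
  - addresses:
      - ip: 192.168.10.11   # Gluster node 1 (placeholder address)
      - ip: 192.168.10.12   # Gluster node 2 (placeholder address)
    ports:
      - port: 1             # required by the Endpoints API; not used for routing
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: gluster-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany          # a GlusterFS volume can be mounted by many pods
  glusterfs:
    endpoints: glusterfs-cluster   # the Endpoints object defined above
    path: gv0                      # name of the GlusterFS volume
    readOnly: false
```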

Depending on system configuration, some nodes might only run storage containers, some might only run containerized applications, and some nodes might run a mixture of both. Using Kubernetes with its support for persistent storage as the overall coordination tool, additional storage containers could be easily started to accommodate storage demand, or to recover from a failed node. For instance, Kubernetes might start additional containerized web servers in response to demand or load, but might restart both application and storage containers in the event of a hardware failure.
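To make that coordination concrete, here is a minimal, hypothetical Deployment (the image and names are placeholders): the declared replica count is what lets Kubernetes spin up additional web servers under load and reschedule pods onto healthy nodes after a failure.

```yaml
# Illustrative only: a Deployment declares the desired number of web server
# replicas; Kubernetes reschedules pods onto healthy nodes if a node fails,
# and the replica count can be raised to meet demand.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25        # placeholder application image
          ports:
            - containerPort: 80
```

Scaling in response to demand is then just a change to the declared state, for example `kubectl scale deployment web --replicas=5`, and the same reconciliation loop restarts pods, storage containers included, when a node goes away.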

Kubernetes manages this through Persistent Volumes (PVs). A PV is a resource in the cluster, just as a node is a cluster resource. PVs are implemented by volume plugins, like ordinary Kubernetes volumes, but they have a lifecycle independent of any individual pod that uses them. This allows for clustered storage that doesn't depend on the availability or health of any specific application container.
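A minimal sketch of how an application consumes that storage, again with hypothetical names: the claim asks for capacity and an access mode, Kubernetes binds it to a matching PV (such as the Gluster-backed one sketched earlier), and the pod mounts the claim rather than the underlying storage. Deleting or rescheduling the pod leaves the claim, and the data behind it, intact.

```yaml
# Hypothetical claim and pod: the PersistentVolumeClaim binds to a matching
# PersistentVolume and outlives any individual pod; the pod references only
# the claim, not the storage implementation behind it.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gluster-claim
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: web-with-data
spec:
  containers:
    - name: web
      image: nginx:1.25            # placeholder application image
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: gluster-claim   # the claim defined above
```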

Modular apps plus data

The emerging model for cloud-native application designs is one in which components communicate through well-documented interfaces. Whether or not a given project adopts a pure "microservices" approach, applications are generally becoming more modular and service-oriented. Dependencies are explicitly declared and isolated. Scaling is horizontal. Configurations are stored in the environment. Components can be started quickly and shut down gracefully.

Containers are a great match with these cloud-native requirements. However, they’re also a great match for providing persistent storage for hybrid cloud-native applications. Not all cloud apps are stateless or can depend on a fixed backing store that’s somewhere else. Data has gravity and it matters where the data lives.


About the author

Gordon Haff is a technology evangelist and has been at Red Hat for more than 10 years. Prior to Red Hat, as an IT industry analyst, Gordon wrote hundreds of research notes, was frequently quoted in publications such as The New York Times on a wide range of IT topics, and advised clients on product and marketing strategies.
