Containers have grabbed so much attention because they demonstrated a way to solve the software packaging problem that the IT industry had been poking and prodding at for a very long time. Linux package management, application virtualization (in all its myriad forms), and virtual machines had all taken cuts at making it easier to bundle and install software along with its dependencies. But it was the container image format and runtime that is now standardized under the Open Container Initiative (OCI) that made real headway toward making applications portable across different systems and environments.
Containers have also both benefited from and helped reinforce the shift toward cloud-native application patterns such as microservices. However, because the most purist approaches to cloud-native architectures de-emphasized stateful applications, the benefits of containers for storage portability haven’t received as much attention. That’s an oversight. Because it turns out that the ability to persist storage and make it portable matters, especially in the hybrid cloud environments spanning public clouds, private clouds, and traditional IT that are increasingly the norm.
Data gravity
One important reason that data portability matters is “data gravity,” a term coined by Dave McCrory. He’s since fleshed it out in more detail, but the basic concept is pretty simple. Because of network bandwidth limits, latency, costs, and other considerations, data “wants” to be near the applications analyzing, transforming, or otherwise working on it. This is a familiar idea in computer science. Non-Uniform Memory Access (NUMA) architectures—which describes pretty much all computer systems today to a greater or lesser degree—have similarly had to manage the physical locality of memory relative to the processors accessing that memory.
Likewise, especially for applications that need fast access to data or that need to operate on large data sets, you need to think about where the data is sitting relative to the application using that data. And, if you decide to move an application from on-premise to a public cloud for rapid scalability or other reasons, you may find you need to move the data as well.
Software-defined storage
But moving data runs into some roadblocks. Networking limits and costs were and are one limitation; they’re a big part of data gravity in the first place. However, traditional proprietary data storage imposed its own restrictions. You can’t just fire up a storage array at an arbitrary public cloud provider to match the one in your own datacenter.
Enter software-defined storage.
As the name implies, software-defined storage decouples storage software from hardware. It lets you abstract and pool storage capacity across on-premise and cloud environments to scale independently of specific hardware components. Fundamentally, traditional storage was built for applications developed in the 1970s and 1980s. Software-defined storage is geared to support the applications of today and tomorrow, applications that look and behave nothing like the applications of the past. Among these are rapid scalability, especially for high volume unstructured data that may need to expand rapidly.
However, with respect to data portability specifically, one of the biggest benefits of software-defined storage like Gluster is that the storage software itself runs on generic industry standard hardware and virtualized infrastructure. This means that you can spin up storage wherever it makes the most sense for reasons of cost, performance, or flexibility.
Containerizing the storage
What remains is to simplify the deployment of persistent software-defined storage. It turns out that containers are the answer to this as well. In effect, storage can be treated just like a containerized application within a Kubernetes cluster—Kubernetes being the orchestration tool that groups containerized application components into a complete application.
With this approach, storage containers are deployed alongside other containers within the Kubernetes nodes. Rather than simply accessing ephemeral storage from within the container, this model deploys storage in its own containers, alongside the containerized application. For example, storage containers can implement a Red Hat Gluster Storage brick to create a highly-available GlusterFS volume that handles the storage resources present on each server node.
Depending on system configuration, some nodes might only run storage containers, some might only run containerized applications, and some nodes might run a mixture of both. Using Kubernetes with its support for persistent storage as the overall coordination tool, additional storage containers could be easily started to accommodate storage demand, or to recover from a failed node. For instance, Kubernetes might start additional containerized web servers in response to demand or load, but might restart both application and storage containers in the event of a hardware failure.
Kubernetes manages this through Persistent Volumes (PV). PV is a resource in the cluster just like a node is a cluster resource. PVs are a plugin related to the Kubernetes Volumes abstraction, but have a lifecycle independent of any individual pod that uses the PV. This allows for clustered storage that doesn’t depend on the availability or health of any specific application container.
Modular apps plus data
The emerging model for cloud-native application designs is one in which components communicate through well-documented interfaces. Whether or not a given project adopts a pure “microservices” approach, applications are generally becoming more modular and service-oriented. Dependencies are explicitly declared and isolated. Scaling is horizontal. Configurations are stored in the environment. Components can be started quickly and gracefully shutdown.
Containers are a great match with these cloud-native requirements. However, they’re also a great match for providing persistent storage for hybrid cloud-native applications. Not all cloud apps are stateless or can depend on a fixed backing store that’s somewhere else. Data has gravity and it matters where the data lives.
저자 소개
Gordon Haff is a technology evangelist and has been at Red Hat for more than 10 years. Prior to Red Hat, as an IT industry analyst, Gordon wrote hundreds of research notes, was frequently quoted in publications such as The New York Times on a wide range of IT topics, and advised clients on product and marketing strategies.
채널별 검색
오토메이션
기술, 팀, 인프라를 위한 IT 자동화 최신 동향
인공지능
고객이 어디서나 AI 워크로드를 실행할 수 있도록 지원하는 플랫폼 업데이트
오픈 하이브리드 클라우드
하이브리드 클라우드로 더욱 유연한 미래를 구축하는 방법을 알아보세요
보안
환경과 기술 전반에 걸쳐 리스크를 감소하는 방법에 대한 최신 정보
엣지 컴퓨팅
엣지에서의 운영을 단순화하는 플랫폼 업데이트
인프라
세계적으로 인정받은 기업용 Linux 플랫폼에 대한 최신 정보
애플리케이션
복잡한 애플리케이션에 대한 솔루션 더 보기
오리지널 쇼
엔터프라이즈 기술 분야의 제작자와 리더가 전하는 흥미로운 스토리
제품
- Red Hat Enterprise Linux
- Red Hat OpenShift Enterprise
- Red Hat Ansible Automation Platform
- 클라우드 서비스
- 모든 제품 보기
툴
체험, 구매 & 영업
커뮤니케이션
Red Hat 소개
Red Hat은 Linux, 클라우드, 컨테이너, 쿠버네티스 등을 포함한 글로벌 엔터프라이즈 오픈소스 솔루션 공급업체입니다. Red Hat은 코어 데이터센터에서 네트워크 엣지에 이르기까지 다양한 플랫폼과 환경에서 기업의 업무 편의성을 높여 주는 강화된 기능의 솔루션을 제공합니다.