Some of the biggest technology trends aren’t necessarily about doing something new. Things like cloud computing (as an environment) and design patterns for the Internet of Things and mobile applications (as business drivers) are building on existing conceptual foundations -- virtualization, centralized databases, client-based applications. What is new is the scale of these applications and the performance expected from them.
That demand for performance and scalability has inspired an architectural approach called distributed computing. Technologies under that larger umbrella use distributed physical resources to create a shared pool for a given service.
One of those technologies is the subject of this post -- in-memory data grids. An in-memory data grid takes the concept of a single, centralized database and breaks it into numerous individual nodes that work together as a grid. Gartner defines an in-memory data grid as "a distributed, reliable, scalable and ... consistent in-memory NoSQL data store[,] shareable across multiple and distributed applications." That nails the purpose of distributed computing services: scalable, reliable, and shareable across multiple applications.
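To make that definition concrete, here is a minimal sketch of the key-value model using the embedded API of Infinispan, the open source project behind Red Hat Data Grid. The cache name and sample data are hypothetical; each JVM that runs this code becomes one node of the grid.

```java
import org.infinispan.Cache;
import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.configuration.global.GlobalConfigurationBuilder;
import org.infinispan.manager.DefaultCacheManager;

public class GridNode {
    public static void main(String[] args) {
        // defaultClusteredBuilder() enables the transport that nodes
        // use to discover each other and form the grid.
        DefaultCacheManager manager = new DefaultCacheManager(
                GlobalConfigurationBuilder.defaultClusteredBuilder().build());

        // DIST_SYNC spreads entries (and redundant copies of them)
        // across the member nodes.
        manager.defineConfiguration("profiles", new ConfigurationBuilder()
                .clustering().cacheMode(CacheMode.DIST_SYNC)
                .build());

        Cache<String, String> cache = manager.getCache("profiles");
        cache.put("user:42", "{\"name\": \"Ada\"}"); // stored somewhere in the grid
        System.out.println(cache.get("user:42"));    // readable from any node
        manager.stop();
    }
}
```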
A Look at a Distributed Architecture
Distributed computing is a very broad term that covers many different technologies and services, but a basic definition is that a given service is spread across several servers that act as a single pool. Both frontend and backend applications interact with that pool rather than with any single server instance, so the pool can be expanded or contracted dynamically without affecting any application.
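As a sketch of what "interacting with the pool" looks like from an application, the snippet below uses the Infinispan Hot Rod client (the remote protocol used by Red Hat Data Grid); the hostnames grid-node-1/grid-node-2 and the cache name are placeholders. The client only needs an initial server list -- it learns the full cluster topology from the grid itself, so nodes can join or leave the pool without the application changing.

```java
import org.infinispan.client.hotrod.RemoteCache;
import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;

public class PoolClient {
    public static void main(String[] args) {
        // The client is configured with a list of servers, not one server.
        ConfigurationBuilder builder = new ConfigurationBuilder();
        builder.addServer().host("grid-node-1").port(11222)
               .addServer().host("grid-node-2").port(11222);

        RemoteCacheManager manager = new RemoteCacheManager(builder.build());
        RemoteCache<String, String> cache = manager.getCache("profiles");
        cache.put("session:7", "active"); // lands on whichever node owns the key
        manager.stop();
    }
}
```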
There is a subset of distributed computing called in-memory computing. More traditional architectures use data stores with synchronous read-write operations. This is great for data consistency and durability, but such stores bottleneck easily when many transactions are waiting in the queue.
There have been significant advancements in computer hardware, especially storage devices (like solid-state drives). A lot of storage capacity is proportionally cheaper now than it was a few years ago, and the hardware quality is better. Additionally, changes in operating environments (e.g., cloud) and business initiatives (Internet of Things) are pushing for highly responsive, data-rich applications.
In-memory computing adds a layer within an environment that uses the random access memory (RAM) on the physical systems to house most or all of the data required by client applications. Many (though not all) in-memory computing technologies are related to data, including data grids, complex event processing, and analytics.
With a data grid, that layer sits between the application and the data store. An in-memory data grid keeps frequently accessed data cached in that active memory and reaches back to the backend data store as needed -- even asynchronously -- to send and receive updates.
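A data grid product implements this for you, but the underlying pattern is simple. The following plain-Java sketch shows read-through reads and asynchronous write-behind updates; the BackingStore interface is a hypothetical stand-in for the backend database, and a real grid also handles batching, ordering, and failure recovery.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical interface standing in for the backend data store.
interface BackingStore {
    String load(String key);
    void save(String key, String value);
}

/** Read-through / write-behind sketch: reads hit RAM first, and writes are
 *  acknowledged immediately, then flushed to the store in the background. */
class WriteBehindCache {
    private final Map<String, String> hot = new ConcurrentHashMap<>();
    private final ExecutorService flusher = Executors.newSingleThreadExecutor();
    private final BackingStore store;

    WriteBehindCache(BackingStore store) { this.store = store; }

    String get(String key) {
        // Read-through: fall back to the backend only on a cache miss.
        return hot.computeIfAbsent(key, store::load);
    }

    void put(String key, String value) {
        hot.put(key, value);                          // update the in-memory copy now
        flusher.submit(() -> store.save(key, value)); // persist asynchronously
    }

    void close() { flusher.shutdown(); } // drain pending writes on shutdown
}
```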
Using the data grid moves data closer to the endpoints where users interact with it. This increases responsiveness and can lower transaction times from hours to fractions of a second.
Advantages and Uses
TechTarget defines three attributes for when data grids are most advantageous: velocity, variability, and volume. In-memory data grids are best suited for environments where a lot of data (volume) arrives simultaneously or continually (velocity) from different sources or in different formats (variability). Put another way: performance and scalability.
From an architectural perspective, scalability and performance are addressed directly:
- Dynamic, horizontal scalability based on service load, without affecting either application or backend database configuration (see the sketch after this list).
- Large-scale transaction processing (hundreds of thousands per second) in a distributed system that is fault-tolerant.
- Cloud-native architecture, which is interoperable across different environments (on-premises, hosted, cloud).
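As a sketch of the first point, adding capacity to an Infinispan-style grid amounts to starting another node with the same clustered configuration: the existing members discover it and rebalance data automatically. (Cluster discovery details depend on the transport configuration, which is elided here.)

```java
import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.configuration.global.GlobalConfigurationBuilder;
import org.infinispan.manager.DefaultCacheManager;

public class ScaleOut {
    public static void main(String[] args) {
        // Run this on another machine (or again on the same one) and the
        // process joins the grid; no application or database change involved.
        DefaultCacheManager manager = new DefaultCacheManager(
                GlobalConfigurationBuilder.defaultClusteredBuilder().build());
        manager.defineConfiguration("profiles", new ConfigurationBuilder()
                .clustering().cacheMode(CacheMode.DIST_SYNC)
                .build());
        manager.getCache("profiles"); // starting a clustered cache joins the transport
        System.out.println("Cluster members: " + manager.getMembers());
    }
}
```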
There are some less immediately obvious advantages because of how data grids interact with both data sources and applications.
Another way to look at the data grid is to treat it as a data abstraction layer, sitting between multiple data streams and data storage devices. In some environments, a data grid could be used as an integration method for multiple data backends.
Changes in architecture, such as microservices, are also changing how environments can ingest and respond to changing data. In more traditional architectures with discrete applications, the workflow can be very sequential -- first you receive data, then you store it, then you retrieve it, then you run it through an analytics program, then you take those analytics (usually in graphs and charts) and overlay them on some business logic. It is possible to cut out some of those intermediate steps. For example, the same data streams could be used for real-time information and also for data analytics -- and those analytics could be fed directly from the data grid into a given application (like a BPM engine) according to defined queries. There's no need to break out analytics as something separate; it can be part of the application.
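As one hedged illustration of that last point, Infinispan's event listener API can push new entries straight from the grid into downstream logic; the AnalyticsFeed class and its print statement below are stand-ins for a real analytics or BPM consumer, and defined queries could be wired up similarly with the continuous query API.

```java
import org.infinispan.Cache;
import org.infinispan.notifications.Listener;
import org.infinispan.notifications.cachelistener.annotation.CacheEntryCreated;
import org.infinispan.notifications.cachelistener.event.CacheEntryCreatedEvent;

// Pushes each new grid entry straight into downstream logic (analytics, a
// BPM engine, etc.) instead of waiting for a separate batch extract step.
@Listener
public class AnalyticsFeed {
    @CacheEntryCreated
    public void onNewEntry(CacheEntryCreatedEvent<String, String> event) {
        if (event.isPre()) return; // listeners fire before and after the write
        // Hypothetical hook: forward the event to whatever consumes the stream.
        System.out.printf("new data point %s -> %s%n",
                event.getKey(), event.getValue());
    }

    public static void register(Cache<String, String> cache) {
        cache.addListener(new AnalyticsFeed());
    }
}
```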
There are some drawbacks to in-memory data grids (as with distributed computing generally): increased complexity, a shortage of engineers familiar with the technology, and a lack of standards. If data access speed and responsiveness are not critical for a specific application, these disadvantages may point to more traditional solutions.
Still, in-memory data grids are an important tool for realizing emerging digital transformation initiatives because they help that critical data layer handle the velocity, volume, and variability of modern data streams.
More Resources
- Go Big and Fast or Go Home: Data Grids Meet Data Virtualization in Modern Data Architectures (Red Hat-sponsored analyst report)
- Fast, scalable, highly available applications (technology datasheet)
About the author
Deon Ballard is a product marketing manager focusing on customer experience, adoption, and renewals for Red Hat Enterprise Linux. Red Hat Enterprise Linux is the foundation for open hybrid cloud. In previous roles at Red Hat, Ballard has been a technical writer, doc lead, and content strategist for technical documentation, specializing in security technologies such as NSS, LDAP, certificate management, and authentication / authorization, as well as cloud and management. She also wrote and edited the Middleware Blog for Red Hat and led portfolio solution marketing for integration and business automation.