Understanding data services

Data services bring more business value to data so it can be implemented as part of cloud-native applications—an integral component of open hybrid cloud IT strategies.

Data services (or Data-as-a-Service) are collections of small, independent, and loosely coupled functions that enhance, organize, share, or calculate information collected and saved in data storage volumes. Data services amplify traditional data by improving its resiliency, availability, and validity, as well as adding characteristics to data that it doesn't already have natively—like metadata.

 

 

Data services are self-contained units of software functions that give data characteristics it doesn't already have. Data services can make data more available, resilient, and comprehensible, which makes data more useful to users and programs.

Data service functions turn inputs into outputs. The inputs are varied sets of raw data—data that hasn’t been processed for a specific purpose—configured in its native format and saved in physical, virtual, or cloud-based storage volumes. The outputs are usually:

  • Organizational: The consolidation, batching, and structure of data, usually pulled from structured (databases), semi-structured (data warehouses), or unstructured (data lakes) sources.
  • Transferable: The movement of data from its place of origin across a network to an end point, like an application or platform.
  • Procedural: The processing of data, usually as part of data modeling, analytics, or artificial intelligence/machine learning (AI/ML) software.
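As a rough illustration of the three output types above, the following minimal Python sketch treats each one as a small, independent function. The record layout, the function names, and the use of a plain list as the "end point" are all assumptions made for this example, not part of any particular product.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw records, standing in for data read from a storage volume.
RAW_RECORDS = [
    {"region": "east", "amount": 120.0},
    {"region": "west", "amount": 80.0},
    {"region": "east", "amount": 60.0},
]


def organize(records):
    """Organizational output: consolidate raw records into groups by region."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["region"]].append(record["amount"])
    return dict(grouped)


def transfer(grouped, endpoint):
    """Transferable output: deliver organized data to an end point (a list here)."""
    for region, amounts in grouped.items():
        endpoint.append((region, amounts))
    return endpoint


def process(grouped):
    """Procedural output: run a simple calculation (per-region average)."""
    return {region: mean(amounts) for region, amounts in grouped.items()}


grouped = organize(RAW_RECORDS)
delivered = transfer(grouped, endpoint=[])
averages = process(grouped)
```

Each function takes data as input and returns a new output without modifying the raw records, which is the loosely coupled shape the description suggests.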

Data at rest

Data saved in storage volumes. Data services abstract raw data from its sources, such as customer records from online transactional processing (OLTP) databases, property damage information from data warehouses, and images or videos from data lakes, and then apply governance principles, organization, and maintenance that make data useful to applications and accessible by users. Data services are an important part of big data strategies because they can make sense of massive collections of structured, semi-structured, and unstructured data spread across many different storage locations.

 

Data in motion

Data moving from its storage origin to an application or platform, usually in real-time. Data services can create data pipelines to help data move continuously between multiple endpoints. For example, data services can help organizations shift from batch-oriented data processing to event-driven data processing by operating on data immediately as it is generated. Data services also help ensure data is never actually removed from its origin—allowing multiple endpoints to use the same datapoint at once. This can be used to create scalable, event-driven architectures.
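The idea that several endpoints can consume the same datapoint without removing it from its origin can be sketched with an append-only log and per-consumer read positions. This is a simplified, in-memory illustration only; the `EventLog` class, its method names, and the sample events are hypothetical and do not represent a specific streaming platform.

```python
class EventLog:
    """Append-only log: reading an event never removes it from the log."""

    def __init__(self):
        self._events = []   # events in the order they were generated
        self._offsets = {}  # consumer name -> index of the next unread event

    def publish(self, event):
        """Record an event as soon as it is generated."""
        self._events.append(event)

    def poll(self, consumer):
        """Return events this consumer has not seen yet and advance its offset."""
        start = self._offsets.get(consumer, 0)
        new_events = self._events[start:]
        self._offsets[consumer] = len(self._events)
        return new_events


log = EventLog()
log.publish({"order_id": 1, "total": 25.0})
billing_first = log.poll("billing")      # billing reacts to the first event right away

log.publish({"order_id": 2, "total": 40.0})
billing_second = log.poll("billing")     # billing sees only the new event
analytics_all = log.poll("analytics")    # analytics still sees both events
```

Because each consumer tracks its own position, one endpoint reading an event does not affect what another endpoint can read, which is the property that lets event-driven designs add consumers without changing the source.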

 

Data in action

Active data grouped into data sets being used by data science, data analytics, and data modeling software. Data services help improve data access to high-performance, intelligent data processing platforms—like AI/ML and deep learning tools. Depending on the data service, data in action could involve collections of small, independent, and loosely coupled services—usually packaged in containers and orchestrated by a Kubernetes platform.

 

Without data services that help developers and data scientists collaborate as data moves between systems, cloud-native application development is impossible. Multiple code commits that use the same data can extend build times, but a data service like Red Hat® OpenShift® Data Foundation can reduce time dependencies on concurrent builds.

Traditional storage

The actual collection and retention of raw digital information—the bits and bytes behind applications, network protocols, documents, media, address books, user preferences, and more. When you save a document and select a location, you are going through the process of data storage. A user’s view into data storage is usually at the infrastructure level, and is rarely connected between storage volumes. For example, there’s usually not a native way to view every file, block, or object saved across a workstation, cloud storage provider, and external hard drive—making the act of exploring data storage very manual and monolithic.

 

Data services

Software that uses data saved in traditional data storage volumes as inputs to create specific outputs; or software that amplifies traditional data by improving its resiliency, availability, and validity. Users typically interact with data services as part of an application, making the process very flexible and customizable. For example, the data service provided by Red Hat OpenShift Data Foundation abstracts storage infrastructure so data can be stored in many different places—but act as a single persistent repository.
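To make the "many places, one repository" idea concrete, here is a minimal Python sketch of a facade that spreads objects across several backends while callers use a single `put`/`get` interface. It is an illustration of the general pattern only, not a description of how Red Hat OpenShift Data Foundation is implemented; the `UnifiedRepository` class, the placement rule, and the backend names are assumptions for this example.

```python
class UnifiedRepository:
    """Single logical repository in front of several storage backends."""

    def __init__(self, backends):
        self._backends = backends  # backend name -> dict acting as a store
        self._locations = {}       # object key -> name of the backend holding it

    def put(self, key, value):
        """Store a value; callers never choose or see the backend."""
        name = self._locations.get(key)
        if name is None:
            # Hypothetical placement rule: pick the backend with the fewest objects.
            name = min(self._backends, key=lambda n: len(self._backends[n]))
            self._locations[key] = name
        self._backends[name][key] = value

    def get(self, key):
        """Fetch a value by key, wherever it physically lives."""
        return self._backends[self._locations[key]][key]


on_prem, cloud = {}, {}
repo = UnifiedRepository({"on_prem": on_prem, "cloud": cloud})
repo.put("invoice-1", b"pdf bytes")
repo.put("photo-7", b"jpeg bytes")
repo.put("report-3", b"csv bytes")
```

The application only ever calls `repo.get("photo-7")`; which backend holds the object is a detail of the data service, not of the application.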

The Massachusetts Open Cloud (MOC) uses data services. The MOC is a nonprofit initiative of universities, government organizations, and businesses. It was formed to develop a common, cloud-based infrastructure for businesses, governments, and nonprofits to analyze big data. MOC used Red Hat Ceph Storage—a software-defined storage service—to organize and share large amounts of data with multiple entities running custom data analytics platforms.

"With no prior experience with OpenShift Container Storage, our team was able to set up 2 distinct OpenShift clusters and conduct full Db2 Warehouse Performance validation in less than 2 weeks."

Piotr Mierzejewski
Director Db2 Development IBM Data & AI

Our data services not only work well with every data storage provider, they are also built to complement cloud-native application development.

So use any datacenter or cloud you want, and start implementing all that data into your ever-evolving cloud-native apps. With our data services, your enterprise’s old data can be enhanced and streamed right into your cloud-native apps to reveal important information that may solve tomorrow’s biggest challenges.

Check out how Red Hat Ceph Storage performed as part of Evaluator Group’s 10 billion object test.
