
How to maximize your healthcare analytics with big data

Last updated: December 26, 2018

Government healthcare providers and payers are adopting big data analytics to lower costs while improving outcomes. The need is urgent. The Centers for Medicare and Medicaid Services (CMS) predicts that health spending will grow an average of 1.2 percentage points faster per year than the gross domestic product (GDP) from 2016 to 2025. In the same period, healthcare costs will rise from 17.8% to 19.9% of GDP.

Analytics provide the foundation for a growing number of government healthcare initiatives. These include population health management, value-based care, fraud and waste reduction, precision medicine and bioinformatics, real-time patient monitoring, hospital re-admissions reductions, and clinical decision support.

What do you need to succeed with analytics? Analytics software receives the most attention. However, your underlying data platform also plays a major role. Compared to proprietary platforms, open source technologies make it faster and less expensive to:

  • Integrate healthcare data from multiple sources and in different formats into a 360-degree patient view.
  • Ingest and respond to real-time or near-real-time data, such as data from wearable heart or glucose monitors.
  • Minimize storage costs by automatically deciding whether to store or discard incoming data as it arrives.
  • Detect and respond to critical events, for instance, sending a message to a clinician and patient if a patient has not refilled a prescription.
  • Quickly modify automated rules as policies change.
  • Comply with security and privacy requirements, such as protecting personal health information (PHI) and personally identifiable information (PII).
  • Utilize existing identity management systems and adhere to existing policies.

Read this whitepaper to understand the important role your data platform plays across a variety of popular healthcare analytics use cases.


Healthcare providers and payers are using analytics across the public sector. For example, the National Intrepid Center of Excellence (NICoE) at Walter Reed National Military Medical Center helps practitioners more quickly identify the interventions most likely to help service members suffering from traumatic brain injury. The Diabetes Center of Excellence (DCoE) for the Military Health System Patient Safety Program developed a prototype to recommend patient care based on existing patient information. The CMS Center for Clinical Standards and Quality uses analytics in its quest to improve national healthcare quality outcomes.

But insights are only as good as the underlying data. Data sets need to be complete, current, and secure. No amount of investment in analytics software will make up for a weak data foundation. Enterprise data warehouses collapse under the demands of big data, which include:

  • Velocity: New data arrives constantly from updated electronic health records (EHRs), clinical tests, wearables, and more. New data that hasn’t yet been stored is variously called real-time data, liquid data, or data in motion. By analyzing this data, you can take immediate action that affects outcomes. Use cases include predicting acute care episodes or tracking fast-moving epidemics.
  • Variety: The data platform needs to ingest data in different formats from many systems: the EHR; clinical, lab, and claims systems; Internet of Things (IoT) data from wearable monitors; genomics databases; and data from nonhealthcare agencies. For instance, an IRS database might reveal fraudulent claims submitted under the name of someone who is deceased.
  • Volume: Healthcare data grows exponentially. You need an intelligent way to determine which data to retain and which to discard, and a cost-effective way to store what you keep.


Opportunity: Manage the health of populations with chronic conditions such as diabetes, heart disease, or obesity. By some estimates, 10% of the population accounts for 66% of all healthcare costs.

Identifying high-risk patients is easier when agencies have a patient-centric data repository that combines information from clinical, claims, administrative, and financial data systems. Contributors include payers, Centers for Disease Control and Prevention (CDC), and National Institutes of Health (NIH).


The data platform for population health management needs to provide access to the right data, in the right format. That is challenging given the number and variety of systems contributing data to the data repository. These systems might include bedside devices, wearables, lab systems, EHRs, radiology systems, databases from overseas Department of Defense personnel, and even genomics databases. You also need a way to easily share the data with providers so they can engage with patients before they become sicker or spread their disease.
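
The idea of a patient-centric repository can be sketched in a few lines. This is a minimal illustration, not the platform's actual implementation: the record layout (`patient_id`, `source`, `data`), the function name `build_patient_view`, and the sample values are all assumptions made for the example.

```python
from collections import defaultdict

def build_patient_view(records):
    """Group records from many source systems by patient, then by source."""
    view = defaultdict(lambda: defaultdict(list))
    for record in records:
        view[record["patient_id"]][record["source"]].append(record["data"])
    # Return plain dicts so that looking up a missing key raises KeyError.
    return {pid: dict(sources) for pid, sources in view.items()}

# Hypothetical records arriving from different systems in different shapes.
records = [
    {"patient_id": "p-001", "source": "ehr", "data": {"diagnosis": "type 2 diabetes"}},
    {"patient_id": "p-001", "source": "lab", "data": {"a1c_percent": 8.1}},
    {"patient_id": "p-002", "source": "claims", "data": {"claim_id": "c-17"}},
    {"patient_id": "p-001", "source": "wearable", "data": {"glucose_mg_dl": 162}},
]

patient_view = build_patient_view(records)
print(sorted(patient_view["p-001"]))  # source systems that contributed for p-001
```

A real repository would also need to reconcile patient identifiers across systems, which this sketch assumes has already been done.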


Opportunity: Payers reward high-quality, cost-effective patient care by shifting from fee-for-service compensation to value-based incentives. Analytics identifies effective clinical protocols, which are the foundation for evidence-based medicine.

Accountable Care Organizations (ACOs) can use analytics to maximize reimbursements. The Medicare Access and CHIP Reauthorization Act of 2015 (MACRA) lays out two tracks for the Quality Payment Program (QPP). The track with higher compensation, the Advanced Alternative Payment Model (APM), requires more advanced analytics because it rewards providers for taking more risk, tying payment to patient outcomes. The higher models of Bundled Payments for Care Improvement (BPCI), which tie payment to an episode of care, also require analytics.


A single episode of care might involve visits to multiple providers. The data platform needs to aggregate data from all providers and other sources—from any location and in any format.


Opportunity: Claims audits search for patterns and outliers indicating abuse, such as billing from a post office box, a hysterectomy for a male patient, or billing outside the 90-day global surgery period. In 2016, the government recovered over $3.3 billion as a result of healthcare fraud judgments, settlements, and additional administrative impositions in healthcare fraud cases and proceedings.


Analytics performed on snapshots do not consider the newest, often most useful, data. Therefore, including real-time and near-real-time data in claims audits speeds up identification of possible fraud and abuse. The most accurate insights into fraud and abuse require information from disparate sources, including the EHR; claims, clinical, and lab systems; and nonhealthcare systems, such as population databases.
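
As a rough sketch of pattern-based claims auditing, the following checks a claim against two of the outlier patterns mentioned above. The claim fields (`billing_address`, `procedure`, `patient_sex`) and the function name `audit_claim` are assumptions for illustration; a production audit would draw on many more sources and rules.

```python
def audit_claim(claim):
    """Return a list of reasons a claim looks suspicious (empty if none)."""
    flags = []
    # Pattern 1: billing from a post office box.
    if "po box" in claim["billing_address"].lower():
        flags.append("billing from a post office box")
    # Pattern 2: a procedure that is inconsistent with the patient record.
    if claim["procedure"] == "hysterectomy" and claim["patient_sex"] == "M":
        flags.append("hysterectomy billed for a male patient")
    return flags

# Hypothetical claims, one suspicious and one clean.
suspicious = {
    "billing_address": "PO Box 1234, Springfield",
    "procedure": "hysterectomy",
    "patient_sex": "M",
}
clean = {
    "billing_address": "12 Main St, Springfield",
    "procedure": "appendectomy",
    "patient_sex": "F",
}

print(audit_claim(suspicious))
print(audit_claim(clean))
```

Running the same checks on claims as they arrive, rather than on a periodic snapshot, is what shortens the time to detection.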


Opportunity: Customize medical care to the patient’s unique genetic makeup. Care providers use precision medicine to predict the risk of organ rejection and guide more tailored immunosuppressive drug regimens, identify patients with a form of breast cancer not responsive to standard therapy, and find the right medication to treat depression.


The data platform chunks large data sets into computational clusters. A cancer treatment, for example, might work only for certain DNA sets. Software-defined storage using commodity hardware costs less than traditional arrays and simplifies growth.


Opportunity: Allow patients to return to their homes by providing wearable devices to measure blood flow, heart rhythms, respiration, glucose, and more. The patient and clinician receive alerts when patterns indicate an imminent acute care episode.


The data platform must securely acquire data from smart devices and add it to the 360-degree patient view. The data platform also needs to follow storage rules to decide which data to keep and which to discard. For example, it might need to sort through the data from 1 million wearable heart monitors that transmit 24 hours a day.
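
A storage rule of this kind can be as simple as a predicate applied to each reading as it arrives. The sketch below is hypothetical: the field names, the function names `should_store` and `filter_stream`, and the 50 to 120 beats-per-minute range are assumptions chosen for illustration, not clinical guidance.

```python
def should_store(reading, low=50, high=120):
    """Keep only heart-rate readings outside an assumed normal range."""
    return reading["bpm"] < low or reading["bpm"] > high

def filter_stream(readings):
    """Lazily yield the readings worth storing; the rest are dropped."""
    for reading in readings:
        if should_store(reading):
            yield reading

# Hypothetical readings from wearable heart monitors.
incoming = [
    {"device_id": "d1", "bpm": 72},
    {"device_id": "d1", "bpm": 134},
    {"device_id": "d2", "bpm": 44},
    {"device_id": "d2", "bpm": 88},
]

stored = list(filter_stream(incoming))
print(stored)
```

Because `filter_stream` is a generator, it processes one reading at a time and never holds the full stream in memory.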

A rules engine automatically invokes an action in response to specified conditions. For example, a blood pressure spike can trigger a message to the patient and clinician. A patient’s failure to fill a prescription can trigger a message to the clinician. Also, a high A1C reading can add the patient record to a population health study.
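
The condition-and-action pattern behind a rules engine can be sketched as follows. This is an illustrative toy, not the product's rules engine: the `Rule` class, the event fields, and every threshold (180 mm Hg, 7 days, 9.0%) are assumptions made for the example and are not clinical guidance.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]  # returns True when the rule should fire
    action: Callable[[dict], str]      # returns a description of the action taken

RULES = [
    Rule(
        "blood pressure spike",
        lambda e: e.get("systolic_mm_hg", 0) >= 180,
        lambda e: f"message patient and clinician: {e['patient_id']}",
    ),
    Rule(
        "prescription not filled",
        lambda e: e.get("filled") is False and e.get("days_since_prescribed", 0) > 7,
        lambda e: f"message clinician: {e['patient_id']}",
    ),
    Rule(
        "high A1C",
        lambda e: e.get("a1c_percent", 0) >= 9.0,
        lambda e: f"add to population health study: {e['patient_id']}",
    ),
]

def evaluate(event, rules=RULES):
    """Run every rule against an event and collect the triggered actions."""
    return [rule.action(event) for rule in rules if rule.condition(event)]

print(evaluate({"patient_id": "p-001", "systolic_mm_hg": 190}))
print(evaluate({"patient_id": "p-002", "filled": False, "days_since_prescribed": 10}))
```

Keeping rules as data rather than hard-coded branches means a threshold or action can change by editing one entry in the list, without touching the evaluation logic.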


Adopting analytics costs less and takes less time when you use an open data platform. Open source technologies make it faster and less expensive to:

  • Integrate traditional data warehouses and databases with new technologies like Apache Hadoop and Apache Spark.
  • Normalize data in real time instead of first moving it to a data warehouse, reducing data preparation from days to minutes.
  • Adapt to new legislative mandates, changes in the competitive landscape, emerging technologies, and expanding or contracting needs.
  • Avoid becoming locked into one approach or vendor and adopt innovative big data technologies as they become available.
  • Store exponentially growing data. Software-defined storage lowers costs compared to proprietary enterprise arrays.

Depending on your use case, you might want to analyze stored big data, real-time data that has not yet been stored, or both. Analyzing real-time data enables you to respond to threats and opportunities during a short window of opportunity—for example, by predicting an imminent acute care episode based on data from wearables. Analyzing real-time data in conjunction with static data enables more accurate predictions, such as spotting billing fraud as it happens.

Uniquely, the Red Hat® data platform for healthcare analytics provides technologies for analyzing stored data and real-time data in the same architecture (Figure 1). Building blocks in the platform include:

  • Application programming interface (API) bridge. Share data with other agency and payer systems, enabling inter-agency collaboration.
  • Storage. Create a data lake containing all data in any format, lowering costs and increasing storage utilization.
  • Big data Infrastructure-as-a-Service (IaaS). Increase resource utilization by delivering infrastructure on demand, using Red Hat Ceph Storage in conjunction with Apache Hadoop and Apache Spark.
  • Integration services. Integrate information from multiple applications into a real-time data stream.
  • Intelligent data services. Add context to real-time data by fusing it with external data sources.
  • Evidence-based rules. Invoke automated action in response to triggers based on business rules.
  • Data-as-a-Service (DaaS). Deliver the right data at the right time to care providers, developers, researchers, and analysts.
  • Security. Comply with healthcare security and privacy regulations, including HIPAA 5010 and Health Level Seven (HL7).
  • Platform-as-a-Service (PaaS). Deploy and manage the data platform on a public, private, or hybrid cloud.


Figure 1. Red Hat data platform for healthcare analytics



How much your agency benefits from healthcare analytics depends on your data platform. Can you connect to all healthcare data sources? Integrate them to create a 360-degree view of the patient? Ingest and analyze real-time data as well as stored data? Define rules for automated responses to specified conditions? Change the triggers and actions in real time? You can if you build your data foundation using an open, secure architecture. You will deliver the right data in the right format to the right people, helping to reduce costs while improving outcomes.

Find out how to take advantage of big data and analytics in your organization—the open source way.

Learn more at