Diving into data lakes: Key considerations for payers

“Data lake” is a concept that has recently attracted a lot of attention in the healthcare industry.

These massive repositories of both raw and refined data can be invaluable assets for organizations looking to better manage and leverage their data. Unlike most healthcare data flows, which are unidirectional, data lakes allow data to flow back and forth between the lake and each transactional system, and they don’t require expensive computing infrastructure.

Data lakes are piquing the interest of payers, in particular, as these organizations deal with massive amounts of notoriously difficult and complex data—healthcare transactions, physician notes, and digital health applications, to name a few—which inform decisions on individual member and population health. Although data lakes have great potential, payers should consider several factors carefully before diving in.

Worth the Hype?
Across the healthcare industry, a proactive approach to consuming data has emerged as a best practice, fueled by enhanced tools and technologies and coupled with advances in data science computing, all of which have curbed costs. Many payers have already made great strides in their data science programs; for example, most have developed data warehouses to quickly and efficiently gather actionable insights from data across the enterprise.

The following examples demonstrate how payers can implement and leverage data lakes to improve member outcomes and reduce excess costs.

Reduce fraud, waste, and abuse. The National Healthcare Anti-Fraud Association reports that healthcare organizations lose $80 billion to fraud each year. Analyzing raw data in the lake while claims are still being batch-processed could help payers spot potentially fraudulent or wasteful claims. Data lakes can also be used to run algorithms that highlight prescribing patterns that may be problematic and need attention, such as opioid prescribing that doesn’t follow national guidelines.
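As a rough illustration of the kind of algorithm described above, the Python sketch below flags prescribers whose share of high-dose opioid claims looks unusual. The field names (`prescriber_id`, `daily_mme`), the dose threshold, and the other cutoffs are assumptions chosen for the example; they are not clinical guidance and do not come from any particular payer system.

```python
from collections import defaultdict


def flag_prescribers(claims, max_daily_mme=90.0, min_claims=3, max_share=0.5):
    """Return IDs of prescribers whose share of high-dose claims exceeds max_share.

    All thresholds are hypothetical placeholders, not clinical guidance.
    """
    totals = defaultdict(int)  # claims seen per prescriber
    high = defaultdict(int)    # claims above the dose threshold per prescriber
    for claim in claims:
        pid = claim["prescriber_id"]
        totals[pid] += 1
        if claim["daily_mme"] > max_daily_mme:
            high[pid] += 1
    # Skip prescribers with too few claims to judge a pattern.
    return sorted(
        pid for pid, n in totals.items()
        if n >= min_claims and high[pid] / n > max_share
    )


# Made-up sample claims, for illustration only.
claims = [
    {"prescriber_id": "P1", "daily_mme": 120.0},
    {"prescriber_id": "P1", "daily_mme": 100.0},
    {"prescriber_id": "P1", "daily_mme": 40.0},
    {"prescriber_id": "P2", "daily_mme": 30.0},
    {"prescriber_id": "P2", "daily_mme": 95.0},
    {"prescriber_id": "P2", "daily_mme": 20.0},
    {"prescriber_id": "P3", "daily_mme": 200.0},
]
flagged = flag_prescribers(claims)
```

In this sample only P1 is flagged: P2 stays under the share cutoff and P3 has too few claims to evaluate. A flag of this kind is a prompt for human review, not a finding of wrongdoing.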

Recognize and address social determinants of health. Clinicians at UAMS Medical Center in Little Rock, Arkansas, ask patients about non-clinical details, such as eating habits and home life, and record this information in the electronic health record (EHR). Over the past few years, this initiative has reduced hospital admission rates by 10 to 13.8 percent. Payers can use a data lake to pull raw consumer financial data, for example, and gain a more complete picture of members whose behavioral health needs are not being addressed. The data can also highlight social factors that significantly affect members’ mental health conditions, such as a lack of stable housing or employment.

Identify increased rates of drug addiction by county. As the industry works to address escalating rates of opioid addiction in the United States, being able to identify members in specific geographic regions who are dealing with addiction allows payers to better target interventions, treatments, and programs. Using data from the lake can also enable payers to better recognize early signs of an epidemic and proactively work with other organizations and stakeholders in the community to come up with potential solutions.
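A county-level view like the one described above comes down to a grouped rate calculation. The Python sketch below assumes each member record carries a `county` field and a boolean `oud_flag` (a hypothetical marker for an addiction-related diagnosis). Both names, the sample data, and the 1.2x cutoff are illustrative assumptions.

```python
from collections import Counter


def rates_by_county(members, per=1000):
    """Rate of flagged members per `per` enrolled members, by county."""
    enrolled = Counter(m["county"] for m in members)
    flagged = Counter(m["county"] for m in members if m["oud_flag"])
    return {county: per * flagged[county] / n for county, n in enrolled.items()}


def counties_above(rates, baseline, ratio=1.5):
    """Counties whose rate exceeds `ratio` times a baseline rate."""
    return sorted(c for c, r in rates.items() if r > ratio * baseline)


# Made-up membership records, for illustration only.
members = [
    {"county": "County A", "oud_flag": True},
    {"county": "County A", "oud_flag": True},
    {"county": "County A", "oud_flag": False},
    {"county": "County A", "oud_flag": False},
    {"county": "County B", "oud_flag": True},
    {"county": "County B", "oud_flag": False},
    {"county": "County B", "oud_flag": False},
    {"county": "County B", "oud_flag": False},
]
rates = rates_by_county(members)
overall = 1000 * sum(m["oud_flag"] for m in members) / len(members)
hotspots = counties_above(rates, baseline=overall, ratio=1.2)
```

Comparing each county with the overall rate, rather than with a fixed number, is one simple way to surface areas that stand out. In practice, small counties would need a minimum-population rule so that a handful of members does not swing the rate.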

Combine with census data and consumer education data to improve communication. By adding census data and consumer education data to the data lake, payers get a better view of their member populations, which can help them identify members with low health literacy. Low health literacy may hinder care management plans for high-risk members, who may not understand discharge instructions or medication regimens. Using data from various sources allows payers to ensure members receive instructions written at an appropriate literacy level.

A data lake infrastructure supports high-impact data science for improved health, and the opportunities for enhancing member health with data lake-driven insights are extensive. The better the data science, the better the solutions.

What to Know Before Taking the Plunge
While data lakes are gaining traction, the task of developing, implementing, and managing a data lake is no small feat. Organizations need to be smart and approach the initiative with realistic expectations and a solid strategy.

The main reasons that payer data lake programs can fail include:

Lack of high-level support. Senior leadership must buy in and commit to the investment. Organizations without strong support from the executive team struggle to keep data lake implementation moving forward, and the project often sputters when finally set into motion.

Rushed design. If the architecture of the data lake is developed too quickly, without careful and methodical consideration of how it will be used, the data lake may not be able to meet its requirements.

Poor data-matching. In a data lake environment, where much of the data is unstructured, using standard, deterministic “keys” to match relevant data isn’t viable because the fields those keys rely on don’t exist. Payers that don’t define their logic for merging data with different attributes and from different systems may end up with data that is disconnected and useless.
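To make the data-matching point concrete, the Python sketch below links records that share no common key by scoring how similar their attributes are. It is a simple weighted-similarity approach rather than a full probabilistic record-linkage model. The field names (`name`, `dob`, `zip`), the weights, and the acceptance threshold are all assumptions that a real program would need to define and validate.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_score(rec_a, rec_b, weights=None):
    """Weighted score: fuzzy match on name, exact match on birth date and ZIP."""
    weights = weights or {"name": 0.5, "dob": 0.3, "zip": 0.2}  # hypothetical weights
    name = similarity(rec_a["name"], rec_b["name"])
    dob = 1.0 if rec_a["dob"] == rec_b["dob"] else 0.0
    zip_code = 1.0 if rec_a["zip"] == rec_b["zip"] else 0.0
    return weights["name"] * name + weights["dob"] * dob + weights["zip"] * zip_code


def best_match(record, candidates, threshold=0.85):
    """Return the highest-scoring candidate, or None if none clears the threshold."""
    if not candidates:
        return None
    score, candidate = max(
        ((match_score(record, c), c) for c in candidates),
        key=lambda pair: pair[0],
    )
    return candidate if score >= threshold else None


# Made-up records, for illustration only.
roster = [
    {"member_id": "M1", "name": "Jonathon Smith", "dob": "1980-04-12", "zip": "72205"},
    {"member_id": "M2", "name": "Joan Smyth", "dob": "1975-01-30", "zip": "72205"},
]
note = {"name": "Jonathan Smith", "dob": "1980-04-12", "zip": "72205"}
linked = best_match(note, roster)
unlinked = best_match({"name": "Maria Lopez", "dob": "1990-09-09", "zip": "10001"}, roster)
```

Here the misspelled name still links to member M1 because the birth date and ZIP agree, while the unrelated record is left unlinked instead of being forced onto the closest candidate. Writing the weights and threshold down explicitly is one way to document the merge logic the paragraph above calls for.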

To capitalize on the capabilities of a data lake, an organization must specifically state its purpose and goal, comprehend the skills and degree of expertise needed, identify the appropriate internal and/or external resources and support, and target the leaders who will help to drive the initiative forward. With a strong foundation, a strategic data lake will provide valuable data-driven insights and the desired return on investment.

Sumant Rao is senior vice president and business owner of performance analytics at Cotiviti, a leading provider of payment accuracy, risk adjustment, and quality and performance solutions for at-risk organizations.

Copyright © 2024 Becker's Healthcare. All Rights Reserved.