Mining for data gold: Transforming healthcare with natural language processing

Anand Shroff, Health Fidelity Co-Founder and Chief Technology Product Officer

Healthcare IT executives know they need to find a way to mine unstructured data to generate maximum value in an increasingly data-driven environment, and many are considering natural language processing (NLP) as the most effective solution.

With the widespread use of electronic health records (EHRs) now a reality, unstructured data such as physician notes, lab reports, admission and discharge notes, and other forms of free text are a veritable gold mine, and NLP is the best way to extract value from these sources. These clinician-generated narratives have the most accurate and complete picture of a patient's medical history, and they often contain valuable insights that would be missed if one relied only on structured data available for any patient.

Healthcare organizations of every kind – hospitals, health systems, health plans, researchers, biotech firms, pharmaceutical companies, and government agencies – understand the value of the insights that can be gleaned from unstructured data. Hospitals and clinics can use it to automate coding and comply with increasingly complicated coding requirements. Payers and providers can use this data to improve population health and revenue cycle management processes. The U.S. Department of Health and Human Services can use this data to create more accurate reimbursement models that include population risk factors as it transitions from fee-for-service to value-based care.

The Promise and Complexity of NLP

For these reasons, the healthcare NLP industry is in hyper-growth: MarketsandMarkets estimated last year that healthcare NLP will be a $2.67 billion category by 2020, with a CAGR of 19.2%. Since approximately 80% of all clinical data resides in unstructured formats that only NLP can transform, marketplace interest is enormous. But while the promise of NLP is understood inside and outside the healthcare IT community, the development of reliable healthcare-oriented solutions has been slow.

That's because NLP development is a highly complex endeavor that involves multiple challenges, such as named entity recognition, morphological segmentation, disambiguation, and sentiment analysis. Put another way, the esoteric nature of the language used in healthcare demands careful attention to how that data is interpreted. For example, a diagnosis recorded in an EHR narrative can have a wide range of meanings, because physicians vary in how they express themselves and organizations develop their own local language.
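To make the variance problem concrete, here is a minimal Python sketch of one small piece of it: mapping several ways a clinician might write the same diagnosis onto a single normalized label. The lookup table, the function name normalize_mentions, and the sample notes are all hypothetical and hand-built for illustration. Production systems rely on much larger vocabularies and on context-aware models, not a short dictionary.

```python
import re

# Hypothetical lookup table: surface forms a clinician might write,
# each mapped to one normalized concept label. Illustrative only.
SYNONYMS = {
    "htn": "hypertension",
    "high blood pressure": "hypertension",
    "elevated bp": "hypertension",
    "hypertension": "hypertension",
}


def normalize_mentions(note: str) -> set[str]:
    """Return the set of normalized concepts mentioned in a free-text note."""
    text = note.lower()
    found = set()
    for phrase, concept in SYNONYMS.items():
        # Word boundaries keep "htn" from matching inside a longer token.
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            found.add(concept)
    return found


# Three differently worded notes resolve to the same concept.
notes = [
    "Pt has HTN, well controlled on current meds.",
    "History of high blood pressure since 2010.",
    "Assessment: hypertension, stable.",
]
for note in notes:
    print(normalize_mentions(note))
```

Even this toy version hints at why the real problem is hard: it ignores negation ("denies high blood pressure"), misspellings, and abbreviations that mean different things in different specialties.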

Additionally, the amount of industry-specific terminology and jargon used in healthcare further complicates solution development, requiring developers to factor in SNOMED, ICD-9, ICD-10, LOINC, RxNorm, and other coding systems. Another complicating factor is the massive cost of developing an in-house solution; most healthcare IT organizations lack the resources for a successful solution development project. And extraction algorithms are not only incredibly complex, they also require continuous updating to remain accurate and relevant – so the time and money investment is ongoing.

While the NLP market is poised for massive growth over the next several years, it's not a new phenomenon; researchers at universities and major organizations have been working on developing viable NLP systems since the 1950s. MIT and Stanford investigated its possibilities early on. More recently, Columbia University's pioneering healthcare NLP research yielded breakthroughs such as a semantically based parser for determining the structure of text, which gained significant acceptance at academic medical centers.

This research and development was led by Dr. Carol Friedman, Professor of Biomedical Informatics at Columbia, whose work generated the earliest patents, the most peer-reviewed publications, and validation in hundreds of successful projects. Dr. Friedman's work took healthcare NLP out of the lab and into viable applications that are now used on a daily basis.

Getting to Transformative Change with NLP

With accurate and affordable healthcare NLP applications now available, data-driven research groups and analytics teams have the opportunity to integrate this technology into their operations. As a first step, groups are identifying the primary use cases, which include risk adjustment, computer-assisted coding, population health, quality improvement, and more effective clinical research. IT executives who are looking to bring NLP technology onboard should identify relevant applications and create a strong business case for their management team.

The next step is to evaluate technical requirements. It's important to make sure the solution can extract actionable information from unstructured data contained in EHRs. It's also vital to make sure the system infrastructure supports data extraction, normalization, and integration. Systems must work together seamlessly to deliver maximum business value.

Lastly, it is crucial to choose a vendor whose objectives and track record align with the organization's priorities. Healthcare IT should make sure the NLP technology is proven for the specific use cases the organization wants to address. Organizations should also choose an NLP solution capable of continuous learning and improvement. The ideal system achieves precision and recall in the 90-plus percent range across several use cases and can improve its results over time. The best NLP technologies can deliver these results in the most challenging healthcare environments.
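For readers who want to check a vendor's numbers against their own data, precision and recall are simple to compute once a reviewer has labeled a sample of notes. The Python sketch below uses the standard definitions; the function name precision_recall and the example concept sets are hypothetical and chosen only for illustration.

```python
def precision_recall(predicted: set[str], gold: set[str]) -> tuple[float, float]:
    """Compare extracted concepts against a reviewer-labeled gold set.

    precision = correct extractions / all extractions
    recall    = correct extractions / all concepts the reviewer labeled
    """
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical example: the reviewer found four conditions in a note;
# the system extracted three, one of which was wrong.
gold = {"hypertension", "type 2 diabetes", "asthma", "anemia"}
predicted = {"hypertension", "type 2 diabetes", "gout"}

p, r = precision_recall(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.67 recall=0.50
```

Measured this way, a system in the 90-plus percent range would have to get both numbers above 0.90 on each use case, which is a much higher bar than this toy example clears.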

Healthcare IT organizations that develop a solid business case and put the right system in place can achieve transformation via NLP, significantly improving quality and productivity. For the first time, healthcare IT applications can extract gold from unstructured data in an accurate and cost-effective manner. This is not a future dream but a working reality today, and healthcare IT leaders who want to succeed in an increasingly data-driven environment should take full advantage of the opportunity NLP presents.

The views, opinions and positions expressed within these guest posts are those of the author alone and do not represent those of Becker's Hospital Review/Becker's Healthcare. The accuracy, completeness and validity of any statements made within this article are not guaranteed. We accept no liability for any errors, omissions or representations. The copyright of this content belongs to the author and any liability with regards to infringement of intellectual property rights remains with them.
