New York City-based NYU Langone Health tested artificial intelligence to see how well it could convert physician notes into accurate lay language that improves patient understanding.

The study, published March 11 in JAMA Network Open, used a licensed "private instance" of the GPT-4 tool created by OpenAI. The license allowed front-line clinicians to experiment with the tool using real patient data while adhering to privacy laws, according to a health system news release. Researchers used the tool to convert the text in 50 patient discharge notes into lay language.

The team scored the AI translations using the Patient Education Materials Assessment Tool, which measures how well patients can understand the material, and found the translations scored 81%, up from 13% for the original physician-written notes. The AI also lowered the notes' reading level from 11th grade to sixth grade.

Two physicians reviewed the AI discharge summaries for accuracy on a six-point scale. The AI summaries received the best possible accuracy rating 54% of the time and were rated entirely complete 56% of the time. Researchers say the results indicate that, at the current performance level, providers would not need to make a single change to more than half of the AI summaries.

"GPT-4 worked well alone, with some gaps in accuracy and completeness, but did more than well enough to be highly effective when combined with physician oversight, the way it would be used in the real world," senior study author Jonah Feldman, MD, medical director of clinical transformation and informatics within NYU Langone's Medical Center Information Technology Department of Health Informatics, said in the release. "One focus of the study was on how much work physicians must do to oversee the tool, and the answer is very little. Such tools could reduce patient anxiety even as they save each provider hours each week in medical paperwork, a major source of burnout."