The dangers of healthcare generative AI ‘drift’

IT leaders are embracing generative AI in healthcare but also expressing concerns that the technology can “drift.”

The performance of GPT-4, the large language model that powers ChatGPT, in answering healthcare questions can change over time, a phenomenon known as “drift,” according to a study by researchers at Somerville, Mass.-based Mass General Brigham. Their work was published Aug. 8 in NEJM AI.

“Generative AI performed relatively well, but more improvement is needed for most use cases,” said corresponding author Sandy Aronson, executive director of IT and AI solutions at Mass General Brigham Personalized Medicine, in an Aug. 13 statement. “However, as we ran our tests repeatedly, we observed a phenomenon we deemed important: running the same test dataset repeatedly produced different results.”

Mr. Aronson and his fellow researchers were analyzing whether the technology could scan scientific articles to help geneticists assess genetic variants. Because the results varied from day to day on the same test data, the authors say the AI’s performance needs to be continuously monitored.
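As a rough illustration of what that kind of monitoring could look like, the Python sketch below scores repeated runs of a fixed test set and flags drift when accuracy swings by more than a chosen tolerance. It is not the researchers' method: the function names, the 0.05 tolerance, and the sample labels are all hypothetical and chosen only for this example.

```python
from statistics import mean


def accuracy(predictions, expected):
    """Fraction of test items where the model output matches the reference label."""
    if len(predictions) != len(expected):
        raise ValueError("predictions and expected must have the same length")
    return mean(1.0 if p == e else 0.0 for p, e in zip(predictions, expected))


def detect_drift(runs, expected, tolerance=0.05):
    """Compare repeated runs of the same test set.

    runs: dict mapping a run label (for example a date) to that run's predictions.
    tolerance: hypothetical threshold on the gap between the best and worst run.
    Returns (per-run accuracy, True if the gap exceeds the tolerance).
    """
    scores = {label: accuracy(preds, expected) for label, preds in runs.items()}
    spread = max(scores.values()) - min(scores.values())
    return scores, spread > tolerance


# Made-up reference labels and two made-up runs on the same four test items.
expected = ["pathogenic", "benign", "benign", "pathogenic"]
runs = {
    "day1": ["pathogenic", "benign", "benign", "pathogenic"],
    "day2": ["pathogenic", "benign", "pathogenic", "pathogenic"],
}

scores, drifted = detect_drift(runs, expected)
print(scores, drifted)
```

In practice a team would likely track many more test items and runs, and might alert on a trend rather than a single gap; the max-minus-min check here is only the simplest possible signal.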
