Mass General Brigham: Generative AI not 'best of both worlds' yet

More than half of the patient portal messages drafted by artificial intelligence required no further editing, but some of those that did posed a risk to patients, according to a study at Somerville, Mass.-based Mass General Brigham.

Six radiation oncologists reviewed answers to 100 simulated patient questions about cancer, without knowing whether the responses were written by human physicians or generated by AI, per the research published April 24 in The Lancet Digital Health. In about a third of cases, the oncologists mistook responses produced by OpenAI's GPT-4 large language model for ones written by humans.

Of the AI-generated responses, the physicians deemed 82.1% safe and 58.3% acceptable to send to patients without further editing. However, 7.1% of the AI-written answers posed a risk to patient safety if left unedited, and 0.6% posed a risk of death.

"Generative AI has the potential to provide a 'best of both worlds' scenario of reducing burden on the clinician and better educating the patient in the process," said corresponding author Danielle Bitterman, MD, faculty member at the AI in Medicine Program at Mass General Brigham, in an April 24 news release. "However, based on our team's experience working with LLMs, we have concerns about the potential risks associated with integrating LLMs into messaging systems."

She said the study illustrated the need for quality monitoring of large language models, training for clinicians on appropriately supervising AI's output, increased AI literacy for both patients and providers, and more research on how to address AI's errors.
