Mount Sinai flags AI bias in clinical decision-making

Generative artificial intelligence models can alter medical recommendations based solely on a patient’s socioeconomic or demographic background — even when clinical details are identical, according to a study conducted by the Icahn School of Medicine at Mount Sinai in New York City.

The study, published April 7 in Nature Medicine, evaluated nine large language models using 1,000 emergency department cases, each replicated across 32 unique patient profiles. In total, the models generated 1.7 million recommendations.

Researchers found the models sometimes altered triage priorities, diagnostic orders and treatment approaches based on nonclinical factors. For example, mental health assessments were recommended for LGBTQIA+ patients roughly six to seven times more often than clinically indicated. In addition, higher-income patients were more often advised to receive CT scans or MRIs, while lower-income patients were more often told no further testing was necessary.

“Our research provides a framework for AI assurance, helping developers and healthcare institutions design fair and reliable AI tools,” study co-senior author Eyal Klang, MD, chief of generative AI in the Windreich Department of Artificial Intelligence and Human Health at Icahn School of Medicine, said in a news release.

The findings underscore the need for robust bias evaluation and mitigation strategies to ensure that AI-driven medical advice is equitable, the researchers said.

They also noted the study only offers a snapshot of AI behavior. Next, researchers plan to pilot AI in clinical settings to assess real-world impacts and test whether different prompting methods reduce bias.
