Feb. 24, 2026

A study conducted by researchers at New York City-based Mount Sinai found that ChatGPT Health undertriaged more than half of clinical cases that required emergency care.

The study, published Feb. 23 in Nature Medicine by researchers at the Icahn School of Medicine at Mount Sinai, tested 60 patient scenarios across 21 specialties, comparing ChatGPT Health’s triage advice with physician consensus using guidelines from 56 medical societies. While the tool performed well in clear emergencies, such as stroke or anaphylaxis, it often reassured users in more complex cases, even when it had identified dangerous symptoms.

The team conducted 960 interactions with the tool, varying the scenarios by race, gender and access to care. Researchers also flagged inconsistent suicide-risk alerts, noting the system sometimes failed to respond when users described specific self-harm plans.

“This was a particularly surprising and concerning finding,” senior study author and Mount Sinai Chief AI Officer Girish Nadkarni, MD, said in a Feb. 24 news release from the health system. “While we expected some variability, what we observed went beyond inconsistency. The system’s alerts were inverted relative to clinical risk, appearing more reliably for lower-risk scenarios than for cases when someone shared how they intended to hurt themselves. In real life, when someone talks about exactly how they would harm themselves, that’s a sign of more immediate and serious danger, not less.”