ChatGPT health feature fails to detect high-risk cases

ChatGPT's health features fail to reliably detect "high-risk emergencies" — cases where people need immediate medical care — according to a new study.

Health questions are among the most common uses of AI chatbots such as ChatGPT, according to its creator, OpenAI. The tool is so popular that earlier this year the company introduced ChatGPT Health, a feature designed specifically to help people manage their well-being; the company says tens of millions of people are already using it.

But a new study suggests the system could miss important emergencies and can’t be trusted to reliably tell someone they need immediate medical attention.

"Large language models have become patients' first choice for medical consultation, but they are least safe at the clinical extremes, where the line runs between missed emergencies and unnecessary alarms," said Isaac S. Kohane of Harvard Medical School, who was not involved in the research. "When millions of people use an AI system to decide whether they need emergency care, the stakes are high. Independent assessment should be routine, not optional."

The urgent need to test whether the system is safe prompted a fast-tracked study by the Icahn School of Medicine at Mount Sinai, published in Nature Medicine.

The work grew out of the recognition that ChatGPT was being used in potentially critical situations even though research into its effectiveness remained relatively limited. That gap motivated the study, according to the researchers.

"We wanted to answer a very basic but crucial question: If someone is experiencing a real medical emergency and comes to ChatGPT Health for help, will it clearly tell them to go to the ER?" said lead author, urologist Ashwin Ramaswamy. The researchers found that it often did not — frequently enough to call its reliability into question.

The researchers found, for example, that the system’s alerts were “inverted”: the higher a person’s risk of self-harm, the less likely an alert would be triggered. This finding was “particularly worrying and surprising,” they said.

In the research, doctors created 60 scenarios covering 21 medical specialties. These ranged from relatively low-risk situations that might only require home care to actual medical emergencies. The researchers used 16 different contextual conditions, such as race and gender.

The researchers found that the tool generally handled obvious emergencies well, but fell short in more than half of the cases where doctors judged that the person needed urgent care. While it performed well on textbook emergencies, it was worse at detecting cases where the danger was less immediate or obvious, they noted.

The paper, titled "Performance of ChatGPT Health in a Structured Test of Triage Recommendations," was fast-tracked for publication in Nature Medicine.

By Editor
