This story discusses suicide. If you or someone you know is considering suicide, please contact the Suicide and Crisis Lifeline at 988 or 1-800-273-TALK (8255).
Artificial intelligence has been touted as a boon for healthcare, but a new study reveals potential drawbacks to how it delivers medical advice.
In January, OpenAI released ChatGPT Health, a healthcare-specific version of its popular chatbot tool.
The company introduced the tool as “a purpose-built experience that securely delivers health and wellness information,” adding that ChatGPT’s intelligence can “help you be more informed, prepared, and confident in navigating your health.”
But researchers at the Icahn School of Medicine at Mount Sinai found that the tool failed to recommend emergency treatment for a “significant number” of severe cases.
The study, published in the journal Nature Medicine on February 23, investigated how ChatGPT Health, which reportedly has around 40 million daily users, handles situations in which people ask whether they should seek emergency care.
“At this time, there is no independent body evaluating these products before they are made available to the public,” lead author Dr. Ashwin Ramaswamy, an instructor in the department of urology at the Icahn School of Medicine at Mount Sinai, told FOX News Digital.
“That’s not acceptable for a medicine or a medical device, and we shouldn’t accept it for a product that tens of millions of people are using to make health decisions.”
The team created 60 clinical scenarios, ranging from mild symptoms to true medical emergencies, across 21 medical specialties.
Three independent physicians then assigned each case an appropriate level of urgency based on clinical practice guidelines published by 56 medical societies.
Researchers conducted 960 conversations with ChatGPT Health to see how the tool responded, taking into account gender, race, barriers to care, and “social dynamics.”
Although “clear emergencies” such as stroke and severe allergic reactions were generally handled well, the researchers found that the tool “poorly prioritized” many urgent medical problems.
For example, in one asthma scenario, the system recognized that the patient was exhibiting early signs of respiratory failure, but it still recommended waiting rather than seeking emergency treatment.
“ChatGPT Health works well in moderately severe cases, but fails at both ends of the spectrum, cases where it is most important to respond appropriately,” Ramaswamy told FOX News Digital. “More than half of true emergencies were under-triaged, and about two-thirds of mild cases that clinical guidelines say should be managed at home were over-triaged.”
He said under-triage can be life-threatening, while over-triage can overcrowd emergency departments and delay care for those who truly need it.
Researchers also identified inconsistencies in suicide risk warnings. In some low-risk scenarios, the tool directed users to the 988 Suicide and Crisis Lifeline, while in other cases it failed to provide that recommendation even when the user described suicidal thoughts.
“The suicide guardrail failure was the most alarming,” study co-author Dr. Girish N. Nadkarni, chief AI officer at Mount Sinai Health System, told FOX News Digital.
ChatGPT Health is designed to display a crisis intervention banner when someone describes thoughts of self-harm, researchers noted.
“We tested a scenario with a 27-year-old patient who was considering taking a large number of pills,” Nadkarni said. “When he described his symptoms in isolation, the banner was displayed 100% of the time. Then we added routine test results: same patient, same words, same severity, and the banner disappeared.”
“A safety feature that works perfectly in one situation but not in a nearly identical one is a fundamental safety issue.”
The researchers were also surprised by the effect of social dynamics on the tool’s recommendations.
“When the family in the scenario said, ‘It’s no big deal’ (as often happens in the real world), the system was almost 12 times more likely to downplay the patient’s symptoms,” Nadkarni said. “Everyone has a spouse or parent who says they’re overreacting. An AI shouldn’t agree with them during a potential emergency.”
Dr. Marc Siegel, senior medical analyst for Fox News, called this an “important” study.
“This emphasizes the principle that large language models can triage clear-cut emergencies, but they have much more difficulty in subtle situations,” Siegel, who was not involved in the study, told FOX News Digital.
“This is where physicians and clinical judgment come into play: knowing the nuances of a patient’s medical history, how they report their symptoms and their overall approach to health.”
While ChatGPT and other LLMs can be helpful tools, “they should not be used to provide medical instructions,” Siegel said.
“Machine learning and a continuous input of data can help, but they cannot compensate for the underlying problem: human judgment is required to determine whether something is a true emergency.”
Dr. Harvey Castro, a Texas emergency physician and AI expert, echoed the study’s importance, saying, “This is exactly the independent safety assessment that we need.”
“Innovation moves quickly, and oversight has to move just as fast,” Castro, who was not involved in the study, told Fox News Digital. “In medical practice, the most dangerous mistakes occur at the extremes, when what appears to be mild is actually fatal. This is where clinical judgment is most important and where AI must be stress tested.”
The researchers acknowledged some potential limitations to the study design.
“We tested at a single point in time, using clinical scenarios created by physicians rather than conversations with real patients. These systems are updated frequently, so performance can vary,” Ramaswamy told FOX News Digital.
Furthermore, most of the missed emergencies occurred in scenarios where the danger depended on how the condition changed over time; it is unclear whether the same problem occurs with sudden, acute medical emergencies.
Because the system had to select a single fixed urgency category, the researchers noted, the test may not have captured the more nuanced advice that could emerge in a back-and-forth conversation.
The study also wasn’t large enough to confidently detect small differences in how recommendations differed by race or gender.
“What we need is continuous auditing, not a one-time study,” Castro said. “These systems are updated frequently, so we need to keep evaluating them.”
The researchers emphasized the importance of seeking immediate care for serious problems.
“If something feels seriously wrong, such as chest pain, difficulty breathing or a severe allergic reaction, or if you are thinking of self-harm, go to the emergency room or call 988,” Ramaswamy advised. “Don’t wait for the AI to tell you it’s okay.”
The researchers said they supported the use of AI to improve access to health care and did not conduct the study to “disrupt technology.”
“These tools can truly serve a good purpose, such as understanding a diagnosis you’ve already received, finding out how a drug works and its side effects, or getting answers to questions that weren’t adequately addressed in a short doctor’s visit,” Ramaswamy said.
“This is a completely different use case than determining whether emergency care is needed. Treat it as a complement to a doctor, not a replacement.”
Castro agreed that the benefits of AI health tools should be weighed against the risks.
“AI health tools can increase access, reduce unnecessary hospital visits, and keep patients informed,” he said. “While they are not inherently dangerous, they are still not a substitute for clinical judgment.”
“This research does not mean abandoning AI in the medical field,” he continued. “It means maturing it. Independent testing and stronger guardrails will determine whether AI becomes a safety net or a liability.”
Fox News Digital has reached out to OpenAI, the creator of ChatGPT, for comment.
