Can AI Chatbots Reason Like Doctors?

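A study published in Science reports that OpenAI's large language model (LLM) outperformed physicians on clinical reasoning tasks that used real emergency room records. However, limitations have also been noted, including the reliability of medical chatbots, the absence of evaluation standards, and hallucinations. Research and clinical trials are needed on using LLMs as assistive tools in clinical settings, and some argue that it is important to work out how AI and physicians can collaborate. Medical AI is advancing rapidly, but regulation and accountability remain unresolved.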

https://spectrum.ieee.org/ai-clinical-decision-support

#llm #clinicalreasoning #medicalai #openai #chatgpt

New Study Fuels Debate Over AI Clinical Decision Support

But researchers disagree on how to measure AI clinical reasoning

IEEE Spectrum

LLMs Struggle in Clinical Reasoning Despite Diagnostic Advances

When it comes to clinical reasoning, large language model chatbots still have a way to go, despite their impressive ability to deliver accurate diagnoses. While they're getting better at providing final answers, they struggle with the critical thinking needed to keep patients safe.

https://osintsights.com/llms-struggle-in-clinical-reasoning-despite-diagnostic-advances?utm_source=mastodon&utm_medium=social

#LargeLanguageModels #ClinicalReasoning #Healthcare #ArtificialIntelligence #DiagnosticAdvances

Large language models improve at delivering diagnoses but struggle with clinical reasoning, putting patients at risk - learn why and read the study's findings now.

OSINTSights

McGraw Hill brings AI to med school with lifelike diagnostic simulations

https://web.brid.gy/r/https://nerds.xyz/2025/10/mcgraw-hill-ai-med-school-lifelike-diagnostic-simulations/

Can AI match doctors in clinical reasoning? OpenAI's new "o1-preview" model excels in diagnoses & reasoning but struggles with probabilistic tasks. Promising progress, but real-world trials are key to unlocking its potential #AI #Medicine #ClinicalReasoning
https://arxiv.org/abs/2412.10849
Superhuman performance of a large language model on the reasoning tasks of a physician

A seminal paper published by Ledley and Lusted in 1959 introduced complex clinical diagnostic reasoning cases as the gold standard for the evaluation of expert medical computing systems, a standard that has held ever since. Here, we report the results of a physician evaluation of a large language model (LLM) on challenging clinical cases against a baseline of hundreds of physicians. We conduct five experiments to measure clinical reasoning across differential diagnosis generation, display of diagnostic reasoning, triage differential diagnosis, probabilistic reasoning, and management reasoning, all adjudicated by physician experts with validated psychometrics. We then report a real-world study comparing human expert and AI second opinions in randomly-selected patients in the emergency room of a major tertiary academic medical center in Boston, MA. We compared LLMs and board-certified physicians at three predefined diagnostic touchpoints: triage in the emergency room, initial evaluation by a physician, and admission to the hospital or intensive care unit. In all experiments--both vignettes and emergency room second opinions--the LLM displayed superhuman diagnostic and reasoning abilities, as well as continued improvement from prior generations of AI clinical decision support. Our study suggests that LLMs have achieved superhuman performance on general medical diagnostic and management reasoning, fulfilling the vision put forth by Ledley and Lusted, and motivating the urgent need for prospective trials.

arXiv.org

It is important to be in the moment, but also to be aware of what “the moment” is: “There are 3 layers to a moment: Your experience, your awareness of the experience, & your story about the experience. Be mindful of the story.” @corymuscara

Relevant to teaching self- & situational-awareness to our #meded trainees (& to ourselves as medical educators), with direct effects on clinical reasoning & decision making.

#mindfulness #selfawareness #consciousness #metacognition #clinicalreasoning

- importance of specific, actionable, criteria-based feedback that recognizes skill development (and invites reflection).
3. the imperative to develop clinical reasoning - the merging of art and science (aka the grey world we practice in).
(consider One-Minute Preceptor, SNAPPS for quicker methods)
/3
#preceptor #development #PharmRes #TooteRx #ClinicalReasoning
Time for my "hello world" on Mastodon. I'm interested in #education, especially in #physiotherapy, #healthprofessions and #clinicalreasoning.