This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Diagnostic accuracy of DeepSeek-R1 and ChatGPT-4o in emergency patients: A comparative study
0
Citations
6
Authors
2025
Year
Abstract
Objective: To compare the diagnostic performance of DeepSeek-R1 and ChatGPT-4o in emergency department inpatients and explore their practical clinical value. Methods: A retrospective study was conducted using clinical data from emergency department inpatients discharged in December 2024. Discharge diagnoses served as the gold standard. Patient data (age, symptoms, exams, tests) were input into DeepSeek-R1 and ChatGPT-4o with the prompt: “What is the most likely diagnosis?” Two physicians scored the outputs (0-3) to assess accuracy and consistency. Results: A total of 328 cases were analyzed. The mean scores for DeepSeek-R1 and ChatGPT-4o were 2.33 ± 1.07 and 2.32 ± 1.05, respectively, with no statistically significant difference (P = 0.82). The Z-score was -0.232, indicating highly similar performance between the two models. However, the rate of accurate diagnoses was 66.5%. Diagnostic performance declined with increasing patient age. Conclusions: DeepSeek-R1 and ChatGPT-4o demonstrated comparable diagnostic performance in emergency department settings, but the misdiagnosis risk remained high. Both models can serve as auxiliary tools to expand physicians' diagnostic considerations but should be integrated with clinical expertise for comprehensive judgment.
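The reported Z-score and P value can be checked against each other. The following is a minimal sketch, assuming the Z-score is a standard-normal test statistic evaluated two-sided; the abstract does not name the test that produced it, and the function name `two_sided_p_from_z` is hypothetical, introduced only for this illustration.

```python
import math


def two_sided_p_from_z(z: float) -> float:
    """Two-sided p-value for a test statistic assumed to be standard normal."""
    # Standard normal CDF via the error function:
    # Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
    phi = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    # Probability mass in both tails beyond |z|
    return 2.0 * (1.0 - phi)


# Z-score taken from the abstract
p = two_sided_p_from_z(-0.232)
print(f"{p:.2f}")  # prints 0.82
```

Under that assumption, a Z-score of -0.232 corresponds to a two-sided P of about 0.82, which matches the value reported in the abstract.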
Similar works
The Strengths and Difficulties Questionnaire: A Research Note
1997 · 14,619 citations
Making sense of Cronbach's alpha
2011 · 13,874 citations
QUADAS-2: A Revised Tool for the Quality Assessment of Diagnostic Accuracy Studies
2011 · 13,674 citations
A method for estimating the probability of adverse drug reactions
1981 · 11,493 citations
Evidence-Based Medicine
1992 · 4,156 citations