This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Comparative analysis of large language models and clinician responses in patient blood management knowledge
Citations: 2
Authors: 14
Year: 2025
Abstract
BACKGROUND: Large language models (LLMs) are increasingly used in the medical field and have the potential to reduce workload and improve treatment procedures in clinical practice. This study evaluates the capabilities of LLMs to answer common questions related to patient blood management (PBM) and compares their performance with the expertise of clinicians from two university hospitals.
METHODS: To evaluate the performance of ChatGPT-3.5, ChatGPT-4o, and Google Gemini in answering PBM-related questions, we used a representative sample of 40 questions (30 single-choice and 10 frequently asked patient questions) and compared the models' responses with those of clinicians. The accuracy and interrater reliability of the answers were analyzed.
RESULTS: For PBM knowledge-based questions, the proportion of correct answers was 96.4% (95% CI: 93.6-98.0%) for ChatGPT-4o, 81.3% (95% CI: 77.0-85.7%) for ChatGPT-3.5, and 84.0% (95% CI: 79.4-87.7%) for Google Gemini. Clinicians (N = 82) provided correct answers to 76.5% (95% CI: 74.7-78.1%) of the questions. For frequently asked patient questions, the proportion of correct answers was 100% for ChatGPT-4o, 95.5% (95% CI: 91.4-99.6%) for ChatGPT-3.5, and 91.7% (95% CI: 86.0-97.4%) for Google Gemini. Clinicians provided correct answers to 62.0% (95% CI: 58.7-65.3%) of the questions. Across all categories (anemia management, iron supplementation, cell salvage, principles of PBM, and blood transfusion), ChatGPT-4o achieved the highest scores, providing the most correct answers.
CONCLUSIONS: LLMs show strong potential for delivering accurate and comprehensive responses to common PBM-related questions. However, it remains essential for clinicians and patients to verify responses, particularly in critical situations.