This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
The Performance of Artificial Intelligence in Providing Real‐Time Aid in Emergency Dental Trauma: A Clinical Validation Study
Citations: 3
Authors: 6
Year: 2025
Abstract
Background: Non-experts who search online for dental emergency treatment can receive unreliable guidance. We prospectively tested ChatGPT-4o, the first publicly available multimodal large language model, on real emergency-department avulsion cases to determine whether it would deliver guideline-correct, time-critical directions within seconds.

Methods: Seventy-eight anonymized avulsion charts (42 permanent, 36 primary teeth; 39 dry, 39 moist; 40 immature roots) were rewritten as lay prompts. ChatGPT-4o generated one response to each vignette in each of two sessions held 14 days apart (156 responses). Three oral and maxillofacial surgeons (OMFS) scored diagnostic accuracy, immediate action, contraindication identification, and completeness. Three lay assessors scored clarity (0–15 composite rating). An additional time-critical safety flag required simultaneous accuracy in immediate action and contraindication advice. Statistical analysis was performed at a 95% confidence level.

Results: ChatGPT-4o demonstrated significant rates of accurate guidance. Inter-rater reproducibility was near perfect (ICC = 0.94; κ = 0.88–0.998). The median composite score was 13 (IQR 12–14). Permanent dentition increased the probability of perfect diagnostic, contraindication, and immediate-action scores (p ≤ 0.046), whereas extra-oral dry time lowered immediate-action scores (p = 0.003) and reduced completeness (p = 0.023). Root maturity had no effect. Clarity was rated at more than 93% in both sessions. The safety flag was present in 81% and 89% of cases (χ² = 6.73, p = 0.009), with one in eight situations potentially unsafe.

Conclusions: This first clinical validation of ChatGPT-4o demonstrates expert-level, reproducible triage for tooth avulsion and introduces the "time-critical safety" composite as a strict benchmark for emergency chatbots. Guideline-linked retrieval is still needed before unsupervised deployment.
Clinically, these findings show that while ChatGPT can offer quick and largely accurate advice, the remaining deficiencies highlight the risk of incomplete or unsafe guidance during emergencies.
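The time-critical safety flag described in the Methods can be illustrated with a minimal sketch. It assumes each response carries two yes/no ratings, one for immediate action and one for contraindication advice. The abstract does not give the actual scoring scales, so the `ResponseScore` type, its field names, and the sample ratings are hypothetical and are not study data.

```python
from dataclasses import dataclass


@dataclass
class ResponseScore:
    # Hypothetical per-response ratings; the real scoring scales
    # are not stated in the abstract.
    immediate_action_correct: bool
    contraindication_correct: bool


def time_critical_safe(score: ResponseScore) -> bool:
    """Set the flag only when both time-critical criteria are correct."""
    return score.immediate_action_correct and score.contraindication_correct


def safety_rate(scores: list[ResponseScore]) -> float:
    """Return the share of responses with the safety flag set."""
    if not scores:
        raise ValueError("no scores given")
    return sum(time_critical_safe(s) for s in scores) / len(scores)


# Made-up ratings for illustration only (not data from the study).
example = [
    ResponseScore(True, True),
    ResponseScore(True, False),
    ResponseScore(False, True),
    ResponseScore(True, True),
]
print(safety_rate(example))  # 0.5
```

Because the flag is a conjunction, a response that is correct on only one of the two criteria counts as unsafe, which is what makes the composite stricter than either score alone.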