OpenAlex · Updated hourly · Last updated: 13 May 2026, 02:22

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Accuracy and reliability of Manus, ChatGPT, and Claude in case-based dental diagnosis

2026 · 1 citation · Frontiers in Oral Health · Open Access
Open full text at publisher

Citations: 1

Authors: 5

Year: 2026

Abstract

Introduction: Artificial intelligence (AI), particularly large language models (LLMs), is transforming healthcare education and clinical decision-making. While models like ChatGPT and Claude have demonstrated utility in medical contexts, their performance in dental diagnostics remains underexplored; additionally, the potential of emerging platforms, like Manus, is yet to be evaluated. Objective: To compare the diagnostic accuracy and consistency of ChatGPT, Claude, and Manus using authentic, case-based dental scenarios. Methods: A set of 117 multiple-choice questions based on validated clinical dental vignettes spanning various specialities was administered to each model under standardised conditions at two separate time points. Responses were scored against expert-validated answer keys. Intra-model reliability was assessed using Cohen's kappa, and statistical comparisons were made using the chi-square, McNemar, and t-tests. Results: Claude and Manus consistently outperformed ChatGPT across both testing phases. In the second round, Claude and Manus each achieved a diagnostic accuracy of 92.3%, compared to ChatGPT's 76.9%. Claude and Manus also demonstrated higher intra-model consistency (Cohen's kappa = 0.714 and 0.782, respectively) than ChatGPT (kappa = 0.560). Although the numerical trends favoured Claude and Manus, pairwise differences in accuracy did not reach statistical significance. Conclusion: Claude and Manus demonstrated numerically higher diagnostic performance and greater response stability compared with ChatGPT; however, these differences did not reach statistical significance and should therefore be interpreted cautiously. This variability across models highlights the need for larger-scale evaluations. These findings underscore the importance of considering both accuracy and consistency when selecting AI tools for integration into dental practice and curricula.
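The abstract's consistency metric, Cohen's kappa, compares a model's answers across the two testing rounds while correcting for chance agreement. A minimal sketch of that calculation, using hypothetical correct/incorrect outcomes rather than the study's actual data:

```python
# Illustrative sketch (hypothetical data, not the study's): Cohen's kappa
# between one model's answers at two time points, the paper's measure of
# intra-model consistency.

def cohens_kappa(ratings1, ratings2):
    """Cohen's kappa for two paired lists of categorical ratings."""
    assert len(ratings1) == len(ratings2) and ratings1
    n = len(ratings1)
    categories = set(ratings1) | set(ratings2)
    # Observed agreement: fraction of items rated the same in both rounds.
    p_o = sum(a == b for a, b in zip(ratings1, ratings2)) / n
    # Expected agreement if the two rounds were independent.
    p_e = sum(
        (ratings1.count(c) / n) * (ratings2.count(c) / n)
        for c in categories
    )
    return (p_o - p_e) / (1 - p_e)

# Hypothetical outcomes on 10 questions across two rounds.
round1 = ["correct", "correct", "wrong", "correct", "wrong",
          "correct", "correct", "wrong", "correct", "correct"]
round2 = ["correct", "correct", "wrong", "wrong", "wrong",
          "correct", "correct", "correct", "correct", "correct"]
print(round(cohens_kappa(round1, round2), 3))  # → 0.524
```

A kappa near the study's reported 0.714–0.782 would indicate substantial round-to-round stability; values around 0.5, as in this toy example, indicate only moderate agreement.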

Related works

Authors

Institutions

Topics

Artificial Intelligence in Healthcare and Education · Clinical Reasoning and Diagnostic Skills · Dental Radiography and Imaging