This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Analyzing the Role of AI in Resident Education: An Evaluation of ChatGPT on Ophthalmology Trainee Examination Questions by Subtopic
Citations: 0
Authors: 7
Year: 2025
Abstract
Background: ChatGPT is a large language model trained on various datasets to learn, analyze, and generate human-like answers to users' questions. To assess its applicability to medical education, more information is needed on whether its analyses yield accurate and coherent responses. The aim of this study was to characterize ChatGPT's responses to ophthalmology questions by subtopic, to determine where the system might be used reliably in resident education and where its performance remains weak.

Methods: Ophthalmology questions were obtained from a widely used study resource, OphthoQuestions. Thirteen sections, each covering a different ophthalmic subtopic, were sampled, and questions were collected from each section. Questions containing images or tables were excluded. Of 163 questions and their respective answer choices, 131 were input into ChatGPT-3.5. ChatGPT's accuracy by subtopic was analyzed in Excel, and responses were evaluated for properties of natural coherence. Incorrect responses were categorized as logical fallacy, informational fallacy, or explicit fallacy. Statistical significance of categorical variables was assessed using the χ² test.

Results: ChatGPT answered 71 of 131 questions correctly (54.2%). Accuracy by subtopic was as follows: general medicine (90%), oculoplastics (70%), retina and vitreous (70%), cornea (30%), fundamentals (40%), optics (40%), pediatrics (40%), glaucoma (50%), lens and cataract (50%), neuro-ophthalmology (60%), pathology and tumors (60%), refractive surgery (55%), and uveitis (50%). Logical reasoning, internal information, and external information were identified in 82.4%, 100%, and 83.2% of responses, respectively. The use of logical reasoning (P = 0.003) and external information (P = 0.02) differed significantly between correct and incorrect responses.
Conclusion: ChatGPT scored higher in general medicine, oculoplastics, and retina and vitreous than in cornea, fundamentals, optics, and pediatrics. Identifying subtopics in which ChatGPT performs less well allows learners to acquire appropriate supplemental resources in these areas.
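The χ² analysis of categorical variables described in the Methods (e.g., presence of logical reasoning stratified by correct vs. incorrect responses) can be sketched in a few lines of pure Python. The marginal totals below (108 of 131 responses with logical reasoning, 71 correct) follow the abstract, but the individual cell counts are illustrative placeholders, not the study's data:

```python
import math

def chi2_2x2(table):
    """Pearson chi-square statistic and p-value (1 degree of freedom)
    for a 2x2 contingency table, without continuity correction."""
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    n = sum(row)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # For 1 dof, the chi-square variable is the square of a standard
    # normal, so P(X > x) = erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

# Illustrative cell counts (NOT taken from the paper):
# rows = logical reasoning present / absent, cols = correct / incorrect
table = [[65, 43],
         [6, 17]]
stat, p = chi2_2x2(table)
print(f"chi2 = {stat:.2f}, p = {p:.4f}")
```

A library implementation such as `scipy.stats.chi2_contingency` would normally be used instead; the hand-rolled version is shown only to make the computation explicit.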
Similar Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,436 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,311 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,753 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,523 citations