OpenAlex · Updated hourly · Last updated: 20.05.2026, 12:15

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Evaluating the Evolution of ChatGPT as an Information Resource in Shoulder and Elbow Surgery

2025 · 5 citations · Orthopedics
Citations: 5
Authors: 6
Year: 2025

Abstract

Background: The purpose of this study was to evaluate the performance and evolution of Chat Generative Pre-Trained Transformer (ChatGPT; OpenAI) as a resource for shoulder and elbow surgery information by assessing its accuracy on the American Academy of Orthopaedic Surgeons shoulder-elbow self-assessment questions. We hypothesized that both ChatGPT models would demonstrate proficiency and that there would be significant improvement with progressive iterations.

Materials and Methods: A total of 200 questions were selected from the 2019 and 2021 American Academy of Orthopaedic Surgeons shoulder-elbow self-assessment questions. ChatGPT 3.5 and 4 were used to evaluate all questions. Questions with non-text data were excluded (114 questions). Remaining questions were input into ChatGPT and categorized as follows: anatomy, arthroplasty, basic science, instability, miscellaneous, nonoperative, and trauma. ChatGPT's performances were quantified and compared across categories with chi-square tests. The continuing medical education credit threshold of 50% was used to determine proficiency. Statistical significance was set at P < .05.

Results: ChatGPT 3.5 and 4 answered 52.3% and 73.3% of the questions correctly, respectively (P = .003). ChatGPT 3.5 performed significantly better in the instability category (P = .037). ChatGPT 4's performance did not significantly differ across categories (P = .841). ChatGPT 4 performed significantly better than ChatGPT 3.5 in all categories except instability and miscellaneous.

Conclusion: ChatGPT 3.5 and 4 exceeded the proficiency threshold. ChatGPT 4 performed better than ChatGPT 3.5, showing an increased capability to correctly answer shoulder and elbow-focused questions. Further refinement of ChatGPT's training may improve its performance and utility as a resource. Currently, ChatGPT remains unable to answer questions at a high enough accuracy to replace clinical decision-making. [Orthopedics. 2025;48(2):e69–e74.]
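The overall comparison in the abstract (52.3% vs 73.3% correct, compared with a chi-square test) can be illustrated with a minimal sketch. Several things here are assumptions rather than details from the paper: the number of analyzed questions is taken as 200 - 114 = 86, the correct-answer counts are back-calculated from the rounded percentages, and the test is a plain Pearson chi-square on a 2x2 table without continuity correction. The function name `chi_square_2x2` is hypothetical. Because the abstract does not state the exact counts or test variant, the resulting p-value need not equal the reported P = .003.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Assumption: 200 selected questions minus 114 excluded leaves 86 analyzed.
total = 200 - 114

# Assumption: counts back-calculated from the rounded percentages in the abstract.
correct_35 = round(0.523 * total)  # ChatGPT 3.5
correct_4 = round(0.733 * total)   # ChatGPT 4

stat, p = chi_square_2x2(
    correct_35, total - correct_35,
    correct_4, total - correct_4,
)
print(f"chi-square = {stat:.4f}, p = {p:.4f}")

# Proficiency check against the 50% continuing medical education threshold.
for name, correct in (("ChatGPT 3.5", correct_35), ("ChatGPT 4", correct_4)):
    print(name, "exceeds threshold:", correct / total > 0.50)
```

Under these assumptions the difference between the two models is significant at the P < .05 level used in the study, and both models clear the 50% threshold, in line with the abstract's conclusions.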

Topics

Artificial Intelligence in Healthcare and Education · Cardiac, Anesthesia and Surgical Outcomes · Congenital Heart Disease Studies