OpenAlex · Aktualisierung stündlich · Letzte Aktualisierung: 22.05.2026, 15:36

Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.

Assessment of ChatGPT’s adherence to evidence-based clinical practice guidelines for plantar fasciitis management

2025·3 Zitationen·Journal of Orthopaedic Surgery and ResearchOpen Access
Volltext beim Verlag öffnen

3

Zitationen

6

Autoren

2025

Jahr

Abstract

PURPOSE: This study aimed to test the multidimensional performance of Chat-Generative Pre-trained Transformer (ChatGPT) in generating recommendations for the management of plantar fasciitis (PF) that adhere to well-established clinical practice guidelines. MATERIALS AND METHODS: 21 queries were raised from the 2023 APTA guideline recommendations for PF and prompted into ChatGPT-4o and ChatGPT-4 Turbo. Two experienced orthopaedic physicians evaluated the responses for accuracy, consistency, self-awareness, and fabrication and falsification using five-point Likert scales. The group-wise comparisons were conducted between the two models and subgroups. RESULTS: The interrater agreement between evaluators was moderate to good (intraclass correlation coefficients of 0.573-0.757). Both versions of ChatGPT were outperformed and comparable across all dimensions, including accuracy ([4.1 ± 0.8] vs. [4.1 ± 0.7], P = 0.959), consistency ([4.6 ± 0.5] vs. [4.6 ± 0.6], P = 0.890), self-awareness ([4.3 ± 0.6] vs. [4.5 ± 0.5], P = 0.407), and fabrication and falsification ([4.6 ± 0.6] vs. [4.5 ± 0.4], P = 0.681). In the subgroup comparisons, better performance was identified in closed-ended questions and for positive rather than negative recommendations (P < 0.05). No significant differences were found between recommendation strength subgroups, except in fabrication and falsification ([4.4 ± 0.6] vs. [5.0 ± 0], P = 0.001). CONCLUSIONS: The two mainstream versions of ChatGPT showed comparable and superior performance in generating recommendations concordant with clinical guidelines for PF management. However, notable specific issues included performance variations between different prompt strategies, recommendation grades, and recommendation type, and the models should still be utilized with caution.

Ähnliche Arbeiten

Autoren

Institutionen

Themen

Tendon Structure and TreatmentArtificial Intelligence in Healthcare and EducationDupuytren's Contracture and Treatments
Volltext beim Verlag öffnen