This is an overview page with metadata for this scientific article. The full article is available from the publisher.

Evaluating the Quality and Reliability of Large Language Models for Plastic Surgery Patient Education: A Comparative Analysis of ChatGPT and OpenEvidence

2025 · 1 citation · Aesthetic Surgery Journal

Abstract

Background: Concerns regarding information inaccuracy when using general-purpose large language models have prompted the search for alternative tools. OpenEvidence has emerged as a healthcare-focused large language model trained exclusively on data from peer-reviewed medical literature.

Objectives: This study compared the quality, accuracy, and readability of aesthetic surgery patient education materials generated by OpenEvidence and ChatGPT.

Methods: A standardized prompt requesting comprehensive postoperative discharge instructions for 20 of the most common aesthetic surgery procedures was entered into OpenEvidence and ChatGPT-5. Outputs were evaluated using 4 validated assessment tools: the DISCERN instrument for information quality (1-5), the Patient Education Materials Assessment Tool for Printable Materials (PEMAT-P) for information understandability and actionability (0-100), the Flesch-Kincaid scale for estimated grade level (fifth grade to professional level) and reading ease (0-100), and a Likert scale for citation accuracy (1-4).

Results: OpenEvidence scored significantly higher than ChatGPT-5 on DISCERN (3.3 ± 0.4 vs 1.7 ± 0.4, P < .001) and on the citation accuracy scale (2.4 ± 1.3 vs 1.5 ± 0.7, P = .007). Scores were comparable between the two tools in PEMAT-P understandability (71 ± 5 vs 69 ± 0, P = .3) and actionability (52 ± 12 vs 54 ± 5, P = .6), as well as on the Flesch-Kincaid Grade Level (9.3 ± 1.0 vs 9.2 ± 0.6, P = .8) and the Flesch Reading Ease Score (40.0 ± 6.6 vs 41.0 ± 5.5, P = .6).

Conclusions: OpenEvidence generated materials of significantly higher quality and reliability than ChatGPT, suggesting it may serve as a more reliable alternative for patient education in aesthetic surgery practice.
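For readers unfamiliar with the two readability measures named in the abstract, the sketch below shows the standard published formulas for the Flesch Reading Ease Score and the Flesch-Kincaid Grade Level. It is only an illustration: the function names and the sample counts are hypothetical, the abstract does not say which software the authors used, and counting words, sentences, and syllables in real text is left out.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula; higher scores mean easier text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level formula; approximates a US school grade."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for illustration only (not taken from the study).
sample = {"words": 100, "sentences": 5, "syllables": 160}
ease = flesch_reading_ease(**sample)
grade = flesch_kincaid_grade(**sample)
print(f"Reading ease: {ease:.1f}, grade level: {grade:.1f}")
```

Both formulas use the same two ratios, words per sentence and syllables per word, so longer sentences and longer words lower the reading ease and raise the grade level.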

Institutions

NYU Langone Health (US)

Topics

Artificial Intelligence in Healthcare and Education
Social Media in Health Education
Diversity and Career in Medicine
