This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Readability and quality of information of AI-generated patient education materials on familial adenomatous polyposis.
Citations: 0
Authors: 6
Year: 2026
Abstract
Background: Artificial intelligence (AI) and large language model (LLM) chatbots are increasingly consulted by patients for health information. Prior studies evaluating chatbot-generated patient education materials (PEMs) across various medical conditions have shown acceptable quality but poor readability. However, no research to date has examined familial adenomatous polyposis (FAP), a hereditary colorectal cancer syndrome with substantial patient education needs.

Methods: An observational cross-sectional study was conducted in August 2025. Fourteen standardized questions about FAP, obtained from the Cleveland Clinic, were posed to five AI chatbots: ChatGPT, Microsoft Copilot, Google Gemini, Perplexity, and Claude AI. A PEM was constructed from each model's responses, and these chatbot-generated PEMs were then assessed for readability and quality. Readability was assessed as the mean of six validated tools: Flesch Reading Ease Score (FRES), Flesch-Kincaid Grade Level (FKGL), Gunning Fog Index (GFI), Coleman-Liau Index (CLI), Simple Measure of Gobbledygook (SMOG) Index, and Automated Readability Index (ARI). Quality was assessed with modified versions of two validated instruments: the Patient Education Materials Assessment Tool (PEMAT; 0–100%) and DISCERN (1–5).

Results: The mean reading grade level (RGL) across all chatbots was 12.44. RGLs for ChatGPT, Copilot, Gemini, Perplexity, and Claude were 12.56, 11.20, 12.26, 13.50, and 12.68, respectively. The mean DISCERN score across chatbots was 3.97, with scores of 3.85 (ChatGPT), 3.90 (Copilot), 3.95 (Gemini), 3.97 (Perplexity), and 3.79 (Claude). Mean PEMAT scores were 89% for understandability and 68% for actionability; individual (understandability, actionability) scores were ChatGPT (89%, 75%), Copilot (89%, 75%), Gemini (94%, 58%), Perplexity (89%, 58%), and Claude (89%, 58%).

Conclusions: AI chatbots produce PEMs of above-average quality but poor readability when addressing FAP. The readability level obtained corresponds to at least a college education, which exceeds both the average adult reading level and the nationally recommended standard for patient education materials.
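The readability formulas named in the Methods are simple functions of sentence, word, and syllable counts. As an illustration of how a reading grade level is derived, here is a minimal Python sketch of two of the six instruments, FKGL and FRES, using their published coefficients; the vowel-group syllable counter is a naive stand-in, not the validated tokenization the study's scoring tools would use:

```python
import re


def count_syllables(word: str) -> int:
    # Naive heuristic: one syllable per run of consecutive vowels
    # (including y); real readability tools use validated counters.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def readability(text: str) -> dict:
    # Split into sentences and words with simple regexes.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)

    n_sent = max(1, len(sentences))
    n_words = max(1, len(words))
    wps = n_words / n_sent      # average words per sentence
    spw = syllables / n_words   # average syllables per word

    return {
        # Flesch-Kincaid Grade Level: higher = harder (school grade).
        "FKGL": 0.39 * wps + 11.8 * spw - 15.59,
        # Flesch Reading Ease Score: higher = easier (0-100 typical).
        "FRES": 206.835 - 1.015 * wps - 84.6 * spw,
    }
```

For example, a short plain sentence scores a low FKGL and a high FRES, while dense clinical prose ("Familial adenomatous polyposis necessitates comprehensive multidisciplinary surveillance strategies.") moves sharply in the opposite direction, which is the pattern behind the college-level grades reported above.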