This is an overview page with metadata for this scientific article. The full article is available from the publisher.
AI-generated Feedback Following Social Robotic Virtual Patient Interactions and Medical Student Performance: Nonrandomized Quasi-Experimental Study (Preprint)
Citations: 0
Authors: 12
Year: 2025
Abstract
<sec> <title>BACKGROUND</title> Virtual patients (VPs) demonstrate effectiveness in improving clinical reasoning skills; however, traditional VP platforms often lack individualized feedback mechanisms. Advances in large language models (LLMs) enable automated analysis of student-VP interactions, providing scalable feedback on clinical performance. While artificial intelligence (AI)–enhanced social robotic VP platforms show promise for clinical reasoning training, no studies have examined whether AI-generated feedback integrated in such platforms improves clinical performance in standardized assessments. </sec> <sec> <title>OBJECTIVE</title> This study evaluated whether AI-generated postconsultation feedback integrated into social robotic VP interactions improves medical students’ clinical performance, emphasizing medical history taking and communication. </sec> <sec> <title>METHODS</title> A quasi-experimental study with 115 sixth-semester medical students (73.2% of 157 eligible students) was conducted at Karolinska Institutet, Stockholm, Sweden, during spring 2025. Students were allocated by hospital site to receive (n=61, 53%) or not receive (n=54, 47%) AI-generated feedback following interactions with a Social AI-Enhanced Robotic Interface. All students completed 9 VP cases; the intervention group received approximately 1 page of structured feedback after each VP case. The feedback system used multiple LLMs following a 2-stage algorithm: assessing student-VP dialogues using an assessment rubric, then generating structured feedback on history-taking performance. Both groups participated in case-specific follow-up seminars led by consultant rheumatologists following each VP encounter.
Clinical performance was assessed through an 8-minute objective structured clinical examination (OSCE)-based evaluation, with a standardized patient portraying axial spondyloarthritis, evaluated by a blinded consultant rheumatologist using a 10-point rubric across 5 domains: communication at consultation start, generic medical history, targeted medical history, diagnostics and management reasoning, and communication at consultation end. </sec> <sec> <title>RESULTS</title> Students receiving AI-generated feedback achieved significantly higher total OSCE scores (mean 7.39, SD 0.86 vs mean 6.68, SD 1.04 points; mean difference 0.70; 95% CI 0.35-1.06; <i>P</i>&lt;.001; Cohen <i>d</i>=0.74). Domain-specific analysis revealed significant improvement in generic medical history after Bonferroni correction (mean 2.46, SD 0.65 vs mean 2.03, SD 0.79 points; <i>P</i>=.004; <i>r</i>=0.27), while other domains showed no significant differences: communication at start (<i>P</i>=.13; <i>r</i>=0.14), targeted medical history taking (<i>P</i>=.60; <i>r</i>=0.05), diagnostics and management (<i>P</i>=.14; <i>r</i>=0.14), and communication at consultation end (<i>P</i>=.31; <i>r</i>=0.09). Pass rates were significantly higher in the feedback group (96.7% vs 79.6%; odds ratio 7.55, 95% CI 1.51-72.2; <i>P</i>=.006), with a number needed to assess of 6 students, that is, for every 6 students receiving feedback, 1 additional student passed the assessment. </sec> <sec> <title>CONCLUSIONS</title> AI-generated feedback following social robotic VP interactions significantly improved medical students’ OSCE-based performance, particularly in generic medical history taking. These findings support integrating validated AI feedback systems as a supplement to expert-led teaching during VP simulations for clinical training and demonstrate the feasibility of scalable, automated feedback in medical education.
The domain-specific improvements in generic medical history highlight the importance of targeted, competency-specific feedback design in VP platforms. </sec>
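The 2-stage feedback algorithm described in the METHODS section (rubric-based assessment of the dialogue, then generation of structured feedback) can be sketched as follows. This is a minimal illustration only: the rubric domains are borrowed from the OSCE domains named in the abstract, and `call_llm`, the prompts, and all function names are hypothetical stand-ins, not the authors' implementation.

```python
from dataclasses import dataclass

# Illustrative rubric domains, mirroring the 5 OSCE domains in the abstract.
RUBRIC_DOMAINS = [
    "communication at consultation start",
    "generic medical history",
    "targeted medical history",
    "diagnostics and management reasoning",
    "communication at consultation end",
]

def call_llm(prompt: str) -> str:
    """Stub standing in for any chat-completion API call (assumption).

    A real system would send the prompt to an LLM; here we return a
    fixed rubric score so the sketch runs without external services.
    """
    return "2"

@dataclass
class Assessment:
    scores: dict  # domain -> rubric score

def assess_dialogue(dialogue: str) -> Assessment:
    """Stage 1: score the student-VP dialogue against each rubric domain."""
    scores = {}
    for domain in RUBRIC_DOMAINS:
        prompt = (
            f"Rate the student's '{domain}' in this dialogue on a 0-2 "
            f"rubric scale:\n{dialogue}"
        )
        scores[domain] = int(call_llm(prompt))
    return Assessment(scores=scores)

def generate_feedback(assessment: Assessment) -> str:
    """Stage 2: turn rubric scores into structured narrative feedback."""
    summary = "\n".join(f"- {d}: {s}/2" for d, s in assessment.scores.items())
    prompt = (
        "Write about one page of structured feedback on history taking "
        f"for a student with these rubric scores:\n{summary}"
    )
    return call_llm(prompt)

# End-to-end: assess a (toy) dialogue, then generate feedback from the scores.
feedback = generate_feedback(assess_dialogue("Student: What brings you in today? ..."))
```

The key design point the abstract highlights is the separation of stages: assessment against an explicit rubric first, feedback generation second, so the feedback is grounded in scores rather than produced in one free-form pass.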
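The pass-rate statistics in the RESULTS section can be cross-checked with a short calculation. A minimal sketch, assuming per-group passing counts (59 of 61, 43 of 54) back-calculated from the reported rates of 96.7% and 79.6%:

```python
# Assumed passing counts, back-calculated from the reported pass rates.
feedback_pass, feedback_n = 59, 61   # 59/61 ~ 96.7%
control_pass, control_n = 43, 54     # 43/54 ~ 79.6%

feedback_rate = feedback_pass / feedback_n
control_rate = control_pass / control_n

# Absolute difference in pass rates, and the "number needed to assess":
# how many students must receive feedback for 1 additional student to pass.
arr = feedback_rate - control_rate            # ~0.171
number_needed_to_assess = 1 / arr             # ~5.9, reported as 6

# Unadjusted odds ratio of passing with vs without feedback.
odds_ratio = (feedback_pass / (feedback_n - feedback_pass)) / (
    control_pass / (control_n - control_pass)
)                                             # ~7.55, matching the abstract
```

With these counts, the computed odds ratio (~7.55) and number needed to assess (~6) reproduce the values reported in the abstract, which supports the back-calculated counts as a plausible reading.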
Related works
The Strengths and Difficulties Questionnaire: A Research Note
1997 · 14,611 citations
Making sense of Cronbach's alpha
2011 · 13,863 citations
QUADAS-2: A Revised Tool for the Quality Assessment of Diagnostic Accuracy Studies
2011 · 13,657 citations
A method for estimating the probability of adverse drug reactions
1981 · 11,485 citations
Evidence-Based Medicine
1992 · 4,153 citations