This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Beyond the First Generated Summarization: Comparing Human and AI-Generated Reports in Geriatric Outpatient Consultations on Content, Linguistic Form, and Healthcare Professionals’ Reporting Preferences (Preprint)

2025 · 0 citations · Open Access

Abstract

BACKGROUND: Administrative reporting adds substantially to healthcare professionals’ (HCPs’) workload, reduces the time available for patients, and contributes to HCP burnout and high staffing costs. Automated medical reporting (AMR) systems, which use large language models to generate medical reports automatically, have been proposed as a solution, but despite their promised benefits their clinical validity remains uncertain. While some studies address content accuracy, the linguistic form of AMR reports and their alignment with HCP preferences are understudied.

OBJECTIVE: The goal is 1) to evaluate the first development iteration of an AMR system in terms of content (what is reported), linguistic form (how it is reported), and HCPs’ reporting preferences; and 2) to generate insights for the AMR system’s further development.

METHODS: We evaluated the first development iteration of an AMR system for geriatric outpatient care, focusing on the history-taking section (or anamnesis) of the multifaceted and time-consuming Comprehensive Geriatric Assessment (CGA). This first iteration of the CGA reporting model was developed based on the Dutch CGA guidelines and the CGA blueprints used at the hospital where this study was conducted. A mixed-methods design was used, comprising: (1) a content comparison of ten audio-recorded consultations and their corresponding conventional and AMR reports, (2) a focus group in which HCPs discussed reporting differences and their reporting preferences were elicited, and (3) a linguistic sentence complexity analysis as a proxy for linguistic form.

RESULTS: Compared to the conventional reports, AMR reports were shorter, contained less and different information (including hallucinations), repeated more content, had higher sentence complexity, and adhered to a rigid structure. HCPs evaluated AMR as too concise and conclusive, and as lacking specific textual elements they deemed important for medical and legal accountability. However, HCPs appreciated AMR’s structural organization.

CONCLUSIONS: This study yielded multiple insights for the further development of the AMR system used, specifically on information selection, how that information is linguistically realized, and HCPs’ reporting preferences. It shows that AMR implementations should be evaluated not only on report content but also on linguistic form and HCP reporting preferences. This methodology serves as an initial proof-of-concept validation framework that future research can use to evaluate AI technology beyond content-based measures.

Topics

Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare · Clinical Reasoning and Diagnostic Skills
