This is an overview page with metadata for this scientific work. The full article is available from the publisher.

No Medical Students Needed? Automating Chart Review With Large Language Models

2026 · 0 citations · Zenodo (CERN European Organization for Nuclear Research) · Open Access

Abstract

PURPOSE: Manual chart review is an essential but time-intensive aspect of surgical research. While structured data such as laboratory values or demographics can be pulled directly from the electronic health record, many factors exist only in free-text notes that require manual review. Large language models (LLMs), such as ChatGPT and Google's Gemini, can automate this process by "reading" notes to extract data. Prior work shows high accuracy of LLM-automated review for single variables from one clinical note type, for example pathology reports. However, chart reviews generally require extraction of diverse variables from multiple note types. This study evaluates LLM-driven chart review strategies and examines performance across clinical variables.

METHODS: Clinical notes from 222 glossectomy patients were collected (2009-2024), including operative, pathology, radiology, and progress notes. Two medical students performed manual chart review of 46 perioperative variables as the reference. Data were extracted using Google's HIPAA-compliant Gemini model (2.0-flash-001) with two prompting strategies: a basic prompt listing the variables to extract, and an extended prompt that included variable definitions, response options, and examples. Each prompt was applied to three input formats: concatenated notes, where all of a patient's documentation was combined; grouped notes, where documentation was divided into shorter segments of approximately 3,000 words; and summarized notes, where Gemini first generated a patient summary that was then used for variable extraction. The six strategies (2 prompts by 3 note formats) were piloted in 37 patients to compare accuracy, time, and cost. The most effective method was then applied to the full cohort, and accuracy was examined.

RESULTS: In the 37 pilot patients, Gemini extraction with the extended (more detailed) prompt outperformed the basic prompt, achieving 87.4% accuracy with concatenated notes compared to 69.0% (p < 0.01). Extended prompting on summarized notes yielded similarly high accuracy (86.4%, p = 0.4) while being 28.3 times more cost-efficient and 1.8 times faster, because short patient summaries have a lower word count than the complete documented record, which reduces processing time and cost. Accuracy differed by variable. Categorical and procedural elements such as type of glossectomy (100%) and prior chemotherapy (95.5%) were most reliable. More nuanced variables such as biopsy diagnosis (94.1%) and depth of invasion (84.2%) were moderately accurate, as they may be documented at multiple points in the disease course, with updated results following additional resections. Inconsistently reported variables such as drinks per week and BMI had the lowest accuracy, at 64.9% and 51.4%, respectively.

CONCLUSION: Gemini accurately extracted diverse perioperative variables from multiple note types, with extended prompting and summarized inputs providing the best balance of accuracy and efficiency. Variable-level analysis revealed strong performance for operative and treatment details, while clinically nuanced or inconsistently documented variables remained more error-prone. These findings highlight both the promise and the limitations of LLM-driven chart review and support integrating LLMs into research pipelines to accelerate data abstraction. Beyond research, the accurate patient summaries created may benefit clinicians by streamlining review of complex medical records, improving efficiency in decision-making and patient care.

*Source: https://ps-rc.org/meeting/Program/2026/EP98.cgi*
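The strategy grid and variable-level scoring described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not say how notes were split into segments or how an extracted value was judged correct, so the word-boundary splitting and exact-match scoring below are assumptions, and all function and variable names (`group_notes`, `strategy_grid`, `accuracy_by_variable`) are hypothetical. The model call itself is deliberately left out.

```python
from itertools import product

# The two prompt types and three input formats named in the abstract.
PROMPTS = ("basic", "extended")
NOTE_FORMATS = ("concatenated", "grouped", "summarized")


def group_notes(notes, max_words=3000):
    """Combine a patient's notes and split them into segments of at most
    max_words words. Assumption: splitting on plain word boundaries; the
    study may have split differently (e.g. on note boundaries)."""
    words = " ".join(notes).split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def strategy_grid():
    """Enumerate the six strategies (2 prompts by 3 note formats)."""
    return list(product(PROMPTS, NOTE_FORMATS))


def accuracy_by_variable(extracted, reference):
    """Per-variable accuracy against the manual reference.

    extracted and reference are lists with one dict per patient, each
    mapping variable name -> value. Assumption: a value counts as correct
    only on exact equality with the reference."""
    totals, correct = {}, {}
    for ext, ref in zip(extracted, reference):
        for var, ref_val in ref.items():
            totals[var] = totals.get(var, 0) + 1
            if ext.get(var) == ref_val:
                correct[var] = correct.get(var, 0) + 1
    return {var: correct.get(var, 0) / totals[var] for var in totals}


if __name__ == "__main__":
    for prompt, note_format in strategy_grid():
        print(prompt, note_format)
```

Exact-match scoring is the simplest choice but is strict for free-text or numeric variables such as BMI, where a tolerance or a normalization step might be more appropriate; the abstract does not specify which was used.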

Topics

Artificial Intelligence in Healthcare and Education · Radiology practices and education · AI in cancer detection
