OpenAlex · Updated hourly · Last updated: 14 May 2026, 12:17

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

S3012 Integrating Iterative Large Language Model Collaboration Into Complex Gastrointestinal Case Management: A Pilot Evaluation

2025 · 0 citations · The American Journal of Gastroenterology

Citations: 0 · Authors: 5 · Year: 2025

Abstract

Introduction: Gastroenterological emergencies such as acute pancreatitis, Crohn’s disease flare-ups, ischemic colitis, and severe Clostridioides difficile colitis demand swift, guideline-adherent decision-making, often in high-pressure clinical environments. Large language models (LLMs) like ChatGPT and DeepSeek have shown promise in synthesizing medical knowledge but struggle with complex, multimorbid scenarios. This study introduces a collaborative, iterative framework where 2 LLMs refine each other’s clinical management plans to enhance decision quality and guideline adherence.

Methods: Four representative, high-acuity gastroenterological cases were developed to simulate real-world complexity. Each case was input independently into ChatGPT (GPT-4) and DeepSeek, producing initial management plans. These were exchanged between models for critique, improvement, and consensus development over up to 5 iterative rounds. Outcome metrics included the Guideline Adherence Score (GAS; 0–5 scale), assessed by expert gastroenterologists and fellows, and inter-model agreement measured via Cohen’s kappa. Thematic analysis using NVivo 12 identified recurring patterns in the refinements.

Results: Iterative critique significantly improved guideline adherence across all cases, with mean GAS scores increasing from 3.0 ± 0.8 to 5.0 ± 0.0 (P < 0.001). Initial discrepancies in antibiotic selection, imaging timing, and procedural decisions were resolved through consensus, achieving perfect inter-model agreement post-refinement (κ = 1.0). Thematic analysis highlighted improvements in 4 clinical domains: resuscitation protocols, infection control, procedural decision-making, and long-term management. Notably, AI-generated plans incorporated guideline-consistent strategies such as delayed empiric antibiotic use in pancreatitis, stress-dose steroids for adrenal crisis in Crohn’s flare, and appropriate anticoagulation reversal in ischemic colitis.

Conclusion: Collaborative LLM refinement offers a scalable, effective strategy for improving clinical decision-making in complex gastroenterological emergencies. By emulating peer review and multidisciplinary deliberation, this framework enhances accuracy, consistency, and adherence to clinical guidelines. Integration into real-time clinical workflows and further validation in prospective clinical settings could pave the way for AI-augmented precision care.
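
The abstract reports inter-model agreement as Cohen’s kappa but does not say which items were compared. As a minimal sketch, assume each model's plan is reduced to one categorical decision per item (for example, antibiotic choice or imaging timing); the function name `cohens_kappa` and the decision labels below are hypothetical illustrations, not data or code from the study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items where both raters chose the same label.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, computed from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        # Both raters used one identical label throughout, so kappa is 0/0.
        # This sketch returns 1.0 in that case (an assumption, not a standard).
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical per-item decisions before and after refinement (not study data).
initial_a = ["antibiotics", "ct_now", "surgery", "no_antibiotics"]
initial_b = ["no_antibiotics", "ct_now", "surgery", "antibiotics"]
final_a = ["no_antibiotics", "ct_now", "surgery", "antibiotics"]
final_b = list(final_a)  # full consensus after refinement

kappa_initial = cohens_kappa(initial_a, initial_b)
kappa_final = cohens_kappa(final_a, final_b)
print(f"initial kappa = {kappa_initial:.2f}, final kappa = {kappa_final:.2f}")
```

In this toy input the models agree on 2 of 4 items at first, and chance agreement is 0.25, so kappa rises from about 0.33 to 1.0 once the decisions match on every item, which is the situation the reported κ = 1.0 describes.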

Topics

Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare · Clinical Reasoning and Diagnostic Skills