This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Artificial intelligence for procedural coding in cardiac critical care: Evaluating large language models for current procedural terminology accuracy
Citations: 0
Authors: 7
Year: 2025
Abstract
Background: Accurate procedural coding is essential for resource allocation, billing integrity, and quality reporting in critical care. Current Procedural Terminology (CPT) coding is largely manual and error-prone, especially in high-acuity environments such as the cardiovascular surgical intensive care unit (CVSICU), where complex procedures like extracorporeal membrane oxygenation (ECMO) are common. Large language models (LLMs) may offer scalable solutions for automated coding, but their performance in the CVSICU has not been systematically evaluated.
Methods: Six publicly accessible LLMs (GPT-4, Claude 3.7 Sonnet, Perplexity, DeepSeek, Google Gemini 2.5 Pro, Mistral) were tested on CPT code assignment for 47 CVSICU procedures, including 7 ECMO-related interventions, from a single tertiary center (July 2023 to May 2025). Models received prompts in a standardized format and were evaluated on code accuracy. Statistical comparisons were conducted to assess inter-model performance differences for ECMO-related and non-ECMO procedures.
Results: For non-ECMO procedures, Gemini 2.5 Pro and Perplexity achieved the highest accuracy (88%), followed by DeepSeek (78%), Claude 3.7 Sonnet (75%), Mistral (68%), and GPT-4 (56%). For ECMO-related codes, Perplexity outperformed all other models (86%), followed by Gemini 2.5 Pro (71%), Mistral (43%), DeepSeek (29%), Claude 3.7 Sonnet (14%), and GPT-4 (0%). Pairwise comparisons revealed statistically significant inter-model differences.
Conclusions: While LLMs such as Perplexity and Gemini show promise for automated coding, their limited grasp of context-dependent nuances, particularly those of ECMO, remains a key barrier. Future work should focus on domain-specific fine-tuning to capture procedural context before such models are deployed in high-acuity clinical settings.
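The abstract does not specify which statistical test underlies the pairwise comparisons. For small counts such as the 7 ECMO-related procedures, Fisher's exact test is a common choice; the sketch below illustrates one such comparison under that assumption, using illustrative counts (6/7 vs 0/7 correct) that roughly match the reported 86% and 0% ECMO accuracies of Perplexity and GPT-4. The function name and figures are for illustration only and do not come from the paper.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for a 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    r1, r2 = a + b, c + d          # row totals (correct + incorrect per model)
    c1 = a + c                     # column total of correct codes
    n = r1 + r2
    denom = comb(n, c1)
    # Compare integer numerators to avoid floating-point ties.
    obs = comb(r1, a) * comb(r2, c1 - a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    total = sum(
        comb(r1, k) * comb(r2, c1 - k)
        for k in range(lo, hi + 1)
        if comb(r1, k) * comb(r2, c1 - k) <= obs
    )
    return total / denom

# Illustrative ECMO comparison: model A correct on 6/7, model B on 0/7.
p = fisher_exact_two_sided(6, 1, 0, 7)
print(f"p = {p:.4f}")
```

With these illustrative counts the p-value falls below 0.05, consistent with the abstract's report of statistically significant inter-model differences even on only 7 ECMO procedures.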
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,490 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,376 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,832 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,553 citations