Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
Integrating Agentic Artificial Intelligence to Automate International Classification of Diseases, Tenth Revision, Medical Coding
0
Zitationen
6
Autoren
2026
Jahr
Abstract
Automating ICD-10 coding from discharge summaries remains demanding because coders analyze clinical narratives while justifying decisions. This study compares three automation patterns: PLM-ICD as a standalone deep learning system emitting 15 codes per case, LLM-only generation with full autonomy, and a hybrid approach where PLM-ICD drafts candidates for an agentic LLM audit to accept or reject. All strategies were evaluated on 19,801 MIMIC-IV summaries using four LLMs spanning compact (Qwen2.5-3B-Instruct, Llama-3.2-3B-Instruct, Phi-4-mini-instruct) to large-scale (Sonnet-4.5). Precision guided evaluation because coders still supply any missing diagnoses. PLM-ICD alone reached 55.8% precision while always surfacing 15 suggestions. LLM-only generation lagged severely (1.5–34.6% precision) and produced inconsistent output sizes. The agentic audit delivered the best trade-off: compact LLMs reviewed the 15 candidates, discarded weak evidence, and returned 2–8 high-confidence codes. Llama-3.2-3B-Instruct, for example, improved from 1.5% as a generator to 55.1% as a verifier while trimming false positives by 73%. These results show that positioning LLMs as quality controllers, rather than primary generators, yields reliable support for clinical coding teams, while formal recall/F1 reporting remains future work for fully autonomous implementations.
Ähnliche Arbeiten
The meaning and use of the area under a receiver operating characteristic (ROC) curve.
1982 · 21.529 Zit.
Adapting a clinical comorbidity index for use with ICD-9-CM administrative databases
1992 · 10.485 Zit.
Coding Algorithms for Defining Comorbidities in ICD-9-CM and ICD-10 Administrative Data
2005 · 10.467 Zit.
Comorbidity Measures for Use with Administrative Data
1998 · 9.790 Zit.
Evaluating the added predictive ability of a new marker: From area under the ROC curve to reclassification and beyond
2007 · 6.236 Zit.