This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Development and multi-database validation of interpretable machine learning models for predicting In-Hospital mortality in pneumonia patients: A comprehensive analysis across four healthcare systems
Citations: 7
Authors: 3
Year: 2025
Abstract
BACKGROUND: Existing machine learning studies of pneumonia mortality prediction are limited by small sample sizes, single-center designs, and a lack of comprehensive external validation across diverse healthcare systems. No previous study has systematically validated machine learning models for pneumonia mortality prediction across multiple large-scale databases.

METHODS: This retrospective multicenter study used four large-scale databases to develop and validate machine learning models for predicting in-hospital mortality in pneumonia patients. MIMIC-IV served as the primary training dataset (9,410 patients), with external validation on MIMIC-III (2,487 patients), eICU (13,541 patients), and an in-house multicenter prospective cohort from Fudan University (345 patients). Five algorithms were implemented: Random Forest, XGBoost, Logistic Regression, LASSO, and Support Vector Machine. Feature selection used the Boruta algorithm across 21 variables. Model interpretability was assessed using SHAP analysis.

RESULTS: The cohort comprised 25,783 pneumonia patients, with mortality rates of 17.1% to 38.3% across databases. Nine consistently important features were identified: age, diastolic blood pressure, heart rate, temperature, respiratory rate, creatinine, blood urea nitrogen, platelet count, and white blood cell count. XGBoost achieved the best performance, with a training AUC of 0.747 (95% CI: 0.733-0.761) and robust external validation AUCs of 0.672 (MIMIC-IV testing), 0.670 (MIMIC-III), 0.695 (eICU), and 0.653 (FAHZU). SHAP analysis revealed platelet count as the most influential predictor, followed by blood urea nitrogen and age.

CONCLUSIONS: This study represents the first comprehensive multi-database validation of machine learning models for pneumonia mortality prediction, demonstrating superior performance compared to traditional scoring systems. The XGBoost model with SHAP interpretability provides a robust tool for clinical decision support, with consistent validation across four databases, including our in-house prospective cohort.
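The abstract reports AUC values with a 95% confidence interval but does not say how the interval was obtained. As a minimal sketch, assuming a percentile bootstrap (a common choice, not necessarily the authors' method), the following shows how such an interval could be computed from predicted risks and observed outcomes. The function names `auc` and `bootstrap_auc_ci` and all parameters are hypothetical and chosen for illustration only; no data from the study is used.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC.

    Assumption: patients are resampled with replacement; resamples that
    contain only one outcome class are discarded and redrawn.
    """
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined without both classes
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper
```

With `labels` as 0/1 in-hospital mortality outcomes and `scores` as model-predicted risks, `auc(labels, scores)` gives the point estimate and `bootstrap_auc_ci(labels, scores)` gives the interval bounds. The pairwise AUC computation is quadratic in cohort size, so for cohorts of the size described here a rank-based implementation from an established library would be the more practical choice.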
Related works
The Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3)
2016 · 27,540 citations
pROC: an open-source package for R and S+ to analyze and compare ROC curves
2011 · 13,846 citations
APACHE II
1985 · 13,637 citations
Definitions for Sepsis and Organ Failure and Guidelines for the Use of Innovative Therapies in Sepsis
1992 · 13,190 citations
The SOFA (Sepsis-related Organ Failure Assessment) score to describe organ dysfunction/failure
1996 · 11,540 citations