This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Development and multi-database validation of interpretable machine learning models for predicting In-Hospital mortality in pneumonia patients: A comprehensive analysis across four healthcare systems

2025 · 7 citations · Respiratory Research · Open Access

Abstract

BACKGROUND: Existing machine learning studies for pneumonia mortality prediction are limited by small sample sizes, single-center designs, and lack of comprehensive external validation across diverse healthcare systems. No previous study has systematically validated machine learning models across multiple large-scale databases for pneumonia mortality prediction.

METHODS: This retrospective multicenter study utilized four large-scale databases to develop and validate machine learning models for predicting in-hospital mortality in pneumonia patients. MIMIC-IV served as the primary training dataset (9,410 patients), with external validation on MIMIC-III (2,487 patients), eICU (13,541 patients), and an in-house multicenter prospective cohort from Fudan University (345 patients). Five algorithms were implemented: Random Forest, XGBoost, Logistic Regression, LASSO, and Support Vector Machine. Feature selection used the Boruta algorithm across 21 variables. Model interpretability was assessed using SHAP analysis.

RESULTS: The cohort comprised 25,783 pneumonia patients with mortality rates of 17.1%-38.3% across databases. Nine consistently important features were identified: age, diastolic blood pressure, heart rate, temperature, respiratory rate, creatinine, blood urea nitrogen, platelet count, and white blood cell count. XGBoost achieved optimal performance with training AUC 0.747 (95% CI: 0.733-0.761) and robust external validation AUCs of 0.672 (MIMIC-IV testing), 0.670 (MIMIC-III), 0.695 (eICU), and 0.653 (FAHZU). SHAP analysis revealed platelet count as the most influential predictor, followed by blood urea nitrogen and age.

CONCLUSIONS: This study represents the first comprehensive multi-database validation of machine learning models for pneumonia mortality prediction, demonstrating superior performance compared to traditional scoring systems. The XGBoost model with SHAP interpretability provides a robust tool for clinical decision support, with consistent validation across four databases including our in-house prospective cohort.
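
The abstract reports AUCs with 95% confidence intervals but does not say how the intervals were obtained. A percentile bootstrap is one common choice; the sketch below illustrates it on synthetic labels and scores. The function names `auc` and `bootstrap_auc_ci`, the sample size, and the score distributions are all assumptions made for illustration and are not taken from the paper.

```python
import random

def auc(labels, scores):
    """Rank-based AUC: chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Synthetic data only: roughly 20% positives, with somewhat higher scores.
rng = random.Random(42)
labels = [1 if rng.random() < 0.2 else 0 for _ in range(200)]
scores = [rng.gauss(0.6 if y else 0.4, 0.2) for y in labels]

point = auc(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
print(f"AUC {point:.3f} (95% CI: {ci_low:.3f}-{ci_high:.3f})")
```

The pairwise AUC here is quadratic in the sample size, which is fine for a small illustration; for cohorts of the size reported above, a rank-sum formulation or a library routine would be the more practical choice.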

Topics

Sepsis Diagnosis and Treatment; Machine Learning in Healthcare; Artificial Intelligence in Healthcare and Education
