This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Harnessing Transformer Models for Cardiovascular Disease Prediction: A Comparison with Conventional Methods
Citations: 1
Authors: 2
Year: 2025
Abstract
Objective: Cardiovascular diseases (CVDs) remain the leading cause of death worldwide, creating an urgent need for accurate risk prediction. Machine learning (ML) methods are well established, but transformer-based deep learning architectures are emerging as promising alternatives. Their comparative value, particularly under challenges such as class imbalance, is still unclear.
Methods: We systematically compared transformer models (FT-Transformer, SAINT, TabNet, TabTransformer) with conventional ML algorithms (including support vector machine, random forest, and XGBoost) on three public CVD datasets of increasing size and complexity: the balanced UCI dataset, the imbalanced Framingham dataset, and the large-scale Kaggle dataset. A consistent preprocessing pipeline was applied, with MICE imputation for missing data and, for the Framingham dataset, SMOTETomek resampling to address class imbalance. Models were assessed with stratified 10-fold cross-validation, and their performance was statistically compared across datasets. Explainability was explored using SHAP feature importance.
Results: Performance varied with dataset characteristics. On the small, balanced UCI dataset, FT-Transformer achieved near-perfect performance (AUC > 0.99), comparable to XGBoost and random forest. On the imbalanced Framingham dataset, sensitivity remained low for all models, though FT-Transformer achieved the best trade-off. On the Kaggle dataset, FT-Transformer and XGBoost performed similarly, and both identified systolic blood pressure and age as major predictors.
Conclusion: Transformer models show strong potential for structured health data but remain sensitive to class imbalance, where conventional ML retains advantages. Careful, dataset-aware model selection is essential for CVD prediction.
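The abstract relies on two ideas that a small example may make more concrete: stratified k-fold cross-validation and sensitivity under class imbalance. The following is a minimal standard-library sketch, not the authors' code. The function names `stratified_folds` and `sensitivity`, the toy label vector, and the 15% positive rate are all hypothetical choices made for illustration; the study itself most likely used library implementations that the abstract does not name.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds so each fold keeps a similar class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal this class's indices round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def sensitivity(y_true, y_pred, positive=1):
    """Share of actual positives that were predicted positive (recall)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else float("nan")


# Hypothetical toy labels: 150 positives and 850 negatives (assumed, not study data).
labels = [1] * 150 + [0] * 850
folds = stratified_folds(labels, k=10)
positives_per_fold = [sum(labels[i] for i in fold) for fold in folds]

# A trivial classifier that always predicts the majority (negative) class.
majority_pred = [0] * len(labels)
accuracy = sum(t == p for t, p in zip(labels, majority_pred)) / len(labels)
majority_sensitivity = sensitivity(labels, majority_pred)
```

In this toy setting every fold receives the same number of positives, and the majority-class predictor reaches high accuracy while detecting no positive cases. That gap is one reason sensitivity, rather than accuracy alone, is worth reporting on an imbalanced dataset.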