This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Comparative performance of LLMs and machine learning in predicting complications after percutaneous kyphoplasty for osteoporotic vertebral compression fractures

2026 · 0 citations · npj Digital Medicine · Open Access

Abstract

Exploring the performance of large language models (LLMs) in a specific medical domain can help clarify their generalizability to real-world applications. We assessed the predictive and decision-support value of two state-of-the-art LLMs in predicting bone cement leakage (BCL) and new vertebral fractures (NVF) after percutaneous kyphoplasty (PKP), and compared them with traditional machine learning (TML) models and spine surgeons. This study used combined retrospective and prospective data from a single tertiary hospital. Two LLMs (GPT-5 and DeepSeek R1) with zero- and few-shot strategies, five TML models, and two spine surgeons with and without exposure to LLM responses were asked to predict complications based on demographic, perioperative baseline, and radiographic data. We also tested the LLMs' ability to predict complication subtypes. For BCL prediction, both LLMs demonstrated acceptable performance under zero-shot conditions (F1-score, 0.857-0.871; MCC, 0.164-0.332), comparable to TML models (F1-score, 0.758-0.867; MCC, 0.265-0.416) and slightly superior to surgeons alone (F1-score, 0.675-0.684; MCC, 0.074-0.185). Few-shot prompting enhanced specificity but yielded uncertain overall gains. For NVF prediction, zero-shot LLM performance was poor (F1-score, 0.309; MCC, 0.044) but improved with few-shot learning. The RBF-SVM model showed the best performance for NVF prediction (F1-score, 0.536; MCC, 0.414). LLM explanations enhanced surgeon performance in BCL prediction but not in NVF prediction. LLMs predicted complication subtypes poorly. The findings suggest that the predictive performance of current LLMs varies across complications after PKP; they remain immature for real clinical application and need further improvement.
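The abstract reports both F1-score and Matthews correlation coefficient (MCC), and the two can diverge sharply when one class dominates. The following is a minimal sketch of the standard definitions of both metrics, computed from confusion-matrix counts. The function names and the example counts are illustrative assumptions and are not taken from the study.

```python
import math


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); ignores true negatives entirely."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp: int, fp: int, fn: int, tn: int) -> float:
    """Matthews correlation coefficient; uses all four confusion-matrix cells."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts (NOT from the paper): a predictor that labels almost
# every case positive on a dataset where positives are the majority.
tp, fp, fn, tn = 85, 18, 5, 2

print(f"F1  = {f1_score(tp, fp, fn):.3f}")
print(f"MCC = {mcc(tp, fp, fn, tn):.3f}")
```

With counts like these, F1 is high because most positives are recovered, while MCC stays close to zero because the negatives are almost never identified. This is one way a high F1-score can coexist with a low MCC.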

Topics

Medical Imaging and Analysis · Artificial Intelligence in Healthcare and Education · Hip and Femur Fractures
