This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

ClinicRealm: Re-evaluating large language models with conventional machine learning for non-generative clinical prediction tasks

2026 · 0 citations · npj Digital Medicine · Open Access

Abstract

Large Language Models (LLMs) are increasingly deployed in medicine. However, their utility for non-generative clinical prediction is under-evaluated, and they are often assumed to be inferior to specialized models, creating potential for misuse and misunderstanding. To address this, our ClinicRealm benchmark systematically evaluates 15 GPT-style LLMs, 5 BERT-style models, and 11 traditional methods on unstructured clinical notes and structured Electronic Health Records (EHR) across predictive performance, reasoning, fairness, etc. Our findings reveal a significant shift: on clinical notes, leading zero-shot LLMs (e.g., DeepSeek-V3.1-Think, GPT-5) now decisively outperform finetuned BERT models. On structured EHRs, while specialized models excel with ample data, advanced LLMs demonstrate potent zero-shot capabilities, often surpassing conventional models in data-scarce settings. Notably, leading open-source LLMs match or exceed their proprietary counterparts. This provides compelling evidence that modern LLMs are competitive tools for clinical prediction, necessitating a re-evaluation of model selection strategies by health data scientists and developers.

Topics

Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI)
