This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Dissecting HealthBench: Disease Spectrum, Clinical Diversity, and Data Insights from Multi-Turn Clinical AI Evaluation Benchmark

2025 · 2 citations · Journal of Medical Systems · Open Access

Abstract

HealthBench is an open-source, large-scale benchmark consisting of 5,000 multi-turn clinical conversations evaluated against 48,562 criteria developed by clinicians. Recognized as a significant advancement in assessing realistic artificial intelligence (AI) models, HealthBench deserves further exploration. In this article, we systematically analyze the benchmark's disease spectrum, diagnostic and therapeutic focuses, and demographic diversity. We evaluate its representativeness and strengths, as well as the essential limitations that AI researchers and clinicians should consider when using it for realistic model evaluations.

Topics

Machine Learning in Healthcare
Health Systems, Economic Evaluations, Quality of Life
Artificial Intelligence in Healthcare and Education
