OpenAlex · Aktualisierung stündlich · Letzte Aktualisierung: 25.05.2026, 10:32

Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.

Evaluation of ophthalmic large language models: quantitative vs. qualitative methods

2025·0 Zitationen·Current Opinion in Ophthalmology
Volltext beim Verlag öffnen

0

Zitationen

4

Autoren

2025

Jahr

Abstract

PURPOSE OF REVIEW: Alongside the development of large language models (LLMs) and generative artificial intelligence (AI) applications across a diverse range of clinical applications in Ophthalmology, this review highlights the importance of evaluation of LLM applications by discussing evaluation metrics commonly adopted. RECENT FINDINGS: Generative AI applications have demonstrated encouraging performance in clinical applications of Ophthalmology. Beyond accuracy, evaluation in the form of quantitative and qualitative metrics facilitate a more nuanced assessment of LLM output responses. Several challenges limit evaluation including the lack of consensus on standardized benchmarks, and limited availability of robust and curated clinical datasets. SUMMARY: This review outlines the spectrum of quantitative and qualitative evaluation metrics adopted in existing studies, highlights key challenges in LLM evaluation, to catalyze further work towards standardized and domain-specific evaluation. Robust evaluation to effectively validate clinical LLM applications is crucial in closing the gap towards clinical integration.

Ähnliche Arbeiten