This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Performance Evaluation of Popular Open-Source Large Language Models in Healthcare

2025 · 2 citations · Studies in Health Technology and Informatics · Open Access

Abstract

This paper evaluated user preferences and performance metrics for two widely used open-source large language models (LLMs), Llama 3.1 8B and Mistral 3 Small 24B (AWQ), compared to the proprietary model GPT-4o, in the context of serving as a user-oriented healthcare assistant. The study highlighted the advantages of open-source LLMs, including transparency, cost-effectiveness, and customization potential for specific applications. A dual approach was used: first, ten participants ranked model-generated responses to various healthcare questions; second, computational performance metrics like response time, throughput, and time-to-first-token were benchmarked under different user loads. Results indicated that the majority of participants preferred GPT-4o responses; however, both open-source LLMs had relatively similar ratings. Furthermore, the benchmarking results underscored the efficiency and reliability of the models under load, showcasing their capabilities for real-world applications. This research contributed to understanding how open-source LLMs could meet the needs of diverse users across different domains, encouraging further exploration and adoption in various industries.

Institutions

University of North Carolina at Chapel Hill (US)

Topics

Artificial Intelligence in Healthcare and Education
