
This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Performance Evaluation of Popular Open-Source Large Language Models in Healthcare

2025 · 2 citations · Studies in Health Technology and Informatics · Open Access

Authors: 4

Abstract

This paper evaluated user preferences and performance metrics for two widely used open-source large language models (LLMs), Llama 3.1 8B and Mistral 3 Small 24B (AWQ), compared to the proprietary model GPT-4o, in the context of serving as a user-oriented healthcare assistant. The study highlighted the advantages of open-source LLMs, including transparency, cost-effectiveness, and customization potential for specific applications. A dual approach was used: first, ten participants ranked model-generated responses to various healthcare questions; second, computational performance metrics like response time, throughput, and time-to-first-token were benchmarked under different user loads. Results indicated that the majority of participants preferred GPT-4o responses; however, both open-source LLMs had relatively similar ratings. Furthermore, the benchmarking results underscored the efficiency and reliability of the models under load, showcasing their capabilities for real-world applications. This research contributed to understanding how open-source LLMs could meet the needs of diverse users across different domains, encouraging further exploration and adoption in various industries.
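
The three computational metrics named in the abstract can be illustrated with a minimal sketch. The definitions below are common conventions and are assumptions here, since the abstract does not say how the paper measured them; the `RequestTrace` type, the function names, and the timestamps are hypothetical and invented for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestTrace:
    """Timestamps (in seconds) recorded for one streamed response (hypothetical type)."""
    sent_at: float            # when the request was sent
    token_times: List[float]  # arrival time of each generated token


def time_to_first_token(trace: RequestTrace) -> float:
    """Delay between sending the request and receiving the first token."""
    return trace.token_times[0] - trace.sent_at


def response_time(trace: RequestTrace) -> float:
    """Delay between sending the request and receiving the last token."""
    return trace.token_times[-1] - trace.sent_at


def throughput(traces: List[RequestTrace]) -> float:
    """Total tokens per second over the wall-clock window covering all traces."""
    start = min(t.sent_at for t in traces)
    end = max(t.token_times[-1] for t in traces)
    total_tokens = sum(len(t.token_times) for t in traces)
    return total_tokens / (end - start)


# Made-up traces for two concurrent users; these are not measurements from the paper.
traces = [
    RequestTrace(sent_at=0.0, token_times=[0.5, 1.0, 1.5, 2.0]),
    RequestTrace(sent_at=0.0, token_times=[1.0, 2.0, 3.0, 4.0]),
]

for i, t in enumerate(traces):
    print(f"user {i}: TTFT={time_to_first_token(t):.2f}s, total={response_time(t):.2f}s")
print(f"aggregate throughput: {throughput(traces):.2f} tokens/s")
```

Repeating the same calculation with more simultaneous traces is one way to see how these values shift as user load grows.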
