This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Large Language Models and the North American Pharmacist Licensure Examination (NAPLEX) Practice Questions
Citations: 9
Authors: 4
Year: 2024
Abstract
OBJECTIVE: This study aims to test the accuracy of large language models (LLMs) in answering standardized pharmacy examination practice questions. METHODS: [...] tests to compare model and question-type accuracy. RESULTS: Of the 3 LLMs tested, GPT-4 achieved the highest accuracy, with 87% accuracy on the McGraw Hill question set and 83.5% accuracy on the RxPrep question set. In comparison, GPT-3.5 had 68.0% and 60.0% accuracy on those question sets, respectively, and Chatsonic had 60.5% and 62.5%, respectively. All models performed worse on select-all questions than on non-select-all questions (GPT-3.5: 42.3% vs 66.2%; GPT-4: 73.1% vs 87.2%; Chatsonic: 36.5% vs 71.6%). GPT-4 had statistically significantly higher accuracy on adverse drug reaction (ADR) questions (96.1%) than on non-ADR questions (83.9%). CONCLUSION: Our study found that GPT-4 outperformed GPT-3.5 and Chatsonic in answering North American Pharmacist Licensure Examination (NAPLEX) practice questions, particularly excelling in answering questions related to ADRs. These results suggest that advanced LLMs such as GPT-4 could be used for applications in pharmacy education.
Similar Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,719 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,628 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,176 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,880 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations