Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.

Assessing the Accuracy and Reliability of Large Language Models in Psychiatry Using Standardized Multiple-Choice Questions: Cross-Sectional Study

2025·7 Zitationen·Journal of Medical Internet ResearchOpen Access

Volltext beim Verlag öffnen

Zitationen

Autoren

2025

Jahr

Abstract

To our knowledge, this is the first comprehensive evaluation of the general psychiatric knowledge encoded in commercially available LLMs and the first study to assess their reliability and identify predictors of response accuracy within medical domains. The findings suggest that GPT-4 and GPT-4o encode accurate and reliable general psychiatric knowledge and that methods, such as repeated prompting, may provide a measure of LLM response confidence. This work supports the potential of LLMs in mental health settings and motivates further research to assess their performance in more open-ended clinical contexts.

Autoren

Institutionen

University of California, San Francisco(US)

Themen

Artificial Intelligence in Healthcare and EducationTopic ModelingMachine Learning in Healthcare

Volltext beim Verlag öffnen

Assessing the Accuracy and Reliability of Large Language Models in Psychiatry Using Standardized Multiple-Choice Questions: Cross-Sectional Study

Abstract

Ähnliche Arbeiten

Autoren

Institutionen

Themen