This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Assessing the performance of ChatGPT in psychiatry: A study using clinical cases from foreign medical graduate examination (FMGE)
2
Citations
2
Authors
2024
Year
Abstract
Dear Editor, ChatGPT, developed by OpenAI, is an advanced language model designed for natural language understanding and generation.[1] ChatGPT’s role in healthcare involves acting as a conversational agent that can provide information, answer queries, and even assist in preliminary diagnostics. Its ability to process and generate human-like text makes it a valuable tool in the healthcare ecosystem, particularly in fields where effective communication is crucial.[2] In psychiatry, personalized care is essential for understanding the unique needs and conditions of individual patients.[3] ChatGPT can contribute to personalized care by engaging in conversations that help gather patient information, assess mental health symptoms, and provide relevant information about various psychiatric conditions.[4] This study aims to evaluate the performance of ChatGPT in psychiatry using clinical cases sourced from the Foreign Medical Graduate Examination (FMGE). ChatGPT 3.5, a freely accessible, pre-trained AI model, was used for this evaluation. Ten clinical cases were selected from a freely available online source containing previous years’ FMGE questions.[5] All of the cases were multiple-choice questions (MCQs) with four options each, and the psychiatric conditions covered a range of disorders to ensure a comprehensive evaluation. Each clinical case, representing a spectrum of mental health disorders such as bulimia nervosa, bipolar disorder, delusions, attention deficit hyperactivity disorder, Fregoli delusion, Othello syndrome, schizophrenia, obsessive-compulsive disorder, and depression, was presented to ChatGPT as an MCQ. The responses generated by the model were cross-checked against the correct diagnoses provided in the FMGE online resource, and each response was generated twice to confirm ChatGPT’s answer.
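The cross-checking and scoring procedure described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the case identifiers and answer letters below are placeholders, not the actual FMGE items, and the rule that a case counts as correct only when both generations agree with the key is one plausible reading of the double-generation confirmation step.

```python
def score_responses(answer_key, run1, run2):
    """Cross-check two ChatGPT runs per case against the answer key.

    A case is scored correct only if both runs agree with the key,
    mirroring the study's generate-twice confirmation step.
    """
    results = {}
    for case_id, correct in answer_key.items():
        results[case_id] = (run1[case_id] == run2[case_id] == correct)
    accuracy = sum(results.values()) / len(results) * 100
    return results, accuracy


# Illustrative data: 10 cases, with the model missing case 9,
# which reproduces the 90% overall score reported in the study.
key = {i: "A" for i in range(1, 11)}
r1 = dict(key); r1[9] = "B"
r2 = dict(key); r2[9] = "B"
results, accuracy = score_responses(key, r1, r2)
print(accuracy)  # 90.0
```

Requiring agreement across both runs is a conservative choice; a run that flip-flops between generations would be scored incorrect rather than resolved by a tie-break.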
ChatGPT achieved an overall score of 90% on the evaluation [Supplementary Materials 1-10]. In the majority of the cases (Cases 1–8 and Case 10), ChatGPT provided accurate diagnoses, aligning with the correct answers in the FMGE key. This suggests that ChatGPT can effectively analyze and comprehend varied psychiatric scenarios, provide correct assessments, and serve as a tool for preliminary diagnostics in psychiatry. In Case 9, ChatGPT diagnosed the psychiatric condition as “delusion of reference” instead of the correct answer, “delusion of control.” The findings of this study hold significant implications for the integration of ChatGPT and similar language models into psychiatric diagnostics. While most cases demonstrated ChatGPT’s competence in providing accurate diagnoses, the misclassification in Case 9 underscores the need for language models to develop a more nuanced understanding of psychiatric disorders, especially those with subtle distinctions. Table 1 shows the evaluation report of ChatGPT responses.

Table 1: Evaluation report

This study emphasizes the necessity of continuously refining language models, incorporating feedback from psychiatric professionals, and integrating evolving knowledge in the field. Regular updates can address identified limitations, enhancing the model’s accuracy and reliability over time. ChatGPT’s demonstrated accuracy in the majority of cases suggests its potential as a valuable tool for preliminary diagnostics in psychiatry. To mitigate bias, all selected MCQs were sourced exclusively from the National Board of Examinations FMGE test series, validated for diverse case scenarios and questioning patterns. Additionally, all case-based MCQs available in the test series were included in this evaluation. Future assessments will broaden the scope by incorporating MCQs from all psychiatry-related test series asked in the FMGE.
Future directions could explore the integration of language models like ChatGPT into clinical workflows, serving as an aid to healthcare professionals in decision-making. As language models become more integrated into healthcare, ethical considerations surrounding patient privacy and data security must be prioritized. To enhance ChatGPT, continued training on diverse datasets and rigorous testing with user feedback are crucial. Integration into clinical practice should involve ongoing collaboration with healthcare professionals, adherence to ethical guidelines, and attention to privacy concerns. Regular updates and refinements based on real-world usage can optimize its utility in supporting clinical tasks. Collaboration among computer scientists, mental health professionals, and policymakers is essential; a multidisciplinary approach will contribute to the development of guidelines and standards for the ethical and effective use of language models in mental healthcare.

Financial support and sponsorship: Nil.

Conflicts of interest: There are no conflicts of interest.
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,460 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,341 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,791 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,536 citations