This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Capability of GPT-4V(ision) in the Japanese National Medical Licensing Examination: Evaluation Study (Preprint)
Citations: 0
Authors: 8
Year: 2023
Abstract
Background: Previous research applying large language models (LLMs) to medicine focused on text-based information. Recently, multimodal variants of LLMs have acquired the capability to recognize images.
Objective: We aim to evaluate the image recognition capability of generative pretrained transformer (GPT)-4V, a recent multimodal LLM developed by OpenAI, in the medical field by testing how visual information affects its performance in answering questions from the 117th Japanese National Medical Licensing Examination.
Methods: We focused on 108 questions that had 1 or more images as part of the question and presented GPT-4V with the same questions under 2 conditions: (1) with both the question text and the associated images and (2) with the question text only. We then compared the accuracy between the 2 conditions using the exact McNemar test.
Results: Among the 108 questions with images, GPT-4V's accuracy was 68% (73/108) when presented with images and 72% (78/108) when presented without images (P=.36). For the 2 question categories, clinical and general, the accuracies with versus without images were 71% (70/98) versus 78% (76/98; P=.21) and 30% (3/10) versus 20% (2/10; P≥.99), respectively.
Conclusions: The additional information from the images did not significantly improve the performance of GPT-4V in the Japanese National Medical Licensing Examination.
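The paired comparison described in the methods can be illustrated with a minimal sketch of the exact McNemar test. The abstract reports only the marginal accuracies, not the per-question outcomes, so the function names and the toy data below are hypothetical and do not reproduce the study's results. The test depends only on the discordant pairs: questions answered correctly in exactly one of the 2 conditions.

```python
from math import comb


def discordant_counts(with_images, text_only):
    """Count questions answered correctly in exactly one condition.

    Both arguments are equal-length sequences of booleans, one entry per
    question (True = answered correctly). Returns (b, c) where
    b = correct only with images, c = correct only with text.
    """
    if len(with_images) != len(text_only):
        raise ValueError("both conditions must cover the same questions")
    b = sum(1 for w, t in zip(with_images, text_only) if w and not t)
    c = sum(1 for w, t in zip(with_images, text_only) if t and not w)
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact McNemar P value from the discordant counts.

    Under the null hypothesis each discordant pair is equally likely to
    fall in either direction, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical toy data for 8 questions (NOT the study's per-question results).
toy_with_images = [True, True, False, False, True, False, True, True]
toy_text_only = [True, False, True, True, True, True, True, False]

toy_b, toy_c = discordant_counts(toy_with_images, toy_text_only)
toy_p = mcnemar_exact(toy_b, toy_c)
```

Because concordant questions (both correct or both incorrect) do not enter the statistic, two conditions can have similar overall accuracies and still differ on many individual questions; the test only asks whether those disagreements lean in one direction.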