Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.

Step into the era of large multimodal models: a pilot study on ChatGPT-4V(ision)’s ability to interpret radiological images

2024·41 Zitationen·International Journal of SurgeryOpen Access

Volltext beim Verlag öffnen

Zitationen

Autoren

2024

Jahr

Abstract

BACKGROUND: The introduction of ChatGPT-4V's 'Chat with images' feature represents the beginning of the era of large multimodal models (LMMs), which allows ChatGPT to process and answer questions based on uploaded images. This advancement has the potential to transform how surgical teams utilize radiographic data, as radiological interpretation is crucial for surgical planning and postoperative care. However, a comprehensive evaluation of ChatGPT-4V's capabilities in interpret radiological images and formulating treatment plans remains to be explored. PATIENTS AND METHODS: Three types of questions were collected: (1) 87 USMLE-style questions, submitting only the question stems and images without providing options to assess ChatGPT's diagnostic capability. For questions involving treatment plan formulations, a five-point Likert scale was used to assess ChatGPT's proposed treatment plan. The 87 questions were then adapted by removing detailed patient history to assess its contribution to diagnosis. The diagnostic performance of ChatGPT-4V was also tested when only medical history was provided. (2) We randomly selected 100 chest radiography from the ChestX-ray8 database to test the ability of ChatGPT-4V to identify abnormal chest radiography. (3) Cases from the 'Diagnose Please' section in the Radiology journal were collected to evaluate the performance of ChatGPT-4V in diagnosing complex cases. Three responses were collected for each question. RESULTS: ChatGPT-4V achieved a diagnostic accuracy of 77.01% for USMLE-style questions. The average score of ChatGPT-4V's treatment plans was 3.97 (Interquartile Range: 3.33-4.67). Removing detailed patient history dropped the diagnostic accuracy to 19.54% (P<0.0001). ChatGPT-4V achieved an AUC of 0.768 (95% CI: 0.684-0.851) in detecting abnormalities in chest radiography, but could not specify the exact disease due to the lack of detailed patient history. For cases from 'Diagnose Please' ChatGPT provided diagnoses consistent with or very similar to the reference answers. CONCLUSION: ChatGPT-4V demonstrated an impressive ability to combine patient history with radiological images to make diagnoses and directly design treatment plans based on images, suggesting its potential for future application in clinical practice.

Autoren

Institutionen

Themen

Artificial Intelligence in Healthcare and EducationRadiology practices and educationCOVID-19 diagnosis using AI

Volltext beim Verlag öffnen

Step into the era of large multimodal models: a pilot study on ChatGPT-4V(ision)’s ability to interpret radiological images

Abstract

Ähnliche Arbeiten

Autoren

Institutionen

Themen