This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Which current chatbot is more competent in urological theoretical knowledge? A comparative analysis by the European board of urology in-service assessment

2025 · 11 citations · World Journal of Urology · Open Access

Abstract

INTRODUCTION: The European Board of Urology (EBU) In-Service Assessment (ISA) test evaluates urologists' knowledge and interpretation skills. Artificial intelligence (AI) chatbots are widely used by physicians as a source of theoretical information. This study compares the test performance of five current chatbots on knowledge questions and interpretation questions.

MATERIALS AND METHODS: The chatbots GPT-4o, Copilot Pro, Gemini Advanced, Claude 3.5, and Sonar Huge answered 596 questions from six exams from the years 2017 to 2022. The questions were divided into two categories: those that measure knowledge and those that require data interpretation. The chatbots' exam performances were compared.

RESULTS: All chatbots except Claude 3.5 passed the examinations with an overall score above 60%. Copilot Pro scored highest, and its difference from Claude 3.5, the lowest scorer, was significant (71.6% vs. 56.2%, p = 0.001). Across the 444 knowledge questions and 152 analysis questions, Copilot Pro achieved the highest score on knowledge questions, whereas Claude 3.5 achieved the lowest (72.1% vs. 57.4%, p = 0.001). The same pattern held for analysis questions (70.4% vs. 52.6%, p = 0.019).

CONCLUSIONS: Four of the five chatbots passed the EBU exams with scores exceeding 60%; one did not pass. Copilot Pro performed best in the EBU ISA examinations, whereas Claude 3.5 performed worst. Chatbots scored lower on analysis questions than on knowledge questions. Thus, although current chatbots perform well on theoretical knowledge, their competence in analyzing questions remains questionable.

Institutions

Tekirdağ Namık Kemal University (TR)

Topics

Artificial Intelligence in Healthcare and Education · AI in Service Interactions · Clinical Reasoning and Diagnostic Skills
