OpenAlex · Updated hourly · Last updated: 08.05.2026, 19:40

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

A comparison of the psychometric properties of GPT-4 versus human novice and expert authors of clinically complex MCQs in a mock examination of Australian medical students

2025 · 1 citation · Medical Teacher
Open full text at publisher

1

Citations

6

Authors

2025

Year

Abstract

PURPOSE: Creating clinically complex Multiple Choice Questions (MCQs) for medical assessment can be time-consuming. Large language models such as GPT-4, a type of generative artificial intelligence (AI), are a potential MCQ design tool. Evaluating the psychometric properties of AI-generated MCQs is essential to ensuring quality. METHODS: A 120-item mock examination was constructed, containing 40 human-generated MCQs at novice item-writer level, 40 at expert level, and 40 AI-generated MCQs. All examination items underwent panel review to ensure they tested higher-order cognitive skills and met a minimum acceptable standard. The online mock examination was administered to Australian medical students, who were blinded to each item's author. RESULTS: = 0.382). CONCLUSIONS: The psychometric properties of AI-generated MCQs are comparable to those of human-generated MCQs at both novice and expert level. Item quality can be improved across all author groups. AI-generated items should undergo human review to enhance distractor efficiency.

Similar works

Authors

Institutions

Topics

Artificial Intelligence in Healthcare and Education · Clinical Reasoning and Diagnostic Skills · Explainable Artificial Intelligence (XAI)