This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Revolutionizing Educational Assessment: The Role of Bing Chat GPT-4 Chatbot in Increasing Efficiency in Grading Open-Ended Questions
Citations: 0
Authors: 7
Year: 2025
Abstract
Background: Bing Chat (Microsoft Corporation, Redmond, WA) is an artificial intelligence (AI) program that responds to typed text. Many educational institutions are exploring how to incorporate AI into their materials, given its rapid advancement and potential to streamline various processes. One area that still requires investigation is the ability of AI models to grade open-ended questions. The purpose of this study is to determine whether Bing Chat or university faculty grade such questions more consistently.
Methods: The authors recruited 21 first-year and second-year medical students from the American University of the Caribbean (AUC) as volunteers to answer five open-ended questions on United States Medical Licensing Examination (USMLE) Step 1 topics. Each student's responses to the five questions were graded by six different Bing Chat accounts and six faculty members. Scores from Bing Chat and the faculty members were compared, and inter-rater reliability estimates were calculated.
Results: Both Bing Chat and the faculty graded the same responses consistently; scores varied in both groups, but the variability was more pronounced in faculty grading. For analysis questions, problem-solving questions with elements of application, explanatory questions, and recall-type questions, Bing Chat's grading closely paralleled that of the faculty. However, on a combined recall-and-application question, Bing Chat and faculty scores differed significantly (p = 0.010). Overall, Bing Chat demonstrated higher inter-rater reliability than faculty, as evidenced by both percent agreement and Gwet's agreement coefficient 1 (AC1).
Conclusion: Bing Chat demonstrated promising results in evaluating written answers to open-ended questions and shows potential as a supportive grading tool. As educational leaders seek more dependable, faster, and economical methods for assessment, Bing Chat may offer a notable contribution to education. Large language models (LLMs), such as Bing Chat, can be beneficial to both students and educators.
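The abstract reports inter-rater reliability as percent agreement and Gwet's AC1 but does not give the formulas or the scoring scale used. The following is a minimal sketch of how these two statistics are commonly computed for several raters, assuming unweighted nominal score categories and the same number of raters for every response. The function name `agreement_stats` and the toy ratings are hypothetical illustrations, not the authors' code or data.

```python
from collections import Counter


def agreement_stats(ratings, categories):
    """Return (percent_agreement, gwet_ac1) for multi-rater nominal scores.

    ratings: one list per graded response, holding the score each rater gave.
    categories: every score value a rater could assign.
    Assumes at least two raters per response and at least two categories.
    """
    n = len(ratings)
    q = len(categories)
    pa_sum = 0.0
    pi = {k: 0.0 for k in categories}  # average share of raters per category

    for row in ratings:
        r = len(row)
        counts = Counter(row)
        # Fraction of rater pairs that agree on this response.
        pa_sum += sum(c * (c - 1) for c in counts.values()) / (r * (r - 1))
        for k, c in counts.items():
            pi[k] += c / r / n

    pa = pa_sum / n  # observed (pairwise) percent agreement
    pe = sum(p * (1 - p) for p in pi.values()) / (q - 1)  # chance agreement
    ac1 = (pa - pe) / (1 - pe)
    return pa, ac1


if __name__ == "__main__":
    # Hypothetical toy data: 3 responses, 3 raters, scores 0 or 1.
    toy = [[1, 1, 1], [0, 0, 0], [1, 1, 0]]
    pa, ac1 = agreement_stats(toy, categories=[0, 1])
    print(f"percent agreement = {pa:.3f}, AC1 = {ac1:.3f}")
```

In a design like the one described, such a function could be run once on the six Bing Chat scores per response and once on the six faculty scores per response, and the two pairs of statistics compared. Whether the study used an unweighted or a weighted variant of AC1 is not stated in the abstract.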