This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

A Comparative Study on the Consistency of GPT-Based AI Grading Using Human-Developed Assessment Criteria

2025 · 0 citations

Abstract

The integration of Artificial Intelligence (AI) into educational assessment presents both promising opportunities and notable challenges in evaluating student performance. This study conducts a comparative analysis of ChatGPT-based AI grading systems versus human grading, using structured rubrics as a common framework. Data were collected from two distinct assignments in a computer programming course. Both AI and human graders assessed 20 student submissions. The study uses three statistical methods: the intraclass correlation coefficient (ICC) to evaluate grading consistency, Bland-Altman analysis to measure the agreement between AI and human grades, and paired t-tests to identify significant differences. Results indicate moderate to high grading consistency for the AI system. While overall agreement with human graders was observed, some discrepancies emerged in specific evaluation criteria. These findings offer valuable insights into the capabilities and current limitations of AI-assisted grading in educational settings.
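
The three statistics named in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the abstract does not say which ICC form the authors used (ICC(2,1), two-way random effects with absolute agreement, is assumed), the function names are hypothetical, and the scores are invented rather than taken from the study.

```python
import math
import statistics


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) of agreement for paired scores."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def paired_t_statistic(a, b):
    """Return (t statistic, degrees of freedom) for a paired t-test."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per submission and one column per rater.

    Assumed form: two-way random effects, absolute agreement, single rater.
    """
    n = len(ratings)      # number of submissions
    k = len(ratings[0])   # number of raters (or repeated grading runs)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Invented example scores, NOT data from the study.
ai_scores = [8, 7, 9, 6, 8]
human_scores = [7, 7, 8, 6, 9]

bias, lower, upper = bland_altman(ai_scores, human_scores)
t_stat, dof = paired_t_statistic(ai_scores, human_scores)
icc = icc_2_1([list(pair) for pair in zip(ai_scores, human_scores)])

print(f"Bland-Altman bias={bias:.3f}, limits=({lower:.3f}, {upper:.3f})")
print(f"Paired t statistic={t_stat:.3f} with {dof} degrees of freedom")
print(f"ICC(2,1)={icc:.3f}")
```

The sketch stops at the t statistic; turning it into a p-value requires the Student t distribution, which the Python standard library does not provide.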

Institutions

Sultan Qaboos University (OM)

Topics

Artificial Intelligence in Healthcare and Education
Intelligent Tutoring Systems and Adaptive Learning
Online Learning and Analytics
