This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Evaluating Racial Bias in LLM Reasoning: Implications for Equitable AI Use in Education
Citations: 0
Authors: 2
Year: 2026
Abstract
Large language models (LLMs) with explicit reasoning capabilities represent an important frontier in artificial intelligence development, yet their potential to perpetuate racial bias through displayed chain-of-thought reasoning processes remains understudied. This study provides a systematic examination of how racial bias manifests in the step-by-step reasoning of reasoning models when they address race-related questions, with particular attention to implications for educational technology applications. Using 3,440 race-related questions from the Bias Benchmark for Question Answering (BBQ) dataset, we evaluated reasoning chains generated by DeepSeek-R1-Distill-Llama-8B using an LLM-as-a-judge approach with GPT-4o. Results revealed that 16.4% of questions exhibited racial bias in their reasoning processes, with bias intensifying through reasoning chains—a pattern we term “bias amplification.” Notably, even correctly answered questions (96.5% accuracy) contained biased reasoning steps, with 10% showing at least slight bias, demonstrating “hidden bias” that conventional output-focused evaluations miss. Questions involving academic merit demonstrated the highest average bias scores, raising particular concerns for AI-assisted grading and student assessment applications. Name-based questions accounted for over half of all biased instances despite showing only moderate average bias, suggesting that implicit racial cues activate stereotyped reasoning. An inverse relationship emerged between reasoning bias and answer accuracy, suggesting that bias detection could improve model performance through selective abstention. These findings have significant implications for educational technology, where students may internalize not only AI-generated content but also the demonstrated problem-solving approaches.
Policy recommendations include developing standards that evaluate AI reasoning processes alongside outputs, implementing pre-deployment bias audits for educational AI systems, and establishing transparency requirements for reasoning-displaying models in classroom settings.
Related Works
A spreading-activation theory of semantic processing.
1975 · 8,042 citations
Cognitive Load During Problem Solving: Effects on Learning
1988 · 7,899 citations
International Conference on Learning Representations (ICLR 2013)
2013 · 6,258 citations
Learning from delayed rewards
1989 · 5,471 citations
Comprehension: A Paradigm for Cognition
1998 · 4,772 citations