This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Evaluating Racial Bias in LLM Reasoning: Implications for Equitable AI Use in Education
Citations: 0
Authors: 2
Year: 2026
Abstract
Large language models (LLMs) with explicit reasoning capabilities represent an important frontier in artificial intelligence development, yet their potential to perpetuate racial bias through displayed chain-of-thought reasoning processes remains understudied. This study provides a systematic examination of how racial bias manifests in the step-by-step reasoning of reasoning models when they address race-related questions, with particular attention to implications for educational technology applications. Using 3,440 race-related questions from the Bias Benchmark for Question Answering (BBQ) dataset, we evaluated reasoning chains generated by DeepSeek-R1-Distill-Llama-8B using an LLM-as-a-judge approach with GPT-4o. Results revealed that 16.4% of questions exhibited racial bias in their reasoning processes, with bias intensifying through reasoning chains—a pattern we term “bias amplification.” Notably, even correctly answered questions (96.5% accuracy) contained biased reasoning steps, with 10% showing at least slight bias, demonstrating “hidden bias” that conventional output-focused evaluations miss. Questions involving academic merit demonstrated the highest average bias scores, raising particular concerns for AI-assisted grading and student assessment applications. Name-based questions accounted for over half of all biased instances despite showing only moderate average bias, suggesting that implicit racial cues activate stereotyped reasoning. An inverse relationship emerged between reasoning bias and answer accuracy, suggesting that bias detection could improve model performance through selective abstention. These findings have significant implications for educational technology, where students may internalize not only AI-generated content but also the demonstrated problem-solving approaches.
Policy recommendations include developing standards that evaluate AI reasoning processes alongside outputs, implementing pre-deployment bias audits for educational AI systems, and establishing transparency requirements for reasoning-displaying models in classroom settings.
Related Works
A spreading-activation theory of semantic processing.
1975 · 8,042 citations
Cognitive Load During Problem Solving: Effects on Learning
1988 · 7,899 citations
International Conference on Learning Representations (ICLR 2013)
2013 · 6,258 citations
Learning from delayed rewards
1989 · 5,471 citations
Comprehension: A Paradigm for Cognition
1998 · 4,772 citations