This is an overview page with metadata for this scientific work. The full article is available from the publisher.
When Judgement Does Not Stay the Same
0
Citations
1
Author
2026
Year
Beyond the Average Research Series – Working Paper

Description
This working paper examines judgement stability in AI systems under repeated evaluation. It builds on the Behavioural Evaluation Framework (Hull, 2026), extending the conceptual approach through empirical observation. The analysis draws on the Phase 4 behavioural evaluation study within the Agents at Work research series (Hull, 2025–2026), which examined how large language models interpret age-coded language in recruitment text and how those judgements behave when the same evaluative task is repeated. The paper focuses on how classification outcomes vary under identical conditions, with particular attention to the structure and distribution of variation across repeated executions.

Abstract
Large language models are increasingly used to perform evaluative or judgement-based tasks, including classification, moderation, and analytical assessment. While existing approaches to evaluation often focus on individual outputs, such observations provide limited insight into how systems behave when the same task is repeated. Building on the Behavioural Evaluation Framework, this paper examines judgement stability under repeated execution. Using a series of repeated evaluations of recruitment text, the analysis explores how classification outcomes vary under identical conditions. The findings indicate that variation is not random but concentrated at decision boundaries, particularly between adjacent categories such as “Potentially Biased” and “Unclear”. In these cases, the system often identifies similar cues and produces comparable reasoning, while the final classification varies. These observations suggest that instability in AI judgement reflects structured sensitivity to interpretation rather than isolated error. The paper argues that reliability in AI judgement systems is better understood through patterns of behaviour across repeated evaluations than through the inspection of individual outputs.
Note
This paper is released as a working paper to present empirical findings on judgement stability within the Behavioural Evaluation Framework. It extends earlier conceptual work by examining how variation emerges under repeated execution. Future work will explore additional behavioural properties of AI judgement systems, including confidence behaviour, explanation stability, and sensitivity to input variation, as part of the ongoing Agents at Work research series.
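The repeated-evaluation approach described above can be illustrated with a minimal sketch. The code below is not the paper's actual methodology: `classify` is a hypothetical stand-in for an LLM judgement call (here a noisy score over invented age-coded cue words), and `stability_profile` simply re-runs the same input and tallies how often each label occurs, which is enough to show variation concentrating near a decision boundary between adjacent categories.

```python
import random
from collections import Counter

LABELS = ["Clearly Biased", "Potentially Biased", "Unclear", "Not Biased"]

def classify(text: str, rng: random.Random) -> str:
    """Hypothetical stand-in for an LLM judgement call.

    Counts invented age-coded cue words and adds small noise to the score,
    so inputs whose score lands near a threshold flip between adjacent
    labels across repeated runs.
    """
    cues = sum(text.lower().count(w) for w in ("young", "energetic", "digital native"))
    score = cues + rng.gauss(0, 0.4)  # noisy judgement around the cue count
    if score >= 2.5:
        return "Clearly Biased"
    if score >= 1.5:
        return "Potentially Biased"
    if score >= 0.5:
        return "Unclear"
    return "Not Biased"

def stability_profile(text: str, runs: int = 100, seed: int = 0) -> dict:
    """Repeat the same evaluation and summarise how the outcomes vary."""
    rng = random.Random(seed)
    counts = Counter(classify(text, rng) for _ in range(runs))
    modal_label, modal_n = counts.most_common(1)[0]
    return {
        "counts": dict(counts),
        "modal_label": modal_label,
        "agreement": modal_n / runs,  # share of runs matching the modal label
    }

# A text whose cue score sits between two thresholds shows lower agreement
# than a text far from any boundary.
boundary = stability_profile("We want a young, energetic hire")
clear = stability_profile("Seeking an experienced accountant")
```

In this toy setup, reliability is read off the `agreement` rate and the shape of `counts` across runs, rather than from any single classification, mirroring the paper's argument that behaviour over repeated evaluations is the more informative unit of analysis.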
Related Works
The global landscape of AI ethics guidelines
2019 · 4,726 citations
The Limitations of Deep Learning in Adversarial Settings
2016 · 3,886 citations
Trust in Automation: Designing for Appropriate Reliance
2004 · 3,513 citations
Fairness through awareness
2012 · 3,302 citations
AI4People—An Ethical Framework for a Good AI Society: Opportunities, Risks, Principles, and Recommendations
2018 · 3,203 citations