This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
The Social Structure of Scientific Evaluation: AI, Benchmarking, and the Deep Learning Monoculture
Citations: 1
Authors: 2
Year: 2026
Abstract
Evaluation systems are central organizing institutions in science that coordinate consensus and drive epistemic trajectories. Scientific fields have traditionally relied on “organic” evaluation systems (e.g., peer review, citation) where consensus emerges gradually across multiple epistemic values. This paper highlights artificial intelligence research (AIR) as a potent counterpoint to this model. Drawing on interviews with key actors, computational analyses, and archival materials spanning AIR’s history (1956–2021), we examine how AI evolved from a discipline with weak organic evaluation into a field driven by benchmarking, a “formal” evaluation system that defines progress quantitatively as state-of-the-art accuracy on commercial tasks. We demonstrate that benchmarking came to dominate through an intricate symbiosis with deep learning: benchmarking rewards accuracy, at which large-scale deep learning uniquely excelled, while deep learning’s opacity made organic evaluation increasingly difficult. This symbiosis restructured the field organizationally, epistemically, and materially into a “monoculture” dedicated to scaling. While enabling breakneck progress, the monoculture discouraged exploration of alternatives with different epistemic strengths. As AI spreads to other knowledge fields (from science to law to art), benchmarking will accompany it. Our findings thus highlight the risk that formalization of evaluation can lead to monoculture in other creative domains.