This is an overview page with metadata for this research paper. The full article is available from the publisher.
Harnessing Generative Diversity: An LLM Framework for Consistent Qualitative Assessment
Citations: 0
Authors: 4
Year: 2025
Abstract
The stochastic nature of Large Language Models (LLMs) is typically viewed as a reliability challenge for high-stakes tasks like qualitative assessment. We reframe this stochasticity not as random noise, but as ‘generative diversity’: a range of coherent reasoning paths produced by the model. We argue this diversity is a valuable feature, analogous to a committee of experts offering multiple valid perspectives. We propose a novel Human-AI collaboration framework designed specifically to harness this diversity to produce consistent and trustworthy evaluations. We systematically compare three pipelines—zero-shot, rubric-based, and our proposed pairwise comparison. The results show that the pairwise approach effectively distills generative diversity into a stable assessment, achieving a strong correlation with domain expert judgments (r = 0.716). Furthermore, qualitative analysis reveals that while LLMs provide consistent text-grounded baselines, they lack the tacit knowledge to identify exceptional outliers. This work thus proposes a new Human-AI partnership paradigm: by treating generative diversity as a feature to be engineered, we can automate reliable baseline assessments, freeing human experts to focus on high-stakes, nuanced judgments that require contextual insight.
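The abstract does not say how the pairwise pipeline turns repeated, stochastic comparisons into a single score. The sketch below is one plausible reading, not the authors' method: each pair of items is judged several times, each item is scored by its win rate, and the scores are compared with expert ratings using Pearson's r (the statistic reported above). The names `judge`, `win_rates`, `make_noisy_judge`, the `quality` values, and the expert ratings are all hypothetical; the noisy judge merely stands in for an LLM call.

```python
import itertools
import math
import random


def win_rates(items, judge, samples=5):
    """Score each item by the fraction of pairwise comparisons it wins.

    `judge(a, b)` is a hypothetical stand-in for one stochastic LLM call
    that returns whichever of the two items it prefers.
    """
    wins = {item: 0 for item in items}
    total = {item: 0 for item in items}
    for a, b in itertools.combinations(items, 2):
        for _ in range(samples):  # repeat to average over sampling variation
            winner = judge(a, b)
            wins[winner] += 1
            total[a] += 1
            total[b] += 1
    return {item: wins[item] / total[item] for item in items}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def make_noisy_judge(quality, noise=1.0, seed=0):
    """Build a toy judge: prefers the higher-quality item, with Gaussian noise.

    This simulates stochastic model output; it is not an LLM call.
    """
    rng = random.Random(seed)

    def judge(a, b):
        score_a = quality[a] + rng.gauss(0, noise)
        score_b = quality[b] + rng.gauss(0, noise)
        return a if score_a >= score_b else b

    return judge


if __name__ == "__main__":
    # Illustrative values only; none of these numbers come from the paper.
    quality = {"essay_1": 4.0, "essay_2": 3.0, "essay_3": 2.0, "essay_4": 1.0}
    expert = {"essay_1": 5, "essay_2": 4, "essay_3": 2, "essay_4": 1}

    items = list(quality)
    scores = win_rates(items, make_noisy_judge(quality), samples=20)
    r = pearson_r([scores[i] for i in items], [expert[i] for i in items])
    print(scores)
    print(f"Pearson r against expert ratings: {r:.3f}")
```

Win rate is the simplest possible aggregation; a Bradley-Terry or Elo-style model would be a natural alternative if the comparisons are sparse or unevenly distributed across items.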