Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
Data extraction by generative artificial intelligence: Assessing determinants of accuracy using human-extracted data from systematic review databases.
3
Zitationen
7
Autoren
2025
Jahr
Abstract
Psychological science requires reliable measures. Within systematic literature reviews, reliability hinges on high interrater agreement during data extraction. Yet, the extraction process has been time-consuming. Efforts to accelerate the process using technology have shown limited success until generative artificial intelligence (genAI), particularly large language models (LLMs), accurately extracted variables from medical studies. Nonetheless, for psychological researchers, it remains unclear how to utilize genAI for data extraction, given the range of tested variables, the medical context, and the variability in accuracy. We systematically assessed extraction accuracy and error patterns across domains in psychology by comparing genAI-extracted and human-extracted data from 22 systematic review databases published in the Psychological Bulletin. Eight LLMs extracted 312,329 data points from 2,179 studies on 186 variables. LLM extractions achieved unacceptable accuracy on all metrics for 20% of variables. For 46% of variables, accuracy was acceptable for some metrics and unacceptable for others. LLMs reached acceptable but not high accuracy on all metrics in 15%, high but not excellent in 8%, and excellent accuracy in 12% of variables. Accuracy varied most between variables, less between systematic reviews, and least between LLMs. Moderator analyses using a hierarchical logistic regression, hierarchical linear model, and meta-analysis revealed that accuracy was higher for variables describing studies' context and moderator variables compared to variables for effect size calculation. Also, accuracy was higher in systematic reviews with more detailed variable descriptions and positively correlated with model sizes. We discuss directions for investigating ways to use genAI to accelerate data extractions while ensuring meaningful human control. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
Ähnliche Arbeiten
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8.418 Zit.
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8.288 Zit.
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7.726 Zit.
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5.781 Zit.
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5.516 Zit.