Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
Empirical assessment of bias in machine learning diagnostic test accuracy studies
48
Zitationen
3
Autoren
2020
Jahr
Abstract
OBJECTIVE: Machine learning (ML) diagnostic tools have significant potential to improve health care. However, methodological pitfalls may affect diagnostic test accuracy studies used to appraise such tools. We aimed to evaluate the prevalence and reporting of design characteristics within the literature. Further, we sought to empirically assess whether design features may be associated with different estimates of diagnostic accuracy. MATERIALS AND METHODS: We systematically retrieved 2 × 2 tables (n = 281) describing the performance of ML diagnostic tools, derived from 114 publications in 38 meta-analyses, from PubMed. Data extracted included test performance, sample sizes, and design features. A mixed-effects metaregression was run to quantify the association between design features and diagnostic accuracy. RESULTS: Participant ethnicity and blinding in test interpretation was unreported in 90% and 60% of studies, respectively. Reporting was occasionally lacking for rudimentary characteristics such as study design (28% unreported). Internal validation without appropriate safeguards was used in 44% of studies. Several design features were associated with larger estimates of accuracy, including having unreported (relative diagnostic odds ratio [RDOR], 2.11; 95% confidence interval [CI], 1.43-3.1) or case-control study designs (RDOR, 1.27; 95% CI, 0.97-1.66), and recruiting participants for the index test (RDOR, 1.67; 95% CI, 1.08-2.59). DISCUSSION: Significant underreporting of experimental details was present. Study design features may affect estimates of diagnostic performance in the ML diagnostic test accuracy literature. CONCLUSIONS: The present study identifies pitfalls that threaten the validity, generalizability, and clinical value of ML diagnostic tools and provides recommendations for improvement.
Ähnliche Arbeiten
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8.778 Zit.
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8.690 Zit.
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8.259 Zit.
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6.901 Zit.
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5.781 Zit.