This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Collect, Measure, Repeat: Reliability Factors for Responsible AI Data Collection
Citations: 0
Authors: 3
Year: 2023
Abstract
The rapid entry of machine learning approaches in our daily activities and high-stakes domains demands transparency and scrutiny of their fairness and reliability. To help gauge machine learning models' robustness, research typically focuses on the massive datasets used for their deployment, e.g., creating and maintaining documentation to understand their origin, process of development, and ethical considerations. However, data collection for AI is still typically a one-off practice, and oftentimes datasets collected for a certain purpose or application are reused for a different problem. Additionally, dataset annotations may not be representative over time, contain ambiguous or erroneous annotations, or be unable to generalize across domains. Recent research has shown these practices might lead to unfair, biased, or inaccurate outcomes. We argue that data collection for AI should be performed in a responsible manner where the quality of the data is thoroughly scrutinized and measured through a systematic set of appropriate metrics. In this paper, we propose a Responsible AI (RAI) methodology designed to guide the data collection with a set of metrics for an iterative in-depth analysis of the factors influencing the quality and reliability of the generated data. We propose a granular set of measurements to inform on the internal reliability of a dataset and its external stability over time. We validate our approach across nine existing datasets and annotation tasks and four input modalities. This approach impacts the assessment of data robustness used in real world AI applications, where diversity of users and content is eminent. Furthermore, it deals with fairness and accountability aspects in data collection by providing systematic and transparent quality analysis for data collections.
Similar Works
The global landscape of AI ethics guidelines
2019 · 4,670 citations
The Limitations of Deep Learning in Adversarial Settings
2016 · 3,879 citations
Trust in Automation: Designing for Appropriate Reliance
2004 · 3,488 citations
Fairness through awareness
2012 · 3,298 citations
Mind over Machine: The Power of Human Intuition and Expertise in the Era of the Computer
1987 · 3,184 citations