OpenAlex · Updated hourly · Last updated: 19.04.2026, 22:42

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Collect, Measure, Repeat: Reliability Factors for Responsible AI Data Collection

2023 · 0 citations · Zurich Open Repository and Archive (University of Zurich) · Open Access

Citations: 0

Authors: 3

Year: 2023

Abstract

The rapid entry of machine learning approaches in our daily activities and high-stakes domains demands transparency and scrutiny of their fairness and reliability. To help gauge machine learning models’ robustness, research typically focuses on the massive datasets used for their deployment, e.g., creating and maintaining documentation to understand their origin, process of development, and ethical considerations. However, data collection for AI is still typically a one-off practice, and oftentimes datasets collected for a certain purpose or application are reused for a different problem. Additionally, dataset annotations may not be representative over time, contain ambiguous or erroneous annotations, or be unable to generalize across domains. Recent research has shown these practices might lead to unfair, biased, or inaccurate outcomes. We argue that data collection for AI should be performed in a responsible manner where the quality of the data is thoroughly scrutinized and measured through a systematic set of appropriate metrics. In this paper, we propose a Responsible AI (RAI) methodology designed to guide the data collection with a set of metrics for an iterative in-depth analysis of the factors influencing the quality and reliability of the generated data. We propose a granular set of measurements to inform on the internal reliability of a dataset and its external stability over time. We validate our approach across nine existing datasets and annotation tasks and four input modalities. This approach impacts the assessment of data robustness used in real world AI applications, where diversity of users and content is eminent. Furthermore, it deals with fairness and accountability aspects in data collection by providing systematic and transparent quality analysis for data collections.

Topics

Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education