This is an overview page with metadata for this scientific work. The full article is available from the publisher.
The CARE framework for AI dataset documentation in clinical laboratories: a comprehensive checklist and data lineage methodology
Citations: 0
Authors: 12
Year: 2026
Abstract
OBJECTIVES: Traditional laboratory regulatory frameworks provide robust guidance for conventional clinical testing but lack requirements for data documentation for artificial intelligence and machine learning (AI/ML) solutions. This gap creates significant risks since AI solutions directly inherit patterns, biases, and limitations from their training data. Here, we describe the development and implementation of a comprehensive data documentation checklist with accompanying data lineage for use within an AI lifecycle framework for clinical laboratories.
METHODS: Building on the previously established Clinical AI Readiness Evaluator (CARE) framework, we developed a comprehensive data checklist and data lineage methodology through a multiphase process, including (1) comprehensive review of existing AI/ML data documentation frameworks, (2) focused meetings with 3 institutional AI operations teams, and (3) 3 rounds of iterative refinement by our multidisciplinary team. The checklist's effectiveness was then assessed using 3 diverse AI/ML projects moving through the CARE framework.
RESULTS: The CARE Data Checklist and Data Lineage provide a structured approach to documenting critical aspects of datasets used in AI/ML projects, including data composition, demographics, collection methods, labeling processes, usage constraints, maintenance requirements, and a data readiness assessment. The checklist addresses unique data-centric challenges of AI/ML applications, facilitating transparency, reproducibility, and regulatory compliance.
CONCLUSIONS: The CARE Data Checklist and Data Lineage serve as both a technical guide and a communication tool bridging gaps between technical and clinical stakeholders. By working on these documents early in the AI lifecycle, laboratories can anticipate and address data-related challenges, ultimately saving time, optimizing resources, and improving the reliability of AI-augmented laboratory solutions.