This is an overview page with metadata for this scientific work. The full article is available from the publisher.
The influence of annotators' experience on radiomics-based machine learning performance in colorectal liver metastases characterization: Impact and mitigation strategy
0
Citations
9
Authors
2026
Year
Abstract
Background: Segmentation variability is a major source of bias in radiomics, yet its quantitative impact on downstream model performance remains poorly defined. This study aimed to assess how annotator expertise influences model generalization and to test a mitigation strategy based on an observer fingerprint correction.
Methods: An open-source CT dataset including 93 colorectal liver metastases (46 with desmoplastic and 47 with replacement growth patterns) was annotated independently by four observers with different levels of expertise: a radiologist (RAD), a PhD student (PhD), and two medical students (STUD1, STUD2). A standardized radiomics pipeline extracted 107 features, and a machine-learning model was developed using nested cross-validation. Each model was evaluated in two setups: a standard pipeline with z-score normalization based on the training annotator, and a mitigation setup using external observer-specific normalization ("observer fingerprint"). Statistical comparisons used Wilcoxon signed-rank tests with rank-biserial correlation and common language effect size.
Results: In the first setup, the radiologist was the best-performing test-set annotator for the models trained on RAD, PhD, and STUD1 annotations, with a median AUC-ROC [IQR] of 0.79 [0.76, 0.81], 0.74 [0.70, 0.76], and 0.77 [0.74, 0.80], respectively. For the model trained on STUD2 annotations, PhD was the best-performing test-set annotator, with an AUC-ROC of 0.72 [0.70, 0.74]. In the second setup, RAD was the best-performing test-set annotator for all models. The mitigation strategy significantly increased median AUC-ROC in the majority of cross-annotator comparisons (p<0.001).
Conclusions: Model performance depends not only on the annotator used for training but also on the operator performing segmentation at deployment. The proposed mitigation strategy effectively reduced cross-annotator performance variability.
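The two normalization setups described in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic feature matrices, the affine offset used to mimic an annotator's systematic bias, the reference/test split, and the helper names `fit_zscore` and `apply_zscore` are all assumptions made for illustration. It only shows the general idea that scaling a deployment annotator's features with statistics estimated from that same observer on separate reference lesions can bring them closer to the training distribution than reusing the training annotator's statistics.

```python
import numpy as np


def fit_zscore(features):
    """Return per-feature mean and standard deviation (hypothetical helper)."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    # Guard against constant features to avoid division by zero.
    std = np.where(std < 1e-12, 1.0, std)
    return mean, std


def apply_zscore(features, mean, std):
    """Scale features with previously estimated statistics."""
    return (features - mean) / std


rng = np.random.default_rng(0)
n_lesions, n_features = 93, 107  # sizes taken from the abstract

# Synthetic stand-ins for radiomics features (assumption, not real data).
train_annotator = rng.normal(size=(n_lesions, n_features))
# Assumption: the deployment annotator's segmentations shift and rescale
# every feature in a systematic way.
deploy_annotator = 1.5 * train_annotator + 2.0

# Assumption: the first 40 lesions act as "external" reference data for
# estimating the observer-specific statistics; the rest are test lesions.
ref_idx = np.arange(0, 40)
test_idx = np.arange(40, n_lesions)

# Setup 1 (standard): statistics come from the training annotator.
mean_tr, std_tr = fit_zscore(train_annotator)
deploy_standard = apply_zscore(deploy_annotator[test_idx], mean_tr, std_tr)

# Setup 2 (observer-specific): statistics come from the deployment
# annotator's own reference segmentations.
mean_obs, std_obs = fit_zscore(deploy_annotator[ref_idx])
deploy_fingerprint = apply_zscore(deploy_annotator[test_idx], mean_obs, std_obs)

# After z-scoring, training features are centered at 0, so the mean absolute
# per-feature offset from 0 is a rough measure of residual annotator shift.
shift_standard = np.abs(deploy_standard.mean(axis=0)).mean()
shift_fingerprint = np.abs(deploy_fingerprint.mean(axis=0)).mean()

print(f"residual shift, standard setup:          {shift_standard:.2f}")
print(f"residual shift, observer-specific setup: {shift_fingerprint:.2f}")
```

In this toy setting the observer-specific scaling removes most of the simulated offset. Real segmentation differences are unlikely to be purely affine, so the sketch says nothing about the size of the effect reported in the study.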