This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
radMLBench: A dataset collection for benchmarking in radiomics
Citations: 8
Authors: 1
Year: 2024
Abstract
BACKGROUND: New machine learning methods and techniques are frequently introduced in radiomics, but they are often tested on a single dataset, which makes it difficult to assess their true benefit. A larger, publicly accessible dataset collection on which such assessments could be performed is currently lacking. In this study, a collection of radiomics datasets with binary outcomes in tabular form was curated to allow benchmarking of machine learning methods and techniques.
METHODS: A variety of journals and online sources were searched to identify tabular radiomics data with binary outcomes, which were then compiled into a homogeneous data collection that is easily accessible via Python. To illustrate the utility of the collection, it was used to investigate whether feature decorrelation prior to feature selection improves predictive performance in a radiomics pipeline.
RESULTS: A total of 50 radiomic datasets were collected, with sample sizes ranging from 51 to 969 and feature counts ranging from 101 to 11165. On this data, decorrelating features did not yield any significant improvement on average.
CONCLUSIONS: A large collection of datasets, easily accessible via Python and suitable for benchmarking and evaluating new machine learning techniques and methods, was curated. Its utility was exemplified by demonstrating that feature decorrelation prior to feature selection does not, on average, lead to significant performance gains and could be omitted, thereby increasing the robustness and reliability of the radiomics pipeline.
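The abstract does not say how features were decorrelated, so the following is only a minimal sketch of one common approach: a greedy filter that drops any feature whose absolute Pearson correlation with an already kept feature exceeds a threshold. The function names, the 0.95 threshold, and the toy data are assumptions made for illustration and are not taken from the paper or from the radMLBench package.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def decorrelate(columns, threshold=0.95):
    """Greedy filter: keep a feature only if |r| with every kept feature is <= threshold.

    `columns` maps feature name -> list of values; returns kept names in input order.
    The threshold value is an assumption, not a setting reported in the abstract.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: f2 is an exact linear function of f1, f3 is not.
toy = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.0, 4.0, 6.0, 8.0, 10.0],
    "f3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept_features = decorrelate(toy, threshold=0.95)
```

In a pipeline of the kind the abstract describes, a step like this would run before feature selection. The reported finding is that, on average across the collection, including it did not significantly improve predictive performance.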
Related works
TNM Classification of Malignant Tumours
1987 · 16,123 citations
A survey on deep learning in medical image analysis
2017 · 14,110 citations
Reduced Lung-Cancer Mortality with Low-Dose Computed Tomographic Screening
2011 · 10,912 citations
The American Joint Committee on Cancer: the 7th Edition of the AJCC Cancer Staging Manual and the Future of TNM
2010 · 9,150 citations
UNet++: A Nested U-Net Architecture for Medical Image Segmentation
2018 · 8,836 citations