This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Failure Detection in Medical Image Classification: A Reality Check and Benchmarking Testbed
Citations: 7
Authors: 3
Year: 2022
Abstract
Failure detection in automated image classification is a critical safeguard for clinical deployment. Detected failure cases can be referred for human assessment, ensuring patient safety in computer-aided clinical decision making. Despite its paramount importance, there is insufficient evidence about the ability of state-of-the-art confidence scoring methods to detect test-time failures of classification models in the context of medical imaging. This paper provides a reality check, establishing the performance of in-domain misclassification detection methods by benchmarking 9 widely used confidence scores on 6 medical imaging datasets with different imaging modalities, in multiclass and binary classification settings. Our experiments show that the problem of failure detection is far from solved. We found that none of the benchmarked advanced methods proposed in the computer vision and machine learning literature can consistently outperform a simple softmax baseline, demonstrating that improved out-of-distribution detection or model calibration does not necessarily translate to improved in-domain misclassification detection. Our testbed facilitates future work in this important area.
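The softmax baseline mentioned in the abstract is commonly instantiated as the maximum softmax probability (MSP): the model's top predicted class probability is used as a confidence score, and misclassification detection is evaluated by how well that score ranks correct above incorrect predictions (e.g. via AUROC). The sketch below is illustrative and not taken from the paper; the function names and the rank-based AUROC formulation (which ignores ties for brevity) are assumptions.

```python
import numpy as np

def msp_confidence(logits):
    """Maximum softmax probability (MSP) baseline: returns the top
    softmax probability per sample as a confidence score."""
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return probs.max(axis=1)

def failure_auroc(confidence, correct):
    """AUROC for misclassification detection: probability that a
    correctly classified sample receives a higher confidence than a
    misclassified one (Mann-Whitney rank formulation, ties ignored)."""
    confidence = np.asarray(confidence, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    ranks = confidence.argsort().argsort() + 1  # 1-based ranks by confidence
    n_pos, n_neg = correct.sum(), (~correct).sum()
    return (ranks[correct].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Hypothetical usage: 4 predictions, 2 correct; confidence separates
# them perfectly, so the failure-detection AUROC is 1.0.
auroc = failure_auroc([0.9, 0.8, 0.3, 0.2], [True, True, False, False])
```

A perfect score here only means correct and incorrect predictions are separable by confidence; the paper's finding is that more elaborate scores often fail to improve on this simple baseline in medical imaging.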
Related Works
SMOTE: Synthetic Minority Over-sampling Technique
2002 · 30,598 citations
An introduction to ROC analysis
2005 · 20,957 citations
Mining association rules between sets of items in large databases
1993 · 14,778 citations
pROC: an open-source package for R and S+ to analyze and compare ROC curves
2011 · 13,798 citations
Fast algorithms for mining association rules
1998 · 10,754 citations