Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.

Closing the AI generalisation gap by adjusting for dermatology condition distribution differences across clinical settings

2025·2 Zitationen·EBioMedicineOpen Access

Volltext beim Verlag öffnen

Zitationen

Autoren

2025

Jahr

Abstract

BACKGROUND: Generalisation of artificial intelligence (AI) models to a new setting is challenging. In this study, we seek to understand the robustness of a dermatology (AI) model and whether it generalises from telemedicine cases to a new setting including both patient-submitted photographs ("PAT") and clinician-taken photographs in-clinic ("CLIN"). METHODS: A retrospective cohort study involving 2500 cases previously unseen by the AI model, including both PAT and CLIN cases, from 22 clinics in the San Francisco Bay Area, spanning November 2015 to January 2021. The primary outcome measure for the AI model and dermatologists was the top-3 accuracy, defined as whether their top 3 differential diagnoses contained the top reference diagnosis from a panel of dermatologists per case. FINDINGS: The AI performed similarly between PAT and CLIN images (74% top-3 accuracy in CLIN vs. 71% in PAT), however, dermatologists were more accurate in PAT images (79% in CLIN vs. 87% in PAT). We demonstrate that demographic factors were not associated with AI or dermatologist errors; instead several categories of conditions were associated with AI model errors (p < 0.05). Resampling CLIN and PAT to match skin condition distributions to the AI development dataset reduced the observed differences (AI: 84% CLIN vs. 79% PAT; dermatologists: 77% CLIN vs. 89% PAT). We demonstrate a series of steps to close the generalisation gap, requiring progressively more information about the new dataset, ranging from the condition distribution to additional training data for rarer conditions. When using additional training data and testing on the dataset without resampling to match AI development, we observed comparable performance from end-to-end AI model fine tuning (85% in CLIN vs. 83% in PAT) vs. fine tuning solely the classification layer on top of a frozen embedding model (86% in CLIN vs. 84% in PAT). INTERPRETATION: AI algorithms can be efficiently adapted to new settings without additional training data by recalibrating the existing model, or with targeted data acquisition for rarer conditions and retraining just the final layer. FUNDING: Google.

Autoren

Institutionen

Themen

Cutaneous Melanoma Detection and ManagementAI in cancer detectionArtificial Intelligence in Healthcare and Education

Volltext beim Verlag öffnen

Closing the AI generalisation gap by adjusting for dermatology condition distribution differences across clinical settings

Abstract

Ähnliche Arbeiten

Autoren

Institutionen

Themen