OpenAlex · Updated hourly · Last updated: 05.05.2026, 14:39

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

AI-generated dermatologic images show deficient skin tone diversity and poor diagnostic accuracy: An experimental study

2025 · 11 citations · Journal of the European Academy of Dermatology and Venereology

Citations: 11 · Authors: 7 · Year: 2025

Abstract

BACKGROUND: Generative AI models are increasingly used in dermatology, yet biases in training datasets may reduce diagnostic accuracy and perpetuate ethnic health disparities. OBJECTIVES: To evaluate two key AI outputs: (1) skin tone representation and (2) diagnostic accuracy of generated dermatologic conditions. METHODS: […] Two blinded dermatology residents evaluated a randomized 200-image subset for diagnostic accuracy. An inter-rater kappa statistic was calculated to assess rater agreement. RESULTS: […] (1) = 0.320, p = 0.572), indicating no meaningful difference between its generated skin tone diversity and census demographics. ChatGPT-4o, Midjourney and Stable Diffusion significantly underrepresented dark skin with Fitzpatrick scores of >IV (6.0%, 3.9% and 8.7% dark skin, respectively; all p < 0.001). Across all platforms, only 15% of images were identifiable by raters as the intended condition. Adobe Firefly had the lowest accuracy (0.94%), while ChatGPT-4o, Midjourney and Stable Diffusion demonstrated higher but still suboptimal accuracy (22%, 12.2% and 22.5%, respectively). CONCLUSIONS: The study highlights substantial deficiencies in the diversity and accuracy of AI-generated dermatological images. AI programs may exacerbate cognitive bias and health inequity, suggesting the need for ethical AI guidelines and diverse datasets to improve disease diagnosis and dermatologic care.

Similar works