This is an overview page with metadata for this scientific work. The full article is available from the publisher.

SMILES challenge 2025: Multitask learning with contrastive and natural language generation for enhanced medical image classification

2026 · 0 citations · Signal Image and Video Processing · Open Access

Abstract

This article proposes a multitask learning framework that combines contrastive learning with natural language generation (NLG) to improve medical image classification and report generation, with the goal of raising both the accuracy and the interpretability of disease classification in medical diagnostics. The architecture consists of a Vision Transformer (ViT) visual encoder, a transformer-based text encoder, and a multimodal decoder. The visual encoder processes medical images, while the text encoder handles disease-related text prompts. The components are trained jointly with an image-text contrastive loss and a language generation loss. Evaluations on the MIMIC-CXR and CheXpert datasets show that the model with NLG (Plain + NLG) outperforms the baseline contrastive learning model (Plain) in disease classification. On MIMIC-CXR, accuracy for Atelectasis rose from 17.44% (Plain) to 41.5% (Plain + NLG) and for Cardiomegaly from 19.25% to 47.4%. On CheXpert, accuracy for Atelectasis rose from 12.5% to 58.5% and for Pleural Effusion from 61.10% to 64.0%. F1 scores also improved, particularly for complex diseases such as Cardiomegaly and Consolidation. Combining contrastive learning with NLG therefore improves both disease classification and medical report generation, and the approach has potential clinical applications by enhancing the interpretability and accuracy of AI in medical decision-making.
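
The joint objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact loss formulation, so everything specific here is an assumption: the symmetric InfoNCE form of the contrastive term, the `temperature` value, the `nlg_weight` factor, and all function names are hypothetical and chosen only for illustration. Plain Python lists stand in for the encoder and decoder outputs.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over a list of floats."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric image-text contrastive loss (assumed InfoNCE form).

    image_emb[k] and text_emb[k] are treated as a matched pair; every
    other pairing in the batch acts as a negative.
    """
    img = [normalize(v) for v in image_emb]
    txt = [normalize(v) for v in text_emb]
    n = len(img)
    # Cosine similarity between every image and every text, scaled by temperature.
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txt]
              for i in img]
    # Image-to-text direction: row k should select column k.
    i2t = -sum(log_softmax(logits[k])[k] for k in range(n)) / n
    # Text-to-image direction: the same computation on the transposed matrix.
    columns = [list(c) for c in zip(*logits)]
    t2i = -sum(log_softmax(columns[k])[k] for k in range(n)) / n
    return 0.5 * (i2t + t2i)


def generation_loss(token_logits, target_ids):
    """Mean token-level cross-entropy for the generated report."""
    total = sum(log_softmax(logits)[target]
                for logits, target in zip(token_logits, target_ids))
    return -total / len(target_ids)


def multitask_loss(image_emb, text_emb, token_logits, target_ids,
                   nlg_weight=1.0):
    """Weighted sum of both terms; nlg_weight is a hypothetical knob."""
    return (contrastive_loss(image_emb, text_emb)
            + nlg_weight * generation_loss(token_logits, target_ids))
```

In this sketch, setting `nlg_weight=0.0` reduces the objective to the contrastive term alone, which loosely corresponds to the "Plain" baseline, while a positive weight corresponds to "Plain + NLG".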

Topics

COVID-19 diagnosis using AI · Multimodal Machine Learning Applications · Artificial Intelligence in Healthcare and Education
