OpenAlex · Updated hourly · Last updated: 12.04.2026, 00:53

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Dual Fine-Tuning and Dual Alignment Training for Medical Language Models

2025 · 0 citations

0 citations · 2 authors · Year: 2025

Abstract

The vast volume of medical information in the real world demands intelligent assistant systems for effective integration and summarization. In response, we designed a four-stage training pipeline comprising dual fine-tuning (continued pre-training and supervised fine-tuning) and dual alignment training (Direct Preference Optimization and Proximal Policy Optimization) to build a medical language model based on the open-source LLaMA-3.1-8B. The entire training process adopts the DeepSpeed framework, combined with ZeRO-1 optimization, Quantized Low-Rank Adaptation (QLoRA), and Flash-Attention 2, alleviating the substantial computational costs of large language model (LLM) training. Results show that after the four-stage training process, the medical language model achieved scores of 50.49 on CMMLU and 35.00 on CMB, both outperforming the baseline LLaMA-3.1-8B-Instruct, indicating significant improvements in both general Chinese capabilities and Chinese medical domain capabilities. Furthermore, after dual alignment training, the model's reward score increased by 30%, while toxicity decreased by 25%, demonstrating considerable enhancements in safety and harmlessness.
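The dual alignment stage pairs Direct Preference Optimization (DPO) with PPO. The core DPO objective for a single preference pair can be sketched in plain Python; the function name and arguments below are illustrative, not taken from the paper, and a real run would use summed token log-probabilities from the trained policy and a frozen reference model:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair: -log(sigmoid(beta * margin)).

    Each argument is the summed log-probability of the chosen/rejected
    response under the trained policy or the frozen reference model.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # numerically stable -log(sigmoid(margin)) for large |margin|
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# When the policy has not moved away from the reference, the margin
# is 0 and the loss is log(2) ~= 0.6931.
print(round(dpo_loss(-10.0, -12.0, -10.0, -12.0), 4))  # 0.6931
```

As the policy raises the chosen response's log-probability relative to the reference (a positive margin), the loss falls below log(2), which is what drives preference alignment during this stage.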

Topics

Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education · Topic Modeling