This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Evaluating the effect of mental health fine-tuning relative to other model characteristics on LLM safety performance
Citations: 0
Authors: 6
Year: 2026
Abstract
Large language models (LLMs) are increasingly used in mental health applications, yet it remains unclear whether mental health–specific fine-tuning meaningfully improves safety-relevant performance beyond gains from model scale or architecture. We evaluated 127 publicly available open-source LLMs across three model families, multiple architecture generations, parameter scales (270M–70B), and fine-tuning strategies on three psychiatrist-reviewed synthetic classification tasks: suicidal ideation detection, identification of user requests for therapy, and detection of explicit therapy-like interactions in multi-turn conversations. Performance was summarized using F1 score, with multivariable regression and paired comparisons used to estimate independent effects of model characteristics. Across tasks, newer architectures and larger models consistently showed superior performance. General instruction tuning improved detection of therapy requests and engagement, whereas mental health–specific, medical, or safety fine-tuning conferred no consistent benefit and was sometimes associated with reduced performance. These findings suggest that baseline model capability is more consequential than domain-specific fine-tuning for certain safety-relevant mental health classification tasks, underscoring the importance of careful model selection and task-specific evaluation.
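As a rough illustration of the F1 summary metric mentioned in the abstract, the sketch below computes a binary F1 score from gold labels and model predictions. The abstract does not specify the averaging scheme or which class is treated as positive, so the binary setup, the function name `f1_score`, and the example labels are assumptions made purely for illustration, not the authors' evaluation code.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels (1 = suicidal ideation present, 0 = absent); not study data.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

Here there are 3 true positives, 1 false positive, and 1 false negative, so precision and recall are both 0.75 and the F1 score is 0.75.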