OpenAlex · Updated hourly · Last updated: 21.05.2026, 11:11

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

A Multiassessment and Multiprofessional Agents Approach for Medical Chatbot Risk Estimation: Development and Evaluation Study.

2026 · 0 citations · PubMed
Citations: 0 · Authors: 5 · Year: 2026

Abstract

BACKGROUND: Assessing chatbot responses across 3 domains (medical, ethical, and legal) is essential to ensuring the safe use of artificial intelligence in health care. Although advances in large language models (LLMs) have markedly improved the evaluation of question-answer datasets, such as multiple-choice medical exams, existing systems use general-purpose LLMs without incorporating specialized domain knowledge. They rely on standardized instructions without integrating real-world information, and ensemble methods such as majority voting fail to resolve disagreements among agents, which leads to misclassification and complicates risk assessment.

OBJECTIVE: This study aims to design, develop, and evaluate a synergistic approach for assessing risks associated with chatbot responses using multiassessment (MA) and multiprofessional agents (MPAs).

METHODS: [...]-score difference (Δ) as supporting metrics to assess the approach's effectiveness.

RESULTS: [...]-score gains ranging from +0.176 to +0.214 across systems. The MPA approach performed better when integrated with MA and external knowledge, with paired bootstrap estimates showing a gain of +0.037 (95% CI 0.003 to 0.074) over baseline; however, joint accuracy gains were not evident (95% CI -2.9% to 7.7%), and gains relative to the enhanced prompt were small. Notably, MA alone achieved higher joint accuracy than retrieval-augmented generation (RAG) (62.7% vs 60.3%), indicating a metric-specific trade-off rather than consistent superiority across all metrics.

CONCLUSIONS: The MA-MPA approach shows potential for improving risk estimation in chatbot responses. The results suggest that the framework is particularly useful for enhancing balanced overall performance, especially when combined with external knowledge, although the medical risk domain remains challenging. In addition, more specialized LLMs may further improve contextually grounded risk estimation.
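The paired bootstrap estimates mentioned in the results can be illustrated with a minimal sketch. It assumes a percentile interval and plain accuracy as the metric; the abstract does not state which bootstrap variant, number of resamples, or software the authors used, and the function names and toy labels below are hypothetical.

```python
import random


def accuracy(preds, labels):
    """Fraction of predictions that match the reference labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def paired_bootstrap_ci(labels, preds_a, preds_b, metric=accuracy,
                        n_resamples=2000, alpha=0.05, seed=0):
    """Percentile CI for metric(B) - metric(A) under paired resampling.

    Each resample draws item indices with replacement and scores both
    systems on the same drawn items, which keeps the comparison paired.
    """
    rng = random.Random(seed)
    n = len(labels)
    deltas = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        a = [preds_a[i] for i in idx]
        b = [preds_b[i] for i in idx]
        deltas.append(metric(b, y) - metric(a, y))
    deltas.sort()
    lo = deltas[int((alpha / 2) * n_resamples)]
    hi = deltas[int((1 - alpha / 2) * n_resamples) - 1]
    point = metric(preds_b, labels) - metric(preds_a, labels)
    return point, lo, hi


# Toy risk labels, invented for illustration only (not data from the study).
labels = ["high", "low", "low", "high", "low", "high", "low", "low"]
baseline = ["low", "low", "high", "high", "low", "low", "low", "low"]
enhanced = ["high", "low", "low", "high", "low", "low", "low", "low"]

point, lo, hi = paired_bootstrap_ci(labels, baseline, enhanced)
print(f"gain = {point:+.3f}, 95% CI [{lo:+.3f}, {hi:+.3f}]")
```

An interval that excludes zero, like the reported 0.003 to 0.074, suggests a gain that is unlikely to come from resampling noise alone, whereas an interval that spans zero, like -2.9% to 7.7%, does not support a gain.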

Topics

Artificial Intelligence in Healthcare and Education · AI in Service Interactions · Topic Modeling