This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Embedded Hallucination Detection Widgets as UI-Level Model Health Indicators in Web-Based LLM Applications
Citations: 0
Authors: 1
Year: 2025
Abstract
Large Language Models (LLMs) deployed in web-based applications are increasingly susceptible to generating hallucinated content: outputs that are fluent yet factually incorrect, unsupported, or fabricated. While significant research has focused on backend hallucination detection pipelines, comparatively little attention has been devoted to surfacing model reliability signals directly within the user interface (UI) layer. This paper introduces a lightweight, embeddable hallucination detection widget framework designed as a real-time, UI-level model health indicator for web-based LLM applications. The proposed system integrates a multi-signal hallucination detection pipeline combining semantic entropy estimation, cross-referential consistency verification, and token-level uncertainty quantification into a modular front-end widget that provides end-users with interpretable, actionable confidence indicators alongside LLM-generated responses. We evaluate the framework across five production-grade LLM backends (GPT-4o, GPT-3.5-Turbo, LLaMA-3-70B, Mistral-Large, and Claude-3-Sonnet) using a curated benchmark of 12,400 query-response pairs spanning four high-stakes domains: biomedical question answering, legal document summarization, financial report generation, and educational content synthesis. Our results demonstrate that the widget achieves a hallucination detection F1-score of 0.891 (±0.017), introduces a median latency overhead of only 145 ms per response, and significantly improves end-user trust calibration by 34.7% as measured through a controlled user study (N = 186). Furthermore, we show that the widget's visual affordances reduce user over-reliance on hallucinated content by 41.2% compared to unaugmented interfaces. This work contributes a novel paradigm for treating hallucination detection not merely as a backend audit mechanism but as a first-class UI component integral to responsible AI deployment. DOI: https://doi.org/10.17762/ijisae.v13i2s.8147
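The abstract describes a front-end widget that fuses three backend signals (semantic entropy, cross-referential consistency, and token-level uncertainty) into a confidence indicator shown next to each LLM response. The TypeScript sketch below illustrates one plausible way such a multi-signal score could be combined and rendered at the UI layer; the interface, weights, thresholds, and function names are assumptions made for illustration and are not taken from the paper.

```typescript
// Hypothetical sketch of a UI-level "model health" badge. All identifiers
// (HallucinationSignals, hallucinationRisk, renderHealthBadge) and all numeric
// weights/thresholds are illustrative assumptions, not the paper's actual API.

interface HallucinationSignals {
  semanticEntropy: number;      // entropy over clusters of sampled answers, normalized to [0, 1]
  consistencyScore: number;     // agreement with cross-referenced sources, in [0, 1]
  meanTokenUncertainty: number; // e.g. 1 minus the mean token probability, in [0, 1]
}

type HealthLevel = "reliable" | "uncertain" | "likely hallucinated";

// Weighted combination of the three signals into a single risk score in [0, 1].
// A real deployment would calibrate these weights against labeled query-response pairs.
function hallucinationRisk(s: HallucinationSignals): number {
  const w = { entropy: 0.4, consistency: 0.35, uncertainty: 0.25 };
  return (
    w.entropy * s.semanticEntropy +
    w.consistency * (1 - s.consistencyScore) +
    w.uncertainty * s.meanTokenUncertainty
  );
}

// Placeholder thresholds mapping risk to a coarse health level.
function classifyHealth(risk: number): HealthLevel {
  if (risk < 0.3) return "reliable";
  if (risk < 0.6) return "uncertain";
  return "likely hallucinated";
}

// Minimal DOM integration: append a colored badge to an LLM response element.
function renderHealthBadge(responseEl: HTMLElement, signals: HallucinationSignals): void {
  const risk = hallucinationRisk(signals);
  const level = classifyHealth(risk);
  const badge = document.createElement("span");
  badge.textContent = `model health: ${level} (${(risk * 100).toFixed(0)}% risk)`;
  badge.style.marginLeft = "0.5em";
  badge.style.padding = "0.1em 0.4em";
  badge.style.borderRadius = "0.5em";
  badge.style.color = "#fff";
  badge.style.backgroundColor =
    level === "reliable" ? "#2e7d32" : level === "uncertain" ? "#f9a825" : "#c62828";
  responseEl.appendChild(badge);
}
```

Under this reading, the widget stays a thin presentation layer: the detection signals are computed server-side, and the front end only normalizes, combines, and visualizes them, which is consistent with the reported low per-response latency overhead.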
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,490 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,376 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,832 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,553 citations