This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Deploying and evaluating a conversational agent using LLMs for academic library reference
Citations: 1
Authors: 6
Year: 2026
Abstract
Purpose: This study has two aims. First, we sought to implement a RAG-based GenAI system capable of answering reference questions. Second, we aimed to develop an evaluation protocol for assessing the chatbot by comparing implementations that use three different LLMs. An evaluation rubric was piloted to gauge its viability as an assessment tool.
Design/methodology/approach: The RAG-based chatbot uses a two-step approach. First, in response to a query, the system retrieves relevant documents from a knowledge base. Each document is vectorized and matched by relevance. Second, the retrieved data is combined with an LLM's generative capabilities to produce a context-aware response. Fourteen common questions representing different areas of the knowledge base were tested with the chatbot versions. The research team developed and then used an evaluation rubric to score the chatbots' responses on five dimensions: accuracy, groundedness, elicitation, completeness and further assistance. The rubric itself was also evaluated by calculating the standard deviation among reviewers' scores.
Findings: The RAG implementations were largely successful in restricting the chatbot's responses to the knowledge base. The evaluation rubric was effective for assessing the models, highlighting the strengths and weaknesses of each. Although the evaluation was subjective, the evaluators gave similar scores, with the greatest variation in the elicitation dimension.
Originality/value: This study offers a technical description of a practical way to implement a RAG-based chatbot in a library setting, as well as a protocol for evaluating such chatbots along multiple dimensions, which has not been discussed in previous literature.
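The two-step approach described in the abstract (retrieve relevant documents, then combine them with the query for the LLM) can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration: the abstract does not say which embedding model, similarity measure, prompt wording or knowledge base the authors used. The sketch substitutes a toy bag-of-words vector and cosine similarity for a real embedding, and it stops at building the prompt rather than calling any LLM. The names `vectorize`, `retrieve`, `build_prompt` and `KNOWLEDGE_BASE` are hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Toy bag-of-words vector (an assumed stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, knowledge_base, top_k=2):
    """Step 1: rank documents by similarity to the query and keep the best matches."""
    q = vectorize(query)
    scored = [(cosine(q, vectorize(doc)), doc) for doc in knowledge_base]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents with no overlap so unrelated text never reaches the prompt.
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(query, passages):
    """Step 2 (input side): combine retrieved passages with the query for an LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


# Hypothetical knowledge base entries, invented for this example.
KNOWLEDGE_BASE = [
    "The library is open from 8 am to 10 pm on weekdays.",
    "Interlibrary loan requests are submitted through the online form.",
    "Study rooms can be booked for up to two hours per day.",
]

query = "How do I book a study room?"
passages = retrieve(query, KNOWLEDGE_BASE, top_k=1)
prompt = build_prompt(query, passages)
print(prompt)
```

In this sketch the instruction to answer only from the supplied context is what restricts responses to the knowledge base. A production system would presumably replace `vectorize` with a learned embedding and send `prompt` to the chosen LLM.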
Similar works
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
An Experiment in Linguistic Synthesis with a Fuzzy Logic Controller
1999 · 5,632 citations
An experiment in linguistic synthesis with a fuzzy logic controller
1975 · 5,572 citations
A FRAMEWORK FOR REPRESENTING KNOWLEDGE
1988 · 4,551 citations
Opinion Paper: “So what if ChatGPT wrote it?” Multidisciplinary perspectives on opportunities, challenges and implications of generative conversational AI for research, practice and policy
2023 · 3,403 citations