This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
LLRM: A Robust Approach for Defending Large Language Models Against Prompt Injection Attacks
Citations: 0
Authors: 3
Year: 2026
Abstract
Recently, Large Language Models (LLMs) have inspired a lively field of applications due to their remarkable adeptness at understanding and generating natural language. However, while leveraging the opportunities created by their broad appeal across various services, it is crucial to address the new and still poorly understood threats in this domain. According to the OWASP Foundation, Prompt Injection attacks rank as the foremost risk for large language models. Despite recent studies in this field, no comprehensive approach has yet been established that fully harnesses the potential of LLMs while mitigating their vulnerabilities. In this study, we analyze the mechanics and consequences of Prompt Injection attacks against both standalone LLMs and LLM-integrated applications. Furthermore, we propose an approach to enhance the security of LLM-based chatbots by monitoring threats and predicting attacks. The proposed approach demonstrated robust performance in two implementations: the first achieved an overall accuracy of 91%, and the second an accuracy of 99%. These results highlight the effectiveness of the proposed method in accurately detecting and mitigating threats even under challenging conditions. We hope this research will inspire further work toward stronger defenses against critical threats such as Prompt Injection attacks, paving the way for a more secure era of large language models.
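To illustrate the general idea of monitoring chatbot inputs for injection attempts, the sketch below shows a minimal pattern-based input filter placed in front of an LLM. This is a toy heuristic for illustration only, not the LLRM model proposed in the paper; the patterns, function names, and blocking behavior are all assumptions chosen for the example.

```python
import re

# Illustrative heuristic only -- NOT the paper's proposed model.
# The phrase list and the blocking policy are assumptions for demonstration.
INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior|above) instructions",
    r"disregard (the )?system prompt",
    r"reveal (your )?(system prompt|hidden instructions)",
]

def is_prompt_injection(user_input: str) -> bool:
    """Flag input matching a known injection phrase (case-insensitive)."""
    text = user_input.lower()
    return any(re.search(pattern, text) for pattern in INJECTION_PATTERNS)

def guarded_chat(user_input: str) -> str:
    """Block flagged inputs before they reach the LLM; pass the rest through."""
    if is_prompt_injection(user_input):
        return "Request blocked: possible prompt injection detected."
    return f"Forwarding to LLM: {user_input}"
```

A fixed phrase list like this is easy to evade with paraphrasing, which is why learned classifiers of the kind evaluated in the paper (reporting 91% and 99% accuracy) are used in practice instead of static rules.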
Related Works
Rethinking the Inception Architecture for Computer Vision
2016 · 30,699 citations
MobileNetV2: Inverted Residuals and Linear Bottlenecks
2018 · 24,991 citations
CBAM: Convolutional Block Attention Module
2018 · 21,814 citations
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
2020 · 21,500 citations
Xception: Deep Learning with Depthwise Separable Convolutions
2017 · 18,707 citations