Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
Quality Assessment of Generative AI in Cybersecurity Certification
0
Zitationen
10
Autoren
2026
Jahr
Abstract
Generative Artificial Intelligence (GenAI), particularly Large Language Models (LLMs), is rapidly changing how higher education approaches teaching, learning, and assessment. In cybersecurity education, professional certification exams are key for measuring competence and helping professionals find better job offers, but there is little research on how GenAI systems perform in these exam settings. This study looks at how three popular LLMs, ChatGPT-5, Gemini-2.5 Pro, and Copilot-2.5 Pro, handle 183 practice questions from the CompTIA Security+ certification. The study used a two-phase evaluation: a domain-based assessment and a full-length practice exam that mirrors real certification tests. The researchers measured model performance with accuracy scores, chi-square tests for statistical differences, and an error taxonomy to spot patterns of mistakes important for education. All three GenAI systems scored above the passing mark, and there were no significant differences between them. Still, the error analysis showed ongoing conceptual and classification mistakes that did not show up in the overall accuracy scores. Our results show that GenAI systems can pass structured certification tests, but accuracy by itself does not fully measure professional skills. The study points out important issues for the reliability and validity of AI-based assessments in higher education and stresses the need for more realistic, concept-focused ways to evaluate GenAI in cybersecurity education.
Ähnliche Arbeiten
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8.402 Zit.
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8.270 Zit.
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7.702 Zit.
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5.781 Zit.
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5.507 Zit.