This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Thesis Title Similarity Detection System Using Levenshtein Distance and Cosine Similarity
Citations: 0
Authors: 2
Year: 2025
Abstract
Manual verification of thesis titles in higher education institutions is often time-consuming and prone to oversight, making it difficult to ensure the uniqueness of each student’s work. This poses serious academic risks: undetected similarities in thesis titles can lead to unintended plagiarism, compromise academic integrity, and undermine the credibility of educational institutions. More broadly, repeated or overlapping research topics reflect a lack of innovation and weaken the scientific contribution of academic programs. To address this issue, an automated detection system is needed to efficiently identify similarities between thesis titles. This study develops a web-based thesis title similarity detection system that integrates the Levenshtein Distance and Cosine Similarity algorithms. The system was developed using the Waterfall model, with stages of requirements analysis, design, implementation, and evaluation. Functional features such as login, title data management, old spelling normalization, and real-time similarity detection were implemented. The results show that combining the two algorithms effectively detects similarity at both the character level and the semantic level. The old spelling normalization feature significantly improves detection accuracy by aligning historical and modern word forms prior to analysis. In conclusion, the developed system supports a faster and more objective title verification process, contributes to the prevention of academic plagiarism, and promotes integrity in higher education environments.
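The pipeline the abstract describes (normalize old spellings, then score titles with Levenshtein Distance and Cosine Similarity) can be illustrated with a minimal Python sketch. Everything specific here is an assumption rather than the authors' implementation: the spelling rules in `OLD_SPELLING_RULES` are hypothetical placeholders, cosine similarity is computed over simple word-count vectors (which captures word overlap rather than deeper semantics), and the function names are illustrative only.

```python
import math
import re
from collections import Counter

# Hypothetical old-to-modern spelling rules. The abstract does not list the
# actual rules, so these pairs are assumptions for illustration only.
OLD_SPELLING_RULES = [("oe", "u"), ("tj", "c"), ("dj", "j")]


def normalize(title: str) -> str:
    """Lowercase, apply the assumed spelling rules, and strip punctuation."""
    text = title.lower()
    # Naive substring replacement; a real system would need word-aware rules
    # so that modern words containing these letter pairs are not altered.
    for old, new in OLD_SPELLING_RULES:
        text = text.replace(old, new)
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return " ".join(text.split())


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via two-row dynamic programming."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Map edit distance to a 0..1 score (1.0 means identical strings)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between word-count vectors of two titles."""
    va, vb = Counter(a.split()), Counter(b.split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def compare_titles(new_title: str, existing_title: str) -> dict:
    """Normalize both titles, then report both similarity scores."""
    a, b = normalize(new_title), normalize(existing_title)
    return {
        "levenshtein": levenshtein_similarity(a, b),
        "cosine": cosine_similarity(a, b),
    }
```

Calling `compare_titles("Soerat Tjinta", "Surat Cinta")` normalizes both titles to the same string before scoring, which is the effect the abstract attributes to the normalization step. How the two scores are combined or thresholded is not stated in the abstract, so the sketch reports them separately.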