OpenAlex · Updated hourly · Last updated: 23 May 2026, 20:57

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

A Large-Scale Comprehensive Measurement of AI-Generated Code in Real-World Repositories

2026 · 0 citations · arXiv (Cornell University) · Open Access

Citations: 0

Authors: 5

Year: 2026

Abstract

Large language models (LLMs) are rapidly transforming software engineering by enabling developers to generate code ranging from small snippets to entire projects. As AI-generated code becomes increasingly integrated into real-world systems, understanding its characteristics and impact is critical. However, prior work primarily focuses on small-scale, controlled evaluations and lacks comprehensive analysis in real-world settings. In this paper, we present a large-scale empirical study of AI-generated code in real-world repositories. We analyze both code-level metrics (e.g., complexity, structure, and defect-related indicators) and commit-level characteristics (e.g., commit size, frequency, and post-commit stability). To enable this study, we develop a heuristic filter combined with LLM classification to identify AI-generated code and construct a large dataset. Our results provide new insights into how AI-generated code differs from human-written code and how AI assistance influences development practices. These findings contribute to a deeper understanding of the practical implications of AI-assisted programming.

Topics

Software Engineering Research · Software Engineering Techniques and Practices · Artificial Intelligence in Healthcare and Education