ExecRepoBench: Multi-level Executable Code Completion Evaluation

Yang, Jian; Zhang, Jiajun; Yang, Jiaxi; Jin, Ke; Zhang, Lei; Peng, Qiyao; Deng, Ken; Miao, Yibo; Liu, Tianyu; Cui, Zeyu; Hui, Binyuan; Lin, Junyang

Abstract: Code completion has become an essential tool for daily software development. Existing evaluation benchmarks often employ static methods that do not fully capture the dynamic nature of real-world coding environments and face significant challenges, including limited context length, reliance on superficial evaluation metrics, and potential overfitting to training datasets. In this work, we introduce a framework for enhancing code completion in software development through the creation of a repository-level benchmark, ExecRepoBench, and the instruction corpora Repo-Instruct, aimed at improving the functionality of open-source large language models (LLMs) in real-world coding scenarios that involve complex interdependencies across multiple files. ExecRepoBench includes 1.2K samples from active Python repositories. In addition, we present a multi-level grammar-based completion methodology conditioned on the abstract syntax tree to mask code fragments at various logical units (e.g., statements, expressions, and functions). We then fine-tune an open-source LLM with 7B parameters on Repo-Instruct to produce a strong code completion baseline model, Qwen2.5-Coder-Instruct-C. Qwen2.5-Coder-Instruct-C is evaluated on existing benchmarks, including MultiPL-E and ExecRepoBench, where it consistently outperforms prior baselines across all programming languages. Qwen2.5-Coder-Instruct-C can be deployed as a high-performance, local service for programming development.
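The multi-level masking idea in the abstract can be illustrated with a minimal sketch built on Python's standard `ast` module. This is not the authors' pipeline: the helper names `candidate_spans` and `mask_span`, the `<MASK>` placeholder, the `LEVELS` mapping, and the toy `SOURCE` snippet are all assumptions made for illustration. The sketch only shows how AST node boundaries can define completion targets at the expression, statement, and function level.

```python
import ast

# Hypothetical mapping from masking level to AST node types (illustrative only).
LEVELS = {
    "expression": (ast.BinOp,),
    "statement": (ast.Return,),
    "function": (ast.FunctionDef,),
}

# Toy input; assumed to be ASCII so that AST byte offsets equal character offsets.
SOURCE = "def area(w, h):\n    result = w * h\n    return result\n"


def candidate_spans(source, node_types):
    """Return sorted (start, end) character offsets of AST nodes of the given types."""
    # Offset of the first character of each line in the source string.
    line_starts = [0]
    for line in source.splitlines(keepends=True):
        line_starts.append(line_starts[-1] + len(line))

    spans = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, node_types):
            start = line_starts[node.lineno - 1] + node.col_offset
            end = line_starts[node.end_lineno - 1] + node.end_col_offset
            spans.append((start, end))
    return sorted(spans)


def mask_span(source, span, placeholder="<MASK>"):
    """Replace one span with a placeholder; return (masked_source, target_text)."""
    start, end = span
    return source[:start] + placeholder + source[end:], source[start:end]


if __name__ == "__main__":
    for level, node_types in LEVELS.items():
        span = candidate_spans(SOURCE, node_types)[0]
        masked, target = mask_span(SOURCE, span)
        print(f"[{level}] target = {target!r}")
        print(masked)
```

Each masked source paired with its target forms one completion sample. An execution-based benchmark would then substitute a model's completion for the placeholder and run the repository's tests, a step that this sketch leaves out.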

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2412.11990 [cs.CL]
	(or arXiv:2412.11990v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2412.11990
