RTL-Repo: A Benchmark for Evaluating LLMs on Large-Scale RTL Design Projects

Allam, Ahmed; Shalan, Mohamed

arXiv:2405.17378 (cs)

[Submitted on 27 May 2024]

Abstract: Large Language Models (LLMs) have demonstrated potential in assisting with Register Transfer Level (RTL) design tasks. Nevertheless, there remains a significant gap in benchmarks that accurately reflect the complexity of real-world RTL projects. To address this, this paper presents RTL-Repo, a benchmark specifically designed to evaluate LLMs on large-scale RTL design projects. RTL-Repo includes a comprehensive dataset of more than 4000 Verilog code samples extracted from public GitHub repositories, with each sample providing the full context of the corresponding repository. We evaluate several state-of-the-art models on the RTL-Repo benchmark, including GPT-4, GPT-3.5, and Starcoder2, alongside Verilog-specific models like VeriGen and RTLCoder, and compare their performance in generating Verilog code for complex projects. The RTL-Repo benchmark provides a valuable resource for the hardware design community to assess and compare LLMs' performance in real-world RTL design scenarios and to train LLMs specifically for Verilog code generation in complex, multi-file RTL projects. RTL-Repo is open-source and publicly available on GitHub.

Subjects:	Machine Learning (cs.LG); Hardware Architecture (cs.AR)
Cite as:	arXiv:2405.17378 [cs.LG]
	(or arXiv:2405.17378v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2405.17378

Submission history

From: Ahmed Allam
[v1] Mon, 27 May 2024 17:36:01 UTC (81 KB)
