Understanding Defects in Generated Codes by Language Models

Esfahani, Ali Mohammadi; Kahani, Nafiseh; Ajila, Samuel A.

arXiv:2408.13372 (cs)

[Submitted on 23 Aug 2024]

Abstract: This study investigates the reliability of code generation by Large Language Models (LLMs), focusing on identifying and analyzing defects in the generated code. Despite the advanced capabilities of LLMs in automating code generation, ensuring the accuracy and functionality of the output remains a significant challenge. Using a structured defect classification method to understand the nature and origins of these defects, this study categorizes and analyzes 367 defects identified in code snippets generated by LLMs, a significant proportion of which are functionality and algorithm errors. These error categories indicate key areas where LLMs frequently fail, underscoring the need for targeted improvements. To enhance the accuracy of code generation, this paper implemented five prompt engineering techniques: Scratchpad Prompting, Program of Thoughts Prompting, Chain-of-Thought Prompting, Chain of Code Prompting, and Structured Chain-of-Thought Prompting. These techniques were applied to refine the input prompts, aiming to reduce ambiguities and improve the models' accuracy rate. The research findings suggest that precise and structured prompting significantly mitigates common defects, thereby increasing the reliability of LLM-generated code.

Subjects: Software Engineering (cs.SE); Artificial Intelligence (cs.AI)
Cite as: arXiv:2408.13372 [cs.SE] (or arXiv:2408.13372v1 [cs.SE] for this version)
DOI: https://doi.org/10.48550/arXiv.2408.13372

Submission history

From: Ali Mohammadi Esfahani
[v1] Fri, 23 Aug 2024 21:10:09 UTC (97 KB)
