Large Language Models (LLMs) for Source Code Analysis: applications, models and datasets

Jelodar, Hamed; Meymani, Mohammad; Razavi-Far, Roozbeh

arXiv:2503.17502 (cs)

[Submitted on 21 Mar 2025]

Abstract: Large language models (LLMs) and transformer-based architectures are increasingly used for source code analysis. As software systems grow in complexity, integrating LLMs into code analysis workflows becomes essential for improving efficiency, accuracy, and automation. This paper explores the role of LLMs in different code analysis tasks, focusing on three key aspects: 1) what they can analyze and their applications, 2) which models are used, and 3) which datasets are used and the challenges they face. To this end, we investigate scholarly articles that explore the use of LLMs for source code analysis to uncover research developments, current trends, and the intellectual structure of this emerging field. Additionally, we summarize limitations and highlight essential tools, datasets, and key challenges, which could be valuable for future work.

Subjects:	Software Engineering (cs.SE); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2503.17502 [cs.SE]
	(or arXiv:2503.17502v1 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.2503.17502

Submission history

From: Hamed Jelodar
[v1] Fri, 21 Mar 2025 19:29:50 UTC (5,929 KB)
