A deep language model for software code

Dam, Hoa Khanh; Tran, Truyen; Pham, Trang

Computer Science > Software Engineering

arXiv:1608.02715 (cs)

[Submitted on 9 Aug 2016]

Title: A deep language model for software code

Authors: Hoa Khanh Dam, Truyen Tran, Trang Pham

Abstract: Existing language models for software code, such as n-grams, often fail to capture a long context in which dependent code elements are scattered far apart. In this paper, we propose a novel approach to building a language model for software code that addresses this particular issue. Our language model, partly inspired by human memory, is built upon the deep learning-based Long Short-Term Memory (LSTM) architecture, which is capable of learning the long-term dependencies that occur frequently in software code. Results from our intrinsic evaluation on a corpus of Java projects demonstrate the effectiveness of our language model. This work contributes to realizing our vision for DeepSoft, an end-to-end, generic deep learning-based framework for modeling software and its development process.
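
The long-context limitation of n-grams mentioned in the abstract can be illustrated with a small sketch. This is not the authors' model or data: the token stream, the `train_ngram` and `predict` helpers, and the choice of n = 3 are all assumptions made only for illustration. The sketch shows that a count-based trigram model conditions on just the previous two tokens, so it cannot use an identifier that appeared many tokens earlier.

```python
from collections import Counter, defaultdict


def train_ngram(tokens, n):
    """Count next-token frequencies for each (n-1)-token context (n >= 2)."""
    counts = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        counts[context][tokens[i + n - 1]] += 1
    return counts


def predict(counts, history, n):
    """Return next-token counts given only the last n-1 tokens of history."""
    context = tuple(history[-(n - 1):])
    return counts.get(context, Counter())


# Hypothetical toy token stream: a resource is opened, then closed much later.
corpus = ("open ( f ) ; x = 1 ; y = 2 ; close ( f ) ; "
          "open ( g ) ; x = 1 ; y = 2 ; close ( g ) ;").split()

model = train_ngram(corpus, n=3)

# The correct continuation depends on "g", which lies far outside the window.
history = "open ( g ) ; x = 1 ; y = 2 ; close (".split()
candidates = predict(model, history, n=3)
print(dict(candidates))
```

Both `f` and `g` were seen once after the context `close (`, so the counts alone give the trigram model no reason to prefer `g`, even though the matching `open ( g )` is in the history. A recurrent model such as an LSTM carries a state across the whole sequence and so can, in principle, retain that earlier identifier.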

Subjects:	Software Engineering (cs.SE); Machine Learning (stat.ML)
Cite as:	arXiv:1608.02715 [cs.SE]
	(or arXiv:1608.02715v1 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.1608.02715

Submission history

From: Truyen Tran
[v1] Tue, 9 Aug 2016 08:16:42 UTC (136 KB)
