Convolutional Neural Networks over Tree Structures for Programming Language Processing

Mou, Lili; Li, Ge; Zhang, Lu; Wang, Tao; Jin, Zhi

arXiv:1409.5718 (cs)

[Submitted on 18 Sep 2014 (v1), last revised 8 Dec 2015 (this version, v2)]

Abstract: Programming language processing (similar to natural language processing) is a hot research topic in the field of software engineering; it has also aroused growing interest in the artificial intelligence community. However, unlike a natural language sentence, a program contains rich, explicit, and complicated structural information. Hence, traditional NLP models may be inappropriate for programs. In this paper, we propose a novel tree-based convolutional neural network (TBCNN) for programming language processing, in which a convolution kernel is designed over programs' abstract syntax trees to capture structural information. TBCNN is a generic architecture for programming language processing; our experiments show its effectiveness in two different program analysis tasks: classifying programs according to functionality, and detecting code snippets of certain patterns. TBCNN outperforms baseline methods, including several neural models for NLP.
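
The idea of sliding a convolution kernel over an abstract syntax tree can be illustrated with a small sketch. The code below is not the authors' implementation and is not taken from the abstract: it parses Python source with the standard `ast` module, applies one kernel to every window made of a node and its direct children, and max-pools the results into a fixed-size vector. The names `embed`, `convolve_window`, and `tree_conv_features`, the sizes `DIM` and `FILTERS`, the random untrained weights, and the left/right position weighting are all assumptions made for illustration.

```python
import ast
import math
import random

DIM = 4      # size of each node-type vector (hypothetical)
FILTERS = 3  # number of convolution filters (hypothetical)


def embed(node_type, dim=DIM):
    """Deterministic pseudo-random vector per AST node type.

    Stand-in for learned embeddings; seeded by the type name.
    """
    rng = random.Random(node_type)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def make_matrix(rows, cols, seed):
    """Fixed random weight matrix (untrained, for illustration only)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


W_TOP = make_matrix(FILTERS, DIM, "top")      # weights for the window's root
W_LEFT = make_matrix(FILTERS, DIM, "left")    # weights favouring leftmost children
W_RIGHT = make_matrix(FILTERS, DIM, "right")  # weights favouring rightmost children
BIAS = [0.0] * FILTERS


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def convolve_window(node):
    """Apply the kernel to one window: a node plus its direct children."""
    out = matvec(W_TOP, embed(type(node).__name__))
    children = list(ast.iter_child_nodes(node))
    n = len(children)
    for i, child in enumerate(children):
        # Assumed weighting: blend left/right matrices by child position,
        # so the kernel handles any number of children.
        right = 0.5 if n == 1 else i / (n - 1)
        left = 1.0 - right
        x = embed(type(child).__name__)
        l_part = matvec(W_LEFT, x)
        r_part = matvec(W_RIGHT, x)
        out = [o + left * l + right * r for o, l, r in zip(out, l_part, r_part)]
    return [math.tanh(o + b) for o, b in zip(out, BIAS)]


def tree_conv_features(source):
    """Convolve over every node of the AST, then max-pool per filter."""
    tree = ast.parse(source)
    feature_maps = [convolve_window(node) for node in ast.walk(tree)]
    return [max(column) for column in zip(*feature_maps)]


if __name__ == "__main__":
    print(tree_conv_features("def f(x):\n    return x + 1\n"))
```

Because this sketch looks only at node types, two programs that differ in identifier names or constants but share the same tree shape produce the same feature vector. A trained model would learn the embeddings and weights and feed the pooled vector to a classifier.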

Comments:	Accepted at AAAI-16
Subjects:	Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Software Engineering (cs.SE)
Cite as:	arXiv:1409.5718 [cs.LG]
	(or arXiv:1409.5718v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1409.5718

Submission history

From: Lili Mou
[v1] Thu, 18 Sep 2014 06:50:52 UTC (220 KB)
[v2] Tue, 8 Dec 2015 12:31:51 UTC (310 KB)
