Devign: Effective Vulnerability Identification by Learning Comprehensive Program Semantics via Graph Neural Networks

Zhou, Yaqin; Liu, Shangqing; Siow, Jingkai; Du, Xiaoning; Liu, Yang

arXiv:1909.03496 (cs)

[Submitted on 8 Sep 2019]

Abstract: Vulnerability identification is crucial for protecting software systems from cyber attacks. It is especially important to localize the vulnerable functions in the source code to facilitate the fix. However, this is a challenging and tedious process that requires specialized security expertise. Inspired by work on manually defined vulnerability patterns over various code representation graphs and by recent advances in graph neural networks, we propose Devign, a general graph neural network based model for graph-level classification that learns from a rich set of code semantic representations. It includes a novel Conv module that efficiently extracts useful features from the learned node representations for graph-level classification. The model is trained on manually labeled datasets built from 4 diverse, large-scale open-source C projects, which capture the complexity and variety of real source code rather than the synthetic code used in previous work. Extensive evaluation on these datasets shows that Devign significantly outperforms the state of the art, with on average 10.51% higher accuracy and 8.68% higher F1 score; the Conv module accounts for an average increase of 4.66% in accuracy and 6.37% in F1.
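
To make the phrase "graph-level classification" concrete, the following is a minimal sketch of the general idea only: node vectors of a code graph are refined by exchanging information along edges, then pooled into a single score for the whole function. It is not the Devign architecture. The mean-neighbour update, the max-pooling readout, the fixed `weights`, and the function names `propagate` and `classify_graph` are all assumptions chosen for illustration; the abstract does not specify the model's update rule or the internals of the Conv module.

```python
import math


def propagate(features, edges, steps=2):
    """Mix each node's vector with the mean of its neighbours' vectors.

    Hypothetical stand-in for a learned message-passing step.
    """
    n = len(features)
    neighbours = [[] for _ in range(n)]
    for src, dst in edges:
        neighbours[src].append(dst)
        neighbours[dst].append(src)  # edges treated as undirected for simplicity

    state = [list(vec) for vec in features]
    for _ in range(steps):
        new_state = []
        for i in range(n):
            if not neighbours[i]:
                new_state.append(list(state[i]))  # isolated node keeps its vector
                continue
            dim = len(state[i])
            mean = [
                sum(state[j][d] for j in neighbours[i]) / len(neighbours[i])
                for d in range(dim)
            ]
            new_state.append([0.5 * a + 0.5 * b for a, b in zip(state[i], mean)])
        state = new_state
    return state


def classify_graph(features, edges, weights, bias=0.0):
    """Return a score in (0, 1) for the whole graph (assumed readout)."""
    state = propagate(features, edges)
    # Max-pool each feature dimension over all nodes, then apply a linear layer.
    pooled = [max(node[d] for node in state) for d in range(len(weights))]
    logit = sum(w * x for w, x in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Toy "function graph": 3 nodes with made-up 2-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
edges = [(0, 1), (1, 2)]
score = classify_graph(features, edges, weights=[1.0, -1.0])
print(f"graph-level score: {score:.3f}")
```

In a trained model the update and readout would be learned from labeled functions; here the weights are fixed by hand, so the printed score carries no meaning beyond showing that one graph maps to one prediction.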

Comments:	accepted by NeurIPS 2019
Subjects:	Software Engineering (cs.SE); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1909.03496 [cs.SE]
	(or arXiv:1909.03496v1 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.1909.03496

Submission history

From: Yaqin Zhou
[v1] Sun, 8 Sep 2019 16:14:31 UTC (326 KB)
