RNNS: Representation Nearest Neighbor Search Black-Box Attack on Code Models

Zhang, Jie; Ma, Wei; Hu, Qiang; Xie, Xiaofei; Traon, Yves Le; Liu, Yang

arXiv:2305.05896v1 (cs)

[Submitted on 10 May 2023 (this version), latest version 18 Oct 2023 (v3)]

Abstract: Pre-trained code models are mainly evaluated on in-distribution test data. Their robustness, i.e., the ability to handle hard unseen data, still lacks evaluation. In this paper, we propose a novel search-based black-box adversarial attack guided by model behaviours for pre-trained programming language (PL) models, named Representation Nearest Neighbor Search (RNNS), to evaluate the robustness of pre-trained PL models. Unlike other black-box adversarial attacks, RNNS uses the model-change signal to guide the search in the space of variable names collected from real-world projects. Specifically, RNNS contains two main steps: 1) identify which variable (the attack position) to attack based on model uncertainty, and 2) search for the adversarial tokens to use for variable renaming according to observations of the model's behaviour. We evaluate RNNS on 6 code tasks (e.g., clone detection), 3 programming languages (Java, Python, and C), and 3 pre-trained code models: CodeBERT, GraphCodeBERT, and CodeT5. The results demonstrate that RNNS outperforms the state-of-the-art black-box attack methods (MHM and ALERT) in terms of attack success rate (ASR) and query times (QT). The perturbation of the adversarial examples generated by RNNS is smaller than that of the baselines with respect to the number of replaced variables and the change in variable length. Our experiments also show that RNNS is efficient in attacking defended models and is useful for adversarial training.
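The two steps named in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the abstract does not give the exact uncertainty measure, search rule, or embedding, so everything below is an assumption made for illustration. In particular, `toy_predict_proba` and `toy_embed` are hypothetical stand-ins for a black-box code model and a variable-name representation, entropy after masking is one possible uncertainty score, and the greedy re-centering rule is one simple way to let the observed change in model output guide a nearest-neighbour search.

```python
import math
import re


def entropy(probs):
    """Shannon entropy of a probability vector (assumed uncertainty score)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def rename(code, old, new):
    """Rename an identifier using whole-word matching."""
    return re.sub(rf"\b{re.escape(old)}\b", new, code)


def pick_variable(code, variables, predict_proba):
    """Step 1 (assumed form): choose the variable whose masking leaves the
    model most uncertain. Ties go to the first variable listed."""
    def uncertainty(var):
        return entropy(predict_proba(rename(code, var, "unk")))
    return max(variables, key=uncertainty)


def search_substitute(code, var, candidates, predict_proba, embed, label,
                      k=2, max_queries=20):
    """Step 2 (assumed form): greedy nearest-neighbour search over candidate
    names, re-centred on any name that lowers the probability of `label`."""
    best_code = code
    best_prob = predict_proba(code)[label]
    queries = 1
    center = embed(var)
    remaining = set(candidates)
    while remaining and queries < max_queries:
        # The k untried names closest to the current search centre.
        neighbours = sorted(
            remaining, key=lambda n: (math.dist(embed(n), center), n))[:k]
        improved = None
        for name in neighbours:
            if queries >= max_queries:
                break
            remaining.discard(name)
            trial = rename(code, var, name)
            probs = predict_proba(trial)
            queries += 1
            if probs.index(max(probs)) != label:
                return trial, queries, True  # prediction flipped
            if probs[label] < best_prob:
                best_code, best_prob, improved = trial, probs[label], name
        if improved is not None:
            center = embed(improved)  # follow the model-change signal
    return best_code, queries, False


def toy_predict_proba(code):
    """Hypothetical black-box model: class 1 grows likelier with more 'x'."""
    p1 = 1 / (1 + math.exp(3 - code.count("x")))
    return [1 - p1, p1]


def toy_embed(name):
    """Hypothetical 2-D representation of a variable name."""
    return (float(len(name)), float(name.count("x")))


source = "def f(total, n):\n    return total + n"
pool = ["count", "index", "xmax", "xx_idx", "value"]  # made-up name pool

target = pick_variable(source, ["total", "n"], toy_predict_proba)
adv_code, n_queries, success = search_substitute(
    source, target, pool, toy_predict_proba, toy_embed, label=0)
```

In this toy setting, `n_queries` plays the role of the query count for a single sample and `success` records whether the predicted label flipped; averaging these over a test set would give quantities analogous to the QT and ASR metrics mentioned above.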

Subjects:	Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Software Engineering (cs.SE)
Cite as:	arXiv:2305.05896 [cs.CR]
	(or arXiv:2305.05896v1 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2305.05896

Submission history

From: Wei Ma
[v1] Wed, 10 May 2023 04:58:39 UTC (2,492 KB)
[v2] Sat, 20 May 2023 08:47:04 UTC (2,492 KB)
[v3] Wed, 18 Oct 2023 18:01:27 UTC (2,004 KB)
