BBTv2: Towards a Gradient-Free Future with Large Language Models

Sun, Tianxiang; He, Zhengfu; Qian, Hong; Zhou, Yunhua; Huang, Xuanjing; Qiu, Xipeng

arXiv:2205.11200 (cs)

[Submitted on 23 May 2022 (v1), last revised 14 Oct 2022 (this version, v2)]

Abstract: Most downstream adaptation methods tune all or part of the parameters of pre-trained models (PTMs) through gradient descent, where the tuning cost increases linearly with the growth of the model size. By contrast, gradient-free methods only require the forward computation of the PTM to tune the prompt, retaining the benefits of efficient tuning and deployment. However, past work on gradient-free tuning often relies on gradient descent to find a good prompt initialization and lacks versatility across tasks and PTMs. In this paper, we present BBTv2, an improved version of Black-Box Tuning, to drive PTMs for few-shot learning. We prepend continuous prompts to every layer of the PTM and propose a divide-and-conquer gradient-free algorithm to optimize the prompts at different layers alternately. Extensive experiments across various tasks and PTMs show that BBTv2 can achieve performance comparable to full model tuning and state-of-the-art parameter-efficient methods (e.g., Adapter, LoRA, BitFit) under few-shot settings while using far fewer tunable parameters.
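
The alternating, layer-by-layer idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation, and the abstract does not name the optimizer. The sketch below swaps in plain random-perturbation hill climbing as a stand-in for a gradient-free optimizer. The function names (`toy_loss`, `optimize_layerwise`), the quadratic loss, and all hyperparameters are hypothetical and chosen only for illustration. The one property it tries to mirror is that each layer's prompt is updated in turn using only loss evaluations, never gradients.

```python
import random


def toy_loss(prompts):
    """Hypothetical stand-in for the loss from a forward pass of a frozen model.

    Scores every prompt value by its squared distance to a fixed target of 1.0.
    """
    return sum((value - 1.0) ** 2 for layer in prompts for value in layer)


def optimize_layerwise(loss_fn, num_layers=3, dim=4, rounds=5,
                       steps_per_layer=20, sigma=0.3, seed=0):
    """Alternately improve one layer's prompt while the other layers stay fixed."""
    rng = random.Random(seed)
    prompts = [[0.0] * dim for _ in range(num_layers)]
    best = loss_fn(prompts)
    for _ in range(rounds):
        for layer in range(num_layers):  # one subproblem per layer
            for _ in range(steps_per_layer):
                # Perturb only the current layer's prompt.
                candidate = [v + rng.gauss(0.0, sigma) for v in prompts[layer]]
                trial = prompts[:layer] + [candidate] + prompts[layer + 1:]
                loss = loss_fn(trial)  # loss evaluation only, no gradients
                if loss < best:  # keep the perturbation only if it helps
                    prompts, best = trial, loss
    return prompts, best


initial = toy_loss([[0.0] * 4 for _ in range(3)])
prompts, final = optimize_layerwise(toy_loss)
print(f"loss before: {initial:.3f}, after: {final:.3f}")
```

Because a perturbation is accepted only when it lowers the loss, the loss can never increase from one step to the next. In a real setting, `loss_fn` would wrap a forward pass of the PTM on the few-shot training examples.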

Comments:	Accepted to EMNLP 2022 (main conference). Code is available at this https URL
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2205.11200 [cs.CL]
	(or arXiv:2205.11200v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2205.11200

Submission history

From: Tianxiang Sun
[v1] Mon, 23 May 2022 11:10:19 UTC (653 KB)
[v2] Fri, 14 Oct 2022 15:02:05 UTC (809 KB)