Continual Training of Language Models for Few-Shot Learning

Ke, Zixuan; Lin, Haowei; Shao, Yijia; Xu, Hu; Shu, Lei; Liu, Bing

arXiv:2210.05549 (cs)

[Submitted on 11 Oct 2022]

Abstract: Recent work on applying large language models (LMs) has achieved impressive performance in many NLP applications. Adapting or post-training an LM on an unlabeled domain corpus can further improve performance on end-tasks in that domain. This paper proposes the problem of continually extending an LM by incrementally post-training it with a sequence of unlabeled domain corpora, expanding its knowledge without forgetting its previous skills. The goal is to improve few-shot end-task learning in these domains. The resulting system is called CPT (Continual PostTraining), which, to our knowledge, is the first continual post-training system. Experimental results verify its effectiveness.

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Cite as:	arXiv:2210.05549 [cs.CL]
	(or arXiv:2210.05549v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2210.05549
Journal reference:	EMNLP 2022

Submission history

From: Zixuan Ke
[v1] Tue, 11 Oct 2022 15:43:58 UTC (763 KB)
