K for the Price of 1: Parameter-efficient Multi-task and Transfer Learning

Mudrakarta, Pramod Kaushik; Sandler, Mark; Zhmoginov, Andrey; Howard, Andrew

arXiv:1810.10703 (cs)

[Submitted on 25 Oct 2018 (v1), last revised 24 Feb 2019 (this version, v2)]

Abstract: We introduce a novel method that enables parameter-efficient transfer and multi-task learning with deep neural networks. The basic approach is to learn a model patch (a small set of parameters) that specializes to each task, instead of fine-tuning the last layer or the entire network. For instance, we show that learning a set of scales and biases is sufficient to convert a pretrained network to perform well on qualitatively different problems (e.g., converting a Single Shot MultiBox Detection (SSD) model into a 1000-class image classification model while reusing 98% of the parameters of the SSD feature extractor). Similarly, we show that re-learning existing low-parameter layers (such as depth-wise convolutions) while keeping the rest of the network frozen also improves transfer-learning accuracy significantly. Our approach allows both simultaneous (multi-task) and sequential transfer learning. In several multi-task learning problems, despite using far fewer parameters than traditional logits-only fine-tuning, we match single-task performance.
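
The scale-and-bias patch idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the single frozen linear layer, the squared-error loss, and every name below (`make_shared_layer`, `make_patch`, `patch_sgd_step`, `count_params`, the task labels) are assumptions chosen for illustration. It shows only the bookkeeping: shared weights stay frozen and are counted once, while each task trains its own small set of per-channel scales and biases.

```python
import random


def make_shared_layer(n_in, n_out, seed=0):
    """Frozen weights shared by every task (stand-in for a pretrained layer)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def make_patch(n_out):
    """Hypothetical per-task patch: one scale and one bias per output channel."""
    return {"scale": [1.0] * n_out, "bias": [0.0] * n_out}


def pre_activations(shared, x):
    """Output of the frozen layer before the task-specific patch is applied."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in shared]


def forward(shared, patch, x):
    """y_j = scale_j * (W_j . x) + bias_j, where W is never modified."""
    pre = pre_activations(shared, x)
    return [s * p + b for s, p, b in zip(patch["scale"], pre, patch["bias"])]


def squared_error(shared, patch, x, target):
    y = forward(shared, patch, x)
    return sum((yi - ti) ** 2 for yi, ti in zip(y, target))


def patch_sgd_step(shared, patch, x, target, lr=0.001):
    """One gradient step on squared error that updates only the patch."""
    pre = pre_activations(shared, x)
    for j, p in enumerate(pre):
        err = patch["scale"][j] * p + patch["bias"][j] - target[j]
        grad_scale = 2.0 * err * p  # d(err^2) / d(scale_j)
        grad_bias = 2.0 * err       # d(err^2) / d(bias_j)
        patch["scale"][j] -= lr * grad_scale
        patch["bias"][j] -= lr * grad_bias


def count_params(shared, patches):
    """Return (shared parameter count, total patch parameters over all tasks)."""
    shared_n = sum(len(row) for row in shared)
    patch_n = sum(len(p["scale"]) + len(p["bias"]) for p in patches.values())
    return shared_n, patch_n


# Toy usage: two tasks share one frozen layer; only task_a's patch is trained.
shared = make_shared_layer(n_in=16, n_out=4)
frozen_copy = [row[:] for row in shared]
patches = {"task_a": make_patch(4), "task_b": make_patch(4)}
x = [0.5] * 16
target = [1.0, -1.0, 0.5, 0.0]

loss_before = squared_error(shared, patches["task_a"], x, target)
for _ in range(100):
    patch_sgd_step(shared, patches["task_a"], x, target)
loss_after = squared_error(shared, patches["task_a"], x, target)

print("loss before/after:", loss_before, loss_after)
print("shared vs. patch parameters:", count_params(shared, patches))
```

In a real convolutional network the shared part would dominate the parameter count far more strongly than in this toy layer, which is what makes a per-task patch cheap to store.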

Comments: Published at ICLR 2019
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Cite as: arXiv:1810.10703 [cs.LG] (or arXiv:1810.10703v2 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.1810.10703

Submission history

From: Pramod Kaushik Mudrakarta
[v1] Thu, 25 Oct 2018 03:12:37 UTC (937 KB)
[v2] Sun, 24 Feb 2019 02:03:00 UTC (5,262 KB)
