Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation

Klimaszewski, Mateusz; Andruszkiewicz, Piotr; Birch, Alexandra

arXiv:2403.18804 (cs)

[Submitted on 27 Mar 2024]

Abstract: The rise of Modular Deep Learning showcases its potential in various Natural Language Processing applications. Parameter-efficient fine-tuning (PEFT) modularity has been shown to work for various use cases, from domain adaptation to multilingual setups. However, all this work covers the case where the modular components are trained and deployed within a single Pre-trained Language Model (PLM). This model-specific setup is a substantial limitation on the very modularity that modular architectures are trying to achieve. We ask whether current modular approaches are transferable between models and whether we can transfer the modules from more robust and larger PLMs to smaller ones. In this work, we aim to fill this gap through the lens of Knowledge Distillation, commonly used for model compression, and present an extremely straightforward approach to transferring pre-trained, task-specific PEFT modules between same-family PLMs. Moreover, we propose a method that allows the transfer of modules between incompatible PLMs without any change in the inference complexity. The experiments on Named Entity Recognition, Natural Language Inference, and Paraphrase Identification tasks over multiple languages and PEFT methods showcase the initial potential of transferable modularity.

Comments:	Accepted at LREC-COLING 2024
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2403.18804 [cs.CL]
	(or arXiv:2403.18804v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2403.18804

Submission history

From: Mateusz Klimaszewski
[v1] Wed, 27 Mar 2024 17:50:00 UTC (169 KB)
