Multilingual Machine Translation with Hyper-Adapters

Baziotis, Christos; Artetxe, Mikel; Cross, James; Bhosale, Shruti

arXiv:2205.10835 (cs)

[Submitted on 22 May 2022 (v1), last revised 5 Dec 2022 (this version, v2)]

Abstract: Multilingual machine translation suffers from negative interference across languages. A common solution is to relax parameter sharing with language-specific modules like adapters. However, adapters of related languages are unable to transfer information, and their total number of parameters becomes prohibitively expensive as the number of languages grows. In this work, we overcome these drawbacks using hyper-adapters: hyper-networks that generate adapters from language and layer embeddings. While past work had poor results when scaling hyper-networks, we propose a rescaling fix that significantly improves convergence and enables training larger hyper-networks. We find that hyper-adapters are more parameter-efficient than regular adapters, reaching the same performance with up to 12 times fewer parameters. When using the same number of parameters and FLOPs, our approach consistently outperforms regular adapters. Hyper-adapters also converge faster than alternative approaches and scale better than regular dense networks. Our analysis shows that hyper-adapters learn to encode language relatedness, enabling positive transfer across languages.
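The parameter argument in the abstract can be illustrated with a rough count. The sketch below is not the paper's architecture or its reported numbers: it assumes a plain bottleneck adapter (down- and up-projection with biases), one adapter per language and layer for the regular case, and a hypothetical hyper-network made of language and layer embeddings, a single hidden layer, and a linear head that emits all adapter weights. The function names and the sizes passed to them are illustrative assumptions.

```python
def adapter_params(d_model: int, bottleneck: int) -> int:
    """Weights and biases of one bottleneck adapter (assumed layout)."""
    down = d_model * bottleneck + bottleneck   # down-projection + bias
    up = bottleneck * d_model + d_model        # up-projection + bias
    return down + up


def regular_adapter_total(n_langs: int, n_layers: int,
                          d_model: int, bottleneck: int) -> int:
    """One separate adapter per (language, layer) pair."""
    return n_langs * n_layers * adapter_params(d_model, bottleneck)


def hyper_adapter_total(n_langs: int, n_layers: int, d_model: int,
                        bottleneck: int, emb_dim: int, hidden: int) -> int:
    """Hypothetical hyper-network: embeddings -> hidden layer -> adapter weights."""
    # One embedding vector per language and per layer.
    embeddings = (n_langs + n_layers) * emb_dim
    # Hidden layer over the concatenated language and layer embeddings.
    encoder = (2 * emb_dim) * hidden + hidden
    # Linear head (with bias) that generates every adapter parameter.
    head = (hidden + 1) * adapter_params(d_model, bottleneck)
    return embeddings + encoder + head


if __name__ == "__main__":
    # Illustrative sizes only; none of these values come from the paper.
    for n_langs in (10, 50, 100):
        regular = regular_adapter_total(n_langs, 12, 512, 64)
        hyper = hyper_adapter_total(n_langs, 12, 512, 64, emb_dim=32, hidden=32)
        print(f"{n_langs:>3} languages: regular={regular:,} hyper={hyper:,}")
```

Under these assumptions, each added language costs the regular setup a full set of per-layer adapters, while the hyper-network only gains one embedding vector; the generator itself is shared across all languages and layers, which is also what lets related languages share information.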

Comments:	EMNLP 2022 camera-ready version. Code at this http URL under the "hyperadapters" branch (see instructions at this https URL)
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2205.10835 [cs.CL]
	(or arXiv:2205.10835v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2205.10835

Submission history

From: Christos Baziotis
[v1] Sun, 22 May 2022 14:24:58 UTC (1,574 KB)
[v2] Mon, 5 Dec 2022 12:04:31 UTC (6,565 KB)
