How Transferable are Attribute Controllers on Pretrained Multilingual Translation Models?

Liu, Danni; Niehues, Jan

arXiv:2309.08565 (cs)

[Submitted on 15 Sep 2023 (v1), last revised 24 Jan 2024 (this version, v3)]

Abstract: Customizing machine translation models to comply with desired attributes (e.g., formality or grammatical gender) is a well-studied topic. However, most current approaches rely on (semi-)supervised data with attribute annotations. This data scarcity is a bottleneck for extending such customization to a wider range of languages, particularly lower-resource ones. This gap is out of sync with recent progress in pretrained massively multilingual translation models. In response, we transfer the attribute controlling capabilities to languages without attribute-annotated data with an NLLB-200 model as a foundation. Inspired by techniques from controllable generation, we employ a gradient-based inference-time controller to steer the pretrained model. The controller transfers well to zero-shot conditions, as it operates on pretrained multilingual representations and is attribute-specific rather than language-specific. With a comprehensive comparison to finetuning-based control, we demonstrate that, despite finetuning's clear dominance in supervised settings, the gap to inference-time control closes when moving to zero-shot conditions, especially with new and distant target languages. Inference-time control also shows stronger domain robustness. We further show that our inference-time control complements finetuning. A human evaluation on a real low-resource language, Bengali, confirms our findings. Our code is this https URL
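
As a rough illustration of what gradient-based inference-time control means in general, the sketch below nudges a single hidden state toward a desired attribute by following the gradient of an attribute classifier, while the classifier weights (and, by implication, the translation model) stay frozen. This is a toy under stated assumptions and not the authors' implementation: the logistic classifier, the three-dimensional hidden state, and the names `attribute_prob` and `steer_hidden` are all hypothetical. In a real system the hidden state would come from the translation model's decoder and the gradient would be obtained by automatic differentiation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attribute_prob(hidden, weights, bias):
    """Probability that a hidden state carries the desired attribute.

    Assumption: a plain logistic classifier stands in for the attribute
    classifier; the abstract does not specify its form.
    """
    score = sum(w * h for w, h in zip(weights, hidden)) + bias
    return sigmoid(score)


def steer_hidden(hidden, weights, bias, step_size=0.5, num_steps=5):
    """Gradient ascent on log p(attribute | hidden) w.r.t. the hidden state.

    Only the hidden state is updated; the classifier weights are fixed.
    """
    steered = list(hidden)
    for _ in range(num_steps):
        p = attribute_prob(steered, weights, bias)
        # d/dh log sigmoid(w.h + b) = (1 - p) * w
        steered = [h + step_size * (1.0 - p) * w
                   for h, w in zip(steered, weights)]
    return steered


if __name__ == "__main__":
    # Hypothetical toy values, chosen only for illustration.
    hidden = [0.2, -0.4, 0.1]
    weights = [1.0, -0.5, 2.0]
    bias = -1.0
    before = attribute_prob(hidden, weights, bias)
    after = attribute_prob(steer_hidden(hidden, weights, bias), weights, bias)
    print(f"p(attribute) before: {before:.3f}, after: {after:.3f}")
```

Because the update touches only the representation and the classifier is trained on the attribute rather than on a particular language, a controller of this kind can in principle be applied to target languages it never saw with attribute labels, which is the transfer setting the abstract studies.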

Comments:	EACL 2024
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2309.08565 [cs.CL]
	(or arXiv:2309.08565v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2309.08565

Submission history

From: Danni Liu
[v1] Fri, 15 Sep 2023 17:33:24 UTC (7,937 KB)
[v2] Fri, 19 Jan 2024 16:48:59 UTC (8,263 KB)
[v3] Wed, 24 Jan 2024 17:48:31 UTC (8,263 KB)
