Discovering Non-monotonic Autoregressive Orderings with Variational Inference

Li, Xuanlin; Trabucco, Brandon; Park, Dong Huk; Luo, Michael; Shen, Sheng; Darrell, Trevor; Gao, Yang

arXiv:2110.15797 (cs)

[Submitted on 27 Oct 2021]

Abstract: The predominant approach to language modeling processes sequences from left to right, but this discards a source of information: the order in which the sequence was generated. One strategy to recover this information is to decode both the content and the ordering of tokens. Existing approaches supervise content and ordering by designing problem-specific loss functions and pre-training with a pre-selected ordering. Other recent works use iterative search to discover problem-specific orderings for training, but suffer from high time complexity and cannot be efficiently parallelized. We address these limitations with an unsupervised, parallelizable learner that discovers high-quality generation orders purely from training data, with no domain knowledge required. The learner contains an encoder network and a decoder language model that perform variational inference with autoregressive orders (represented as permutation matrices) as latent variables. The corresponding evidence lower bound (ELBO) is not differentiable, so we develop a practical algorithm for end-to-end optimization using policy gradients. We implement the encoder as a Transformer with non-causal attention that outputs permutations in one forward pass. Permutations then serve as target generation orders for training an insertion-based Transformer language model. Empirical results on language modeling tasks demonstrate that our method is context-aware and discovers orderings that are competitive with or even better than fixed orders.
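
The two ideas the abstract relies on, a generation order written as a permutation matrix and a policy-gradient (score-function) estimate for a non-differentiable objective, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy three-token sentence, a plain categorical distribution over all 3! orders in place of the Transformer encoder, and a made-up reward standing in for the decoder's log-likelihood. All function and variable names are hypothetical.

```python
import itertools
import math


def permutation_matrix(order):
    """Row t has a single 1 in column order[t]: step t emits position order[t]."""
    n = len(order)
    if sorted(order) != list(range(n)):
        raise ValueError("order must be a permutation of 0..n-1")
    return [[1 if order[t] == j else 0 for j in range(n)] for t in range(n)]


def apply_order(matrix, tokens):
    """Reorder tokens into generation order (same result as the product P @ tokens)."""
    return [tokens[row.index(1)] for row in matrix]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_reward(logits, rewards):
    """E_q[f(z)] for a categorical q over a small, enumerable set of orders."""
    return sum(p * f for p, f in zip(softmax(logits), rewards))


def score_function_gradient(logits, rewards):
    """Exact E_q[f(z) * grad_logits log q(z)], computed by enumeration."""
    probs = softmax(logits)
    grad = [0.0] * len(logits)
    for z, (p_z, f_z) in enumerate(zip(probs, rewards)):
        for k in range(len(logits)):
            # d log q(z) / d logit_k for a softmax-parameterised categorical
            dlogq = (1.0 if k == z else 0.0) - probs[k]
            grad[k] += p_z * f_z * dlogq
    return grad


# A generation order as a permutation matrix: emit "sat", then "the", then "cat".
tokens = ["the", "cat", "sat"]
P = permutation_matrix([2, 0, 1])
generated = apply_order(P, tokens)

# Toy latent space: every ordering of three positions, with uniform logits.
orders = list(itertools.permutations(range(3)))
logits = [0.0] * len(orders)

# Hypothetical reward (an assumption, not from the paper): penalise displacement
# from left-to-right order. In the paper's setting the signal would come from
# the decoder language model instead.
rewards = [-float(sum(abs(pos - t) for t, pos in enumerate(o))) for o in orders]

grad = score_function_gradient(logits, rewards)
```

Sampling an order is a discrete choice, so the reward cannot be differentiated through the sample itself. The score-function identity sidesteps this by differentiating only log q(z), which is why the gradient above needs nothing more than reward values. In practice the expectation would be estimated from sampled orders rather than enumerated, since the number of orders grows factorially with sequence length.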

Comments:	updated from ICLR 2021, first two authors contributed equally
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2110.15797 [cs.CL]
	(or arXiv:2110.15797v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2110.15797

Submission history

From: Brandon Trabucco
[v1] Wed, 27 Oct 2021 16:08:09 UTC (5,211 KB)
