Real-time Inference in Multi-sentence Tasks with Deep Pretrained Transformers

Humeau, Samuel; Shuster, Kurt; Lachaux, Marie-Anne; Weston, Jason

arXiv:1905.01969v1 (cs)

[Submitted on 22 Apr 2019 (this version), latest version 25 Mar 2020 (v4)]

Title: Real-time Inference in Multi-sentence Tasks with Deep Pretrained Transformers

Authors: Samuel Humeau, Kurt Shuster, Marie-Anne Lachaux, Jason Weston

Abstract: The use of deep pretrained bidirectional transformers has led to remarkable progress in learning multi-sentence representations for downstream language understanding tasks (Devlin et al., 2018). For tasks that make pairwise comparisons, e.g. matching a given context with a corresponding response, two approaches have permeated the literature. A Cross-encoder performs full self-attention over the pair; a Bi-encoder performs self-attention for each sequence separately, and the final representation is a function of the pair. While Cross-encoders nearly always outperform Bi-encoders on various tasks, both in our work and others' (Urbanek et al., 2019), they are orders of magnitude slower, which hampers their ability to perform real-time inference. In this work, we develop a new architecture, the Poly-encoder, that is designed to approach the performance of the Cross-encoder while maintaining reasonable computation time. Additionally, we explore two pretraining schemes with different datasets to determine how these affect the performance on our chosen dialogue tasks: ConvAI2 and DSTC7 Track 1. We show that our models achieve state-of-the-art results on both tasks; that the Poly-encoder is a suitable replacement for Bi-encoders and Cross-encoders; and that even better results can be obtained by pretraining on a large dialogue dataset.
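The speed gap between the two baseline approaches described in the abstract can be illustrated with a toy sketch. This is not the authors' code: `CountingEncoder` is a hypothetical stand-in for a transformer that only hashes tokens and counts how often it is called, and the scores it produces are meaningless. The sketch assumes that candidate responses can be encoded ahead of time for the Bi-encoder, while the Cross-encoder must encode every (context, candidate) pair at inference time. The Poly-encoder itself is not sketched here, since the abstract does not describe its mechanism.

```python
import math
from typing import List


class CountingEncoder:
    """Hypothetical stand-in for a transformer encoder (illustration only).

    It hashes whitespace tokens into a small vector and counts calls,
    so the number of encoder passes per query can be compared.
    """

    def __init__(self, dim: int = 8) -> None:
        self.dim = dim
        self.calls = 0

    def encode(self, text: str) -> List[float]:
        self.calls += 1
        vec = [0.0] * self.dim
        for token in text.lower().split():
            vec[sum(ord(ch) for ch in token) % self.dim] += 1.0
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        return [v / norm for v in vec]


def bi_encoder_scores(encoder: CountingEncoder, context: str,
                      candidate_vecs: List[List[float]]) -> List[float]:
    """Encode the context once, then score by dot product with cached vectors."""
    ctx = encoder.encode(context)
    return [sum(a * b for a, b in zip(ctx, cand)) for cand in candidate_vecs]


def cross_encoder_scores(encoder: CountingEncoder, context: str,
                         candidates: List[str]) -> List[float]:
    """Encode each (context, candidate) pair jointly: one pass per candidate."""
    scores = []
    for cand in candidates:
        joint = encoder.encode(context + " [SEP] " + cand)
        scores.append(joint[0])  # toy scalar readout, an assumption of this sketch
    return scores


context = "how was your weekend"
candidates = ["it was great thanks", "i like trains", "we went hiking"]

# Bi-encoder: candidate vectors are computed offline, then the counter is reset.
bi = CountingEncoder()
cached = [bi.encode(c) for c in candidates]
bi.calls = 0
bi_scores = bi_encoder_scores(bi, context, cached)
bi_calls = bi.calls

# Cross-encoder: nothing can be cached, because each input depends on the context.
cross = CountingEncoder()
cross_scores = cross_encoder_scores(cross, context, candidates)
cross_calls = cross.calls
```

In this sketch the Bi-encoder needs a single encoder pass per query regardless of the number of candidates, whereas the Cross-encoder needs one pass per candidate, which is the cost the abstract refers to when it calls Cross-encoders much slower.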

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as: arXiv:1905.01969 [cs.CL]
	(or arXiv:1905.01969v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1905.01969

Submission history

From: Kurt Shuster
[v1] Mon, 22 Apr 2019 02:18:00 UTC (6,127 KB)
[v2] Mon, 19 Aug 2019 19:07:46 UTC (3,328 KB)
[v3] Wed, 12 Feb 2020 20:07:00 UTC (1,712 KB)
[v4] Wed, 25 Mar 2020 22:53:51 UTC (1,713 KB)
