Attention-guided Generative Models for Extractive Question Answering

Xu, Peng; Liang, Davis; Huang, Zhiheng; Xiang, Bing

arXiv:2110.06393 (cs)

[Submitted on 12 Oct 2021]

Abstract: We propose a novel method for applying Transformer models to extractive question answering (QA) tasks. Recently, pretrained generative sequence-to-sequence (seq2seq) models have achieved great success in question answering. Contributing to the success of these models are internal attention mechanisms such as cross-attention. We propose a simple strategy to obtain an extractive answer span from the generative model by leveraging the decoder cross-attention patterns. Viewing cross-attention as an architectural prior, we apply joint training to further improve QA performance. Empirical results show that on open-domain question answering datasets like NaturalQuestions and TriviaQA, our method approaches state-of-the-art performance on both generative and extractive inference while using far fewer parameters. Furthermore, this strategy allows us to perform hallucination-free inference while conferring significant improvements to the model's ability to rerank relevant passages.
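To make the idea of reading an answer span off cross-attention concrete, here is a minimal sketch. The abstract does not say how attention is aggregated or how the span boundaries are chosen, so everything below is an assumption for illustration, not the authors' method: attention is averaged over decoder steps, centered on its mean, and the contiguous window with the highest total (up to a hypothetical `max_len` tokens) is returned. The function names `token_scores` and `best_span` and the toy attention values are also hypothetical.

```python
def token_scores(cross_attention):
    """Average attention each passage token receives across decoder steps.

    cross_attention: list of rows, one per decoder step, each row holding
    one weight per passage token (assumed layout, not from the paper).
    """
    steps = len(cross_attention)
    width = len(cross_attention[0])
    return [sum(row[j] for row in cross_attention) / steps for j in range(width)]


def best_span(cross_attention, max_len=5):
    """Return inclusive (start, end) token indices of the highest-scoring span.

    Scores are centered on their mean so that weakly attended tokens
    count against a span; otherwise the longest window would always win.
    """
    scores = token_scores(cross_attention)
    baseline = sum(scores) / len(scores)
    centered = [s - baseline for s in scores]
    best, best_total = (0, 0), float("-inf")
    for start in range(len(centered)):
        total = 0.0
        for end in range(start, min(start + max_len, len(centered))):
            total += centered[end]
            if total > best_total:
                best, best_total = (start, end), total
    return best


# Toy example: two decoder steps attending over a seven-token passage.
passage = ["The", "Eiffel", "Tower", "is", "in", "Paris", "France"]
attn = [
    [0.05, 0.05, 0.05, 0.05, 0.10, 0.60, 0.10],
    [0.05, 0.05, 0.05, 0.05, 0.05, 0.30, 0.45],
]
start, end = best_span(attn)
answer = " ".join(passage[start:end + 1])
print(answer)
```

Because the answer is copied from the passage tokens rather than generated freely, a span selected this way cannot contain text absent from the passage, which is the sense in which the abstract describes extractive inference as hallucination-free.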

Comments:	10 pages
Subjects:	Computation and Language (cs.CL); Information Retrieval (cs.IR)
Cite as:	arXiv:2110.06393 [cs.CL]
	(or arXiv:2110.06393v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2110.06393

Submission history

From: Peng Xu
[v1] Tue, 12 Oct 2021 23:02:35 UTC (5,574 KB)
