More Bang for Your Buck: Natural Perturbation for Robust Question Answering

Khashabi, Daniel; Khot, Tushar; Sabharwal, Ashish

arXiv:2004.04849 (cs)

[Submitted on 9 Apr 2020 (v1), last revised 6 Oct 2020 (this version, v2)]

Abstract: While recent models have achieved human-level scores on many NLP datasets, we observe that they are considerably sensitive to small changes in input. As an alternative to the standard approach of addressing this issue by constructing training sets of completely new examples, we propose doing so via minimal perturbation of examples. Specifically, our approach involves first collecting a set of seed examples and then applying human-driven natural perturbations (as opposed to rule-based machine perturbations), which often change the gold label as well. Local perturbations have the advantage of being relatively easier (and hence cheaper) to create than writing out completely new examples. To evaluate the impact of this phenomenon, we consider a recent question-answering dataset (BoolQ) and study the benefit of our approach as a function of the perturbation cost ratio, the relative cost of perturbing an existing question vs. creating a new one from scratch. We find that when natural perturbations are moderately cheaper to create, it is more effective to train models using them: such models exhibit higher robustness and better generalization, while retaining performance on the original BoolQ dataset.
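
The perturbation cost ratio can be made concrete with a small budget calculation. The following is only an illustrative sketch, not the paper's accounting: the function name `affordable_examples`, the parameters `budget` and `cost_new`, and the choice to ignore the cost of collecting seed examples are all assumptions made here for illustration.

```python
import math


def affordable_examples(budget: float, cost_new: float, cost_ratio: float) -> dict:
    """Count how many examples a fixed budget buys under each strategy.

    cost_ratio is the cost of perturbing an existing question divided by
    the cost of writing a new one from scratch (hypothetical parameter name).
    Simplifying assumption: the cost of the seed examples is ignored.
    """
    if budget < 0 or cost_new <= 0 or cost_ratio <= 0:
        raise ValueError("budget must be >= 0; cost_new and cost_ratio must be > 0")
    cost_perturb = cost_ratio * cost_new
    return {
        "new": math.floor(budget / cost_new),
        "perturbed": math.floor(budget / cost_perturb),
    }


# With a ratio of 0.5, the same budget buys twice as many perturbed examples.
counts = affordable_examples(budget=1000.0, cost_new=1.0, cost_ratio=0.5)
print(counts)
```

Under these assumptions, a lower cost ratio means more perturbed examples for the same spend; whether those examples train a better model is the empirical question the abstract addresses.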

Comments:	EMNLP 2020
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2004.04849 [cs.CL]
	(or arXiv:2004.04849v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2004.04849

Submission history

From: Daniel Khashabi
[v1] Thu, 9 Apr 2020 23:12:39 UTC (3,689 KB)
[v2] Tue, 6 Oct 2020 07:10:00 UTC (3,391 KB)
