When Can Transformers Ground and Compose: Insights from Compositional Generalization Benchmarks

Sikarwar, Ankur; Patel, Arkil; Goyal, Navin

arXiv:2210.12786 (cs)

[Submitted on 23 Oct 2022 (v1), last revised 31 Oct 2022 (this version, v2)]

Authors: Ankur Sikarwar, Arkil Patel, Navin Goyal

Abstract: Humans can reason compositionally whilst grounding language utterances to the real world. Recent benchmarks like ReaSCAN use navigation tasks grounded in a grid world to assess whether neural models exhibit similar capabilities. In this work, we present a simple transformer-based model that outperforms specialized architectures on ReaSCAN and a modified version of gSCAN. On analyzing the task, we find that identifying the target location in the grid world is the main challenge for the models. Furthermore, we show that a particular split in ReaSCAN, which tests depth generalization, is unfair. On an amended version of this split, we show that transformers can generalize to deeper input structures. Finally, we design a simpler grounded compositional generalization task, RefEx, to investigate how transformers reason compositionally. We show that a single self-attention layer with a single head generalizes to novel combinations of object attributes. Moreover, we derive a precise mathematical construction of the transformer's computations from the learned network. Overall, we provide valuable insights about the grounded compositional generalization task and the behaviour of transformers on it, which would be useful for researchers working in this area.

Comments:	EMNLP 2022
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2210.12786 [cs.CL]
	(or arXiv:2210.12786v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2210.12786

Submission history

From: Ankur Sikarwar
[v1] Sun, 23 Oct 2022 17:03:55 UTC (1,424 KB)
[v2] Mon, 31 Oct 2022 03:50:30 UTC (3,995 KB)
