Entity Tracking in Language Models

Kim, Najoung; Schuster, Sebastian

arXiv:2305.02363 (cs)

[Submitted on 3 May 2023 (v1), last revised 8 Sep 2023 (this version, v2)]

Abstract: Keeping track of how states of entities change as a text or dialog unfolds is a key prerequisite to discourse understanding. Yet, there have been few systematic investigations into the ability of large language models (LLMs) to track discourse entities. In this work, we present a task probing to what extent a language model can infer the final state of an entity given an English description of the initial state and a series of state-changing operations. We use this task to first investigate whether Flan-T5, GPT-3 and GPT-3.5 can track the state of entities, and find that only GPT-3.5 models, which have been pretrained on large amounts of code, exhibit this ability. We then investigate whether smaller models pretrained primarily on text can learn to track entities, through finetuning T5 on several training/evaluation splits. While performance degrades for more complex splits, we find that even when evaluated on a different set of entities from training or longer operation sequences, a finetuned model can perform non-trivial entity tracking. Taken together, these results suggest that language models can learn to track entities but pretraining on text corpora alone does not make this capacity surface.
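The task described in the abstract (an initial state, a series of state-changing operations, and a final state to infer) can be illustrated with a small symbolic reference tracker. This is only a sketch under stated assumptions: the toy world of boxes and objects, the operation names `put`, `remove`, and `move`, and the helpers `apply_operation`, `final_state`, and `describe` are all hypothetical choices made here for illustration, not the authors' data format or evaluation code.

```python
# Hypothetical toy world (an assumption for illustration):
# each box name maps to the set of objects it currently holds.
State = dict[str, set[str]]


def apply_operation(state: State, op: tuple) -> State:
    """Return a new state after applying one operation; the input is not mutated."""
    new_state = {box: set(items) for box, items in state.items()}
    kind = op[0]
    if kind == "put":          # ("put", obj, box)
        _, obj, box = op
        new_state[box].add(obj)
    elif kind == "remove":     # ("remove", obj, box)
        _, obj, box = op
        new_state[box].discard(obj)
    elif kind == "move":       # ("move", obj, src, dst)
        _, obj, src, dst = op
        if obj in new_state[src]:
            new_state[src].remove(obj)
            new_state[dst].add(obj)
    else:
        raise ValueError(f"unknown operation: {kind}")
    return new_state


def final_state(initial: State, operations: list) -> State:
    """Apply the operations in order and return the resulting state."""
    state = initial
    for op in operations:
        state = apply_operation(state, op)
    return state


def describe(state: State) -> str:
    """Render a state as an English description, one sentence per box."""
    sentences = []
    for box in sorted(state):
        items = sorted(state[box])
        contents = " and ".join(f"the {item}" for item in items) if items else "nothing"
        sentences.append(f"{box} contains {contents}.")
    return " ".join(sentences)


initial = {"Box 0": {"apple"}, "Box 1": {"key", "map"}, "Box 2": set()}
operations = [
    ("move", "key", "Box 1", "Box 2"),
    ("put", "coin", "Box 0"),
    ("remove", "map", "Box 1"),
]

print(describe(final_state(initial, operations)))
# Box 0 contains the apple and the coin. Box 1 contains nothing. Box 2 contains the key.
```

In a setup like this, a language model would see only the English description of the initial state and the operations, and its answer could be compared against the output of the symbolic tracker.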

Comments:	ACL 2023 Camera-ready
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2305.02363 [cs.CL]
	(or arXiv:2305.02363v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2305.02363

Submission history

From: Najoung Kim
[v1] Wed, 3 May 2023 18:01:13 UTC (7,005 KB)
[v2] Fri, 8 Sep 2023 17:51:51 UTC (7,121 KB)
