Explicitly Modeling Syntax in Language Models with Incremental Parsing and a Dynamic Oracle

Shen, Yikang; Tan, Shawn; Sordoni, Alessandro; Reddy, Siva; Courville, Aaron

arXiv:2011.07960 (cs)

[Submitted on 21 Oct 2020 (v1), last revised 10 May 2021 (this version, v2)]

Abstract: Syntax is fundamental to our thinking about language. Failing to capture the structure of input language could lead to generalization problems and over-parametrization. In the present work, we propose a new syntax-aware language model: Syntactic Ordered Memory (SOM). The model explicitly models the structure with an incremental parser and maintains the conditional probability setting of a standard language model (left-to-right). To train the incremental parser and avoid exposure bias, we also propose a novel dynamic oracle, so that SOM is more robust to wrong parsing decisions. Experiments show that SOM can achieve strong results in language modeling, incremental parsing, and syntactic generalization tests, while using fewer parameters than other models.

Comments:	12 pages, 10 figures
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2011.07960 [cs.CL]
	(or arXiv:2011.07960v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2011.07960
Journal reference:	NAACL 2021

Submission history

From: Yikang Shen
[v1] Wed, 21 Oct 2020 17:39:15 UTC (7,991 KB)
[v2] Mon, 10 May 2021 18:13:41 UTC (647 KB)
