Out-of-Vocabulary Embedding Imputation with Grounded Language Information by Graph Convolutional Networks

Yang, Ziyi; Zhu, Chenguang; Sachidananda, Vin; Darve, Eric

arXiv:1906.03753 (cs)

[Submitted on 10 Jun 2019 (v1), last revised 5 Jun 2020 (this version, v2)]

Abstract: Due to the ubiquitous use of embeddings as input representations for a wide range of natural language tasks, imputation of embeddings for rare and unseen words is a critical problem in language processing. Embedding imputation involves learning representations for rare or unseen words during the training of an embedding model, often in a post-hoc manner. In this paper, we propose an approach for embedding imputation which uses grounded information in the form of a knowledge graph. This is in contrast to existing approaches which typically make use of vector space properties or subword information. We propose an online method to construct a graph from grounded information and design an algorithm to map from the resulting graphical structure to the space of the pre-trained embeddings. Finally, we evaluate our approach on a range of rare and unseen word tasks across various domains and show that our model can learn better representations. For example, on the Card-660 task our method improves Pearson's and Spearman's correlation coefficients upon the state-of-the-art by 11% and 17.8% respectively using GloVe embeddings.
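The idea of mapping from a graph to the pre-trained embedding space can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single linear graph-convolution layer with the standard symmetric normalization, fitted by ridge regression on words whose embeddings are known. The graph, node features, dimensions, and the names `normalized_adjacency` and `impute_embeddings` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, a common GCN propagation matrix."""
    a_hat = adj + np.eye(adj.shape[0])           # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def impute_embeddings(adj, features, embeddings, known_mask, ridge=1e-3):
    """Fit one linear graph-convolution layer on known words, predict every node.

    Assumption: a closed-form ridge fit stands in for gradient-based training.
    """
    h = normalized_adjacency(adj) @ features     # propagate features over the graph
    h_known = h[known_mask]
    # Ridge least squares: W = (H^T H + ridge * I)^{-1} H^T E
    gram = h_known.T @ h_known + ridge * np.eye(h.shape[1])
    w = np.linalg.solve(gram, h_known.T @ embeddings[known_mask])
    return h @ w


# Toy data (hypothetical): 4 words, 5-dim node features, 3-dim embeddings.
rng = np.random.default_rng(0)
adj = np.array(
    [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
features = rng.normal(size=(4, 5))
embeddings = rng.normal(size=(4, 3))             # row 3 is treated as unseen
known_mask = np.array([True, True, True, False])

imputed = impute_embeddings(adj, features, embeddings, known_mask)
print(imputed[3])                                # imputed vector for the unseen word
```

In this sketch the unseen word receives a vector only through its graph neighbours, which is the sense in which the graph supplies the information that subword or vector-space methods would otherwise provide.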

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1906.03753 [cs.CL]
	(or arXiv:1906.03753v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1906.03753

Submission history

From: Ziyi Yang
[v1] Mon, 10 Jun 2019 01:10:34 UTC (30 KB)
[v2] Fri, 5 Jun 2020 21:42:06 UTC (33 KB)
