Extrofitting: Enriching Word Representation and its Vector Space with Semantic Lexicons

Jo, Hwiyeol; Choi, Stanley Jungkyu

arXiv:1804.07946 (cs)

[Submitted on 21 Apr 2018 (v1), last revised 3 Jun 2018 (this version, v2)]

Abstract: We propose a post-processing method, which we call extrofitting, for enriching not only word representations but also their vector space using semantic lexicons. The method consists of three steps: (i) Expanding every word vector by one or more dimensions, each filled with that vector's representative value. (ii) Transferring semantic knowledge by averaging the representative values of synonyms and writing that average into the expanded dimension(s). Together, these two steps bring the representations of synonyms closer together. (iii) Projecting the vector space using Linear Discriminant Analysis, which eliminates the expanded dimension(s) filled with semantic knowledge. In experiments with GloVe, our method outperforms Faruqui's retrofitting on some word similarity tasks. We also report further analysis of our method with respect to word vector dimension, vocabulary size, and other well-known pretrained word vectors (e.g., Word2Vec, Fasttext).
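Steps (i) and (ii) can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: the representative value is taken to be the mean of a vector's components, a single dimension is added, and the function name `expand_with_synonym_average` and the toy vectors are hypothetical. Step (iii), the Linear Discriminant Analysis projection, is omitted here; it would require an LDA implementation with synonym groups as class labels.

```python
from statistics import fmean


def expand_with_synonym_average(vectors, synonym_groups):
    """Sketch of steps (i) and (ii); not the authors' implementation."""
    # Assumption: the representative value is the mean of the components.
    representative = {w: fmean(v) for w, v in vectors.items()}

    # (i) Append one extra dimension holding each word's representative value.
    expanded = {w: list(v) + [representative[w]] for w, v in vectors.items()}

    # (ii) For each synonym group, replace the extra dimension with the
    # average of the members' original representative values.
    for group in synonym_groups:
        members = [w for w in group if w in vectors]
        if len(members) < 2:
            continue  # nothing to share for a lone or out-of-vocabulary word
        shared = fmean([representative[w] for w in members])
        for w in members:
            expanded[w][-1] = shared
    return expanded


# Toy data (hypothetical): "happy" and "glad" are listed as synonyms.
vectors = {
    "happy": [1.0, 2.0, 3.0],
    "glad": [3.0, 4.0, 5.0],
    "car": [0.0, 0.0, 6.0],
}
expanded = expand_with_synonym_average(vectors, [["happy", "glad"]])
```

In this toy case the two synonyms end up with the same value in the added dimension, so they no longer differ along it, while the unrelated word keeps its own representative value.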

Comments:	In Proceedings of the 3rd ACL Workshop on Representation Learning for NLP (RepL4NLP)
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:1804.07946 [cs.CL]
	(or arXiv:1804.07946v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1804.07946

Submission history

From: Hwiyeol Jo
[v1] Sat, 21 Apr 2018 11:17:26 UTC (17 KB)
[v2] Sun, 3 Jun 2018 08:35:28 UTC (71 KB)
