Word sense induction using word embeddings and community detection in complex networks

Corrêa Jr., Edilson A.; Amancio, Diego R.

doi:10.1016/j.physa.2019.02.032

arXiv:1803.08476 (cs)

[Submitted on 22 Mar 2018]

Abstract: Word Sense Induction (WSI) is the task of automatically inducing word senses from corpora. It was first proposed to overcome the need for the manually annotated corpora required by word sense disambiguation systems. Although several methods for inducing word senses have been proposed, existing systems remain limited because they rely on structured, domain-specific knowledge sources. In this paper, we devise a method that leverages recent findings in word embeddings research to generate context embeddings, which are embeddings containing information about the semantic context of a word. To induce senses, we model the set of ambiguous words as a complex network. In the generated network, two instances (nodes) are connected if their respective context embeddings are similar. Using well-established community detection methods to cluster the obtained context embeddings, we found that the proposed method yields excellent performance on the WSI task. Our method outperformed competing algorithms and baselines in a completely unsupervised manner and without the need for any additional structured knowledge source.
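The pipeline described in the abstract (context embeddings, a similarity network over instances, then community detection) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy 3-dimensional word vectors, the averaging used to build a context embedding, the cosine threshold of 0.8, and the choice of a simple deterministic label propagation as the community detection step. The abstract does not say which embedding model, linking rule, or community detection algorithms the authors used.

```python
import math
from collections import Counter

# Hypothetical 3-d word vectors; a real system would load pretrained embeddings.
WORD_VECS = {
    "money":   (1.0, 0.1, 0.0),
    "loan":    (0.9, 0.2, 0.0),
    "deposit": (0.8, 0.0, 0.1),
    "river":   (0.0, 0.1, 1.0),
    "water":   (0.1, 0.0, 0.9),
    "shore":   (0.0, 0.2, 0.8),
}


def context_embedding(context_words):
    """Assumed recipe: average the vectors of the known context words."""
    vecs = [WORD_VECS[w] for w in context_words if w in WORD_VECS]
    return tuple(sum(dim) / len(vecs) for dim in zip(*vecs))


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def build_network(embeddings, threshold=0.8):
    """Link two instances when their context embeddings are similar enough."""
    adj = {i: set() for i in range(len(embeddings))}
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            if cosine(embeddings[i], embeddings[j]) >= threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def label_propagation(adj, max_iter=100):
    """Deterministic label propagation: each node adopts the most common
    label among its neighbours, breaking ties by the smallest label."""
    labels = {node: node for node in adj}
    for _ in range(max_iter):
        changed = False
        for node in sorted(adj):
            if not adj[node]:
                continue  # isolated instances keep their own label
            counts = Counter(labels[nb] for nb in adj[node])
            best = max(counts.values())
            new_label = min(l for l, c in counts.items() if c == best)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break
    return labels


# Six occurrences of the ambiguous word "bank", each given by its context words.
contexts = [
    ["money", "loan"], ["deposit", "money"], ["loan", "deposit"],
    ["river", "water"], ["shore", "river"], ["water", "shore"],
]
embeddings = [context_embedding(c) for c in contexts]
adj = build_network(embeddings)
labels = label_propagation(adj)

# Group instance indices by induced sense (community label).
senses = {}
for node, label in labels.items():
    senses.setdefault(label, []).append(node)
print(senses)
```

In this toy setting each community corresponds to one induced sense of "bank". A fixed similarity threshold is only one possible linking rule; the number of senses is not set in advance but emerges from the community structure of the network.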

Subjects:	Computation and Language (cs.CL); Social and Information Networks (cs.SI)
Cite as:	arXiv:1803.08476 [cs.CL]
	(or arXiv:1803.08476v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1803.08476
Journal reference:	Physica A, v. 523, p. 180-190, 2019
Related DOI:	https://doi.org/10.1016/j.physa.2019.02.032

Submission history

From: Diego Amancio Dr.
[v1] Thu, 22 Mar 2018 17:22:42 UTC (1,180 KB)
