Bio-inspired Structure Identification in Language Embeddings

Zhou, Hongwei; Elek, Oskar; Anand, Pranav; Forbes, Angus G.

arXiv:2009.02459 (cs)

[Submitted on 5 Sep 2020 (v1), last revised 15 Sep 2020 (this version, v2)]

Authors: Hongwei (Henry) Zhou, Oskar Elek, Pranav Anand, Angus G. Forbes

Abstract: Word embeddings are a popular way to improve downstream performance in contemporary language modeling. However, the underlying geometric structure of the embedding space is not well understood. We present a series of explorations using bio-inspired methodology to traverse and visualize word embeddings, demonstrating evidence of discernible structure. Moreover, our model produces word similarity rankings that are plausible yet very different from those of common similarity metrics, mainly cosine similarity and Euclidean distance. We show that our bio-inspired model can be used to investigate how different word embedding techniques result in different semantic outputs, which can emphasize or obscure particular interpretations in textual data.

Comments:	7 pages, 8 figures, 2 tables, Visualisation for the Digital Humanities 2020. Comments: Fixed white spaces in abstract
Subjects:	Computation and Language (cs.CL); Human-Computer Interaction (cs.HC)
Cite as:	arXiv:2009.02459 [cs.CL]
	(or arXiv:2009.02459v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2009.02459

Submission history

From: Hongwei Zhou
[v1] Sat, 5 Sep 2020 04:44:15 UTC (38,386 KB)
[v2] Tue, 15 Sep 2020 23:59:06 UTC (38,386 KB)
