Interpreting Embedding Spaces by Conceptualization

Simhi, Adi; Markovitch, Shaul


arXiv:2209.00445 (cs)

[Submitted on 22 Aug 2022 (v1), last revised 9 Nov 2023 (this version, v3)]

Abstract: One of the main methods for computational interpretation of a text is mapping it into a vector in some embedding space. Such vectors can then be used for a variety of textual processing tasks. Recently, most embedding spaces are a product of training large language models (LLMs). One major drawback of this type of representation is its incomprehensibility to humans. Understanding the embedding space is crucial for several important needs, including the need to debug the embedding method and compare it to alternatives, and the need to detect biases hidden in the model. In this paper, we present a novel method of understanding embeddings by transforming a latent embedding space into a comprehensible conceptual space. We present an algorithm for deriving a conceptual space with dynamic on-demand granularity. We devise a new evaluation method, using either human raters or LLM-based raters, to show that the conceptualized vectors indeed represent the semantics of the original latent ones. We show the use of our method for various tasks, including comparing the semantics of alternative models and tracing the layers of the LLM. The code is available online at this https URL.
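The idea of mapping a latent vector into a comprehensible conceptual space can be illustrated with a minimal sketch. The abstract does not specify how the mapping is computed, so everything below is an assumption made for illustration: each concept is represented by an anchor vector in the same latent space, and the conceptual vector holds the cosine similarity of the input to each anchor. The names `cosine`, `conceptualize`, and `top_concepts`, as well as the toy 3-dimensional vectors, are hypothetical and are not taken from the paper or its code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def conceptualize(latent, concept_vectors):
    """Map a latent vector to a dict whose keys are human-readable concept names.

    Assumption: similarity to a concept's anchor vector is used as that
    concept's coordinate in the conceptual space.
    """
    return {name: cosine(latent, vec) for name, vec in concept_vectors.items()}


def top_concepts(conceptual, k=2):
    """Return the k concept names with the highest scores, best first."""
    return sorted(conceptual, key=conceptual.get, reverse=True)[:k]


# Toy data: made-up anchor vectors standing in for embedded concepts.
concepts = {
    "sports": [1.0, 0.0, 0.0],
    "politics": [0.0, 1.0, 0.0],
    "science": [0.0, 0.0, 1.0],
}
text_vec = [0.9, 0.1, 0.3]  # made-up latent embedding of some text

conceptual = conceptualize(text_vec, concepts)
print(top_concepts(conceptual))
```

In this sketch each dimension of the output is named, so a reader can inspect which concepts dominate a given embedding. Coarser or finer granularity would correspond to choosing a smaller or larger set of anchor concepts.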

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2209.00445 [cs.CL]
	(or arXiv:2209.00445v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2209.00445

Submission history

From: Adi Simhi
[v1] Mon, 22 Aug 2022 15:32:17 UTC (344 KB)
[v2] Sun, 19 Feb 2023 13:06:00 UTC (889 KB)
[v3] Thu, 9 Nov 2023 13:42:37 UTC (873 KB)
