Scaling Down Semantic Leakage: Investigating Associative Bias in Smaller Language Models

Smilga, Veronika

arXiv:2501.06638 (cs)

[Submitted on 11 Jan 2025]

Title: Scaling Down Semantic Leakage: Investigating Associative Bias in Smaller Language Models

Authors: Veronika Smilga

Abstract: Semantic leakage is a phenomenon recently introduced by Gonen et al. (2024). It refers to a situation in which associations learnt from the training data emerge in language model generations in unexpected and sometimes undesired ways. Prior work has focused on leakage in large language models (7B+ parameters). In this study, I use the Qwen2.5 model family to explore whether smaller models, ranging from 500M to 7B parameters, demonstrate less semantic leakage due to their limited capacity for capturing complex associations. Building on the dataset from Gonen et al. (2024), I introduce a new dataset of color-focused prompts, categorized into specific types of semantic associations, to systematically evaluate the models' performance. Results indicate that smaller models exhibit less semantic leakage overall, although this trend is not strictly linear, with medium-sized models sometimes surpassing larger ones in leaking behavior. The dataset, the model generations, and the evaluation code are publicly available at this https URL.

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2501.06638 [cs.CL]
	(or arXiv:2501.06638v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2501.06638

Submission history

From: Veronika Smilga
[v1] Sat, 11 Jan 2025 21:03:22 UTC (6,880 KB)
