Asterisk*: Keep it Simple

Semenov, Andrew

arXiv:2411.05691 (cs)

[Submitted on 8 Nov 2024]

Abstract: This paper describes Asterisk, a compact GPT-based model for generating text embeddings. The model uses a minimalist architecture with two layers, two attention heads, and 256 embedding dimensions. By applying knowledge distillation from larger pretrained models, we explore the trade-offs between model size and performance while minimizing computational and memory requirements. The model is primarily evaluated and optimized for classification tasks, and experimental results show moderate zero-shot classification performance across various downstream applications. With additional configuration, the model's performance can approach or even surpass that of larger architectures on specific classification tasks.
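
To give a rough sense of how small the stated configuration is, the sketch below counts the trainable parameters in the transformer blocks alone. Only the layer count, head count, and embedding width come from the abstract. The GPT-2 style block layout (fused QKV projection with biases, two layer norms per block) and the 4x feed-forward expansion are assumptions, and `TinyGPTConfig`, `block_params`, and `head_dim` are hypothetical names used for illustration. Token and position embeddings are left out because the abstract gives neither the vocabulary size nor the context length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TinyGPTConfig:
    n_layer: int = 2     # from the abstract
    n_head: int = 2      # from the abstract
    d_model: int = 256   # from the abstract
    mlp_ratio: int = 4   # assumption: GPT-2 style 4x feed-forward expansion


def head_dim(cfg: TinyGPTConfig) -> int:
    """Width of each attention head; d_model must split evenly across heads."""
    if cfg.d_model % cfg.n_head != 0:
        raise ValueError("d_model must be divisible by n_head")
    return cfg.d_model // cfg.n_head


def block_params(cfg: TinyGPTConfig) -> int:
    """Parameters in one block, assuming a GPT-2 style layout with biases."""
    d = cfg.d_model
    hidden = cfg.mlp_ratio * d
    # Attention: fused Q/K/V projection plus the output projection.
    attn = (d * 3 * d + 3 * d) + (d * d + d)
    # Feed-forward: expand to `hidden`, then project back to d.
    mlp = (d * hidden + hidden) + (hidden * d + d)
    # Two layer norms, each with a scale and a shift vector.
    norms = 2 * 2 * d
    return attn + mlp + norms


cfg = TinyGPTConfig()
print(head_dim(cfg))                    # 128
print(block_params(cfg))                # 789760
print(cfg.n_layer * block_params(cfg))  # 1579520
```

Under these assumptions the two blocks hold about 1.6 million parameters. The head count changes only how the 256 dimensions are split (128 per head), not the total.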

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as: arXiv:2411.05691 [cs.CL] (or arXiv:2411.05691v1 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.2411.05691

Submission history

From: Andrew S.
[v1] Fri, 8 Nov 2024 16:42:33 UTC (94 KB)
