What Matters for In-Context Learning: A Balancing Act of Look-up and In-Weight Learning

Bratulić, Jelena; Mittal, Sudhanshu; Rupprecht, Christian; Brox, Thomas

Computer Science > Computation and Language

arXiv:2501.06256 (cs)

[Submitted on 9 Jan 2025]

Abstract: Large Language Models (LLMs) have demonstrated impressive performance in various tasks, including In-Context Learning (ICL), where the model performs new tasks by conditioning solely on the examples provided in the context, without updating the model's weights. While prior research has explored the roles of pretraining data and model architecture, the key mechanism behind ICL remains unclear. In this work, we systematically uncover properties present in LLMs that support the emergence of ICL. To disambiguate these factors, we conduct a study with a controlled dataset and data sequences using a deep autoregressive model. We show that conceptual repetitions in the data sequences are crucial for ICL, more so than previously indicated training data properties like burstiness or long-tail distribution. Conceptual repetitions could refer to $n$-gram repetitions in textual data or exact image copies in image sequence data. Such repetitions also offer other previously overlooked benefits such as reduced transiency in ICL performance. Furthermore, we show that the emergence of ICL depends on balancing the in-weight learning objective with the in-context solving ability during training.
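As a rough illustration of what an "exact copy" repetition in a data sequence could look like, the following toy sketch builds a few-shot sequence of (item, label) pairs plus a query. It is not the authors' data pipeline: the function names `build_sequence` and `solvable_by_lookup`, the representation of an item as a `(class_id, exemplar_id)` tuple, and all default parameters are assumptions made for this example only.

```python
import random


def build_sequence(rng, num_classes=10, exemplars_per_class=5,
                   n_context=4, exact_copy=True):
    """Return (context, query, target) for one toy few-shot sequence.

    Hypothetical setup: an item is a (class_id, exemplar_id) tuple and
    its label is the class_id. Context classes are all distinct.
    """
    classes = rng.sample(range(num_classes), n_context)
    context = [((c, rng.randrange(exemplars_per_class)), c) for c in classes]

    # The query always shares its class with one context item.
    (q_class, q_exemplar), target = rng.choice(context)
    if not exact_copy:
        # Same class, but a different exemplar: no exact repetition.
        others = [e for e in range(exemplars_per_class) if e != q_exemplar]
        q_exemplar = rng.choice(others)
    return context, (q_class, q_exemplar), target


def solvable_by_lookup(context, query):
    """True if the query is an exact copy of some context item."""
    return any(item == query for item, _ in context)


rng = random.Random(0)
ctx, query, target = build_sequence(rng, exact_copy=True)
print(solvable_by_lookup(ctx, query))   # exact copy present in context
ctx, query, target = build_sequence(rng, exact_copy=False)
print(solvable_by_lookup(ctx, query))   # only the class is shared
```

With `exact_copy=True` the target can be read off by matching the query against the context, which is the look-up behaviour the title refers to; with `exact_copy=False` the model would have to relate different exemplars of the same class.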

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2501.06256 [cs.CL]
	(or arXiv:2501.06256v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2501.06256

Submission history

From: Jelena Bratulić
[v1] Thu, 9 Jan 2025 09:45:05 UTC (1,554 KB)
