An exactly solvable model for emergence and scaling laws

Nam, Yoonsoo; Fonseca, Nayara; Lee, Seok Hyeong; Mingard, Chris; Louis, Ard A.

arXiv:2404.17563 (cs)

[Submitted on 26 Apr 2024 (v1), last revised 14 Jul 2024 (this version, v2)]

Abstract: Deep learning models can exhibit what appears to be a sudden ability to solve a new problem as training time, training data, or model size increases, a phenomenon known as emergence. In this paper, we present a framework where each new ability (a skill) is represented as a basis function. We solve a simple multi-linear model in this skill-basis, finding analytic expressions for the emergence of new skills, as well as for scaling laws of the loss with training time, data size, model size, and optimal compute ($C$). We compare our detailed calculations to direct simulations of a two-layer neural network trained on multitask sparse parity, where the tasks in the dataset are distributed according to a power-law. Our simple model captures, using a single fit parameter, the sigmoidal emergence of multiple new skills as training time, data size, or model size increases in the neural network.
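To make the idea of sigmoidal emergence concrete, the following is a minimal toy sketch and not the paper's actual model. It assumes a single skill is fit by a product of two scalar parameters a*b, trained by gradient flow on the loss 0.5*p*(s - a*b)^2, where p is how often the skill appears in the data. With a(0) = b(0), the product u = a*b obeys du/dt = 2*p*u*(s - u), whose solution is a logistic curve with a half-way time proportional to 1/p. The function names, the initial value u0, the target strength s, and the power-law exponent are all illustrative assumptions.

```python
import math


def power_law_frequencies(n_skills, exponent=2.0):
    """Normalized skill frequencies p_k proportional to k**(-exponent) (assumed form)."""
    weights = [k ** (-exponent) for k in range(1, n_skills + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def skill_strength_closed_form(t, p, s=1.0, u0=1e-3):
    """Logistic solution of du/dt = 2*p*u*(s - u) with u(0) = u0."""
    return s / (1.0 + (s / u0 - 1.0) * math.exp(-2.0 * p * s * t))


def emergence_time(p, s=1.0, u0=1e-3):
    """Time at which the closed-form solution reaches s / 2."""
    return math.log(s / u0 - 1.0) / (2.0 * p * s)


def skill_strength_euler(t_end, p, s=1.0, u0=1e-3, dt=1e-3):
    """Forward-Euler gradient flow on 0.5*p*(s - a*b)**2 with a(0) = b(0) = sqrt(u0)."""
    a = b = math.sqrt(u0)
    for _ in range(int(round(t_end / dt))):
        err = s - a * b
        # Simultaneous update: both right-hand sides use the old a and b.
        a, b = a + dt * p * b * err, b + dt * p * a * err
    return a * b


if __name__ == "__main__":
    for k, p in enumerate(power_law_frequencies(5), start=1):
        print(f"skill {k}: frequency {p:.3f}, half-way time {emergence_time(p):.2f}")
```

In this toy setting, rarer skills (smaller p) reach half strength later, so a power-law distribution over skills produces a staggered sequence of sigmoidal curves rather than one smooth improvement.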

Subjects:	Machine Learning (cs.LG); Disordered Systems and Neural Networks (cond-mat.dis-nn); Machine Learning (stat.ML)
Cite as:	arXiv:2404.17563 [cs.LG]
	(or arXiv:2404.17563v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2404.17563

Submission history

From: Yoonsoo Nam
[v1] Fri, 26 Apr 2024 17:45:32 UTC (379 KB)
[v2] Sun, 14 Jul 2024 15:28:01 UTC (808 KB)
