Generalization on the Unseen, Logic Reasoning and Degree Curriculum

Abbe, Emmanuel; Bengio, Samy; Lotfi, Aryo; Rizk, Kevin


arXiv:2301.13105 (cs)

[Submitted on 30 Jan 2023 (v1), last revised 20 Nov 2024 (this version, v3)]



Abstract: This paper considers the learning of logical (Boolean) functions with a focus on the generalization on the unseen (GOTU) setting, a strong case of out-of-distribution generalization. This is motivated by the fact that the rich combinatorial nature of data in certain reasoning tasks (e.g., arithmetic/logic) makes representative data sampling challenging, and learning successfully under GOTU gives a first vignette of an 'extrapolating' or 'reasoning' learner. We study how different network architectures trained by (S)GD perform under GOTU and provide both theoretical and experimental evidence that for sparse functions and a class of network models including instances of Transformers, random features models, and linear networks, a min-degree-interpolator is learned on the unseen. More specifically, this means an interpolator of the training data that has minimal Fourier mass on the higher-degree basis elements. These findings lead to two implications: (1) we provide an explanation for the length generalization problem for Boolean functions (e.g., Anil et al. 2022); (2) we introduce a curriculum learning algorithm called Degree-Curriculum that learns monomials more efficiently by incrementing supports. Finally, we discuss extensions to other models or non-sparse regimes where the min-degree bias may still occur or fade, as well as how it can potentially be corrected when undesirable.
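To make the notion of "minimal Fourier mass on the higher-degree basis elements" concrete, here is a minimal sketch on a toy problem. It is an illustration under stated assumptions, not code or an experiment from the paper: the target x1*x2*x3 on {-1,+1}^3, the choice to hold x1 = 1 in all training data, and the helper names `fourier_coefficients` and `degree_profile` are all hypothetical. The sketch checks that x2*x3 agrees with the target on every seen input, then compares how much Fourier mass each function places at each degree.

```python
import itertools
import numpy as np

def fourier_coefficients(f, n):
    """Fourier-Walsh coefficients of f: {-1,+1}^n -> R under the uniform measure."""
    points = list(itertools.product([-1, 1], repeat=n))
    coeffs = {}
    for r in range(n + 1):
        for S in itertools.combinations(range(n), r):
            # chi_S(x) = prod_{i in S} x_i, and the coefficient is E[f(x) * chi_S(x)].
            coeffs[S] = float(np.mean([f(x) * np.prod([x[i] for i in S]) for x in points]))
    return coeffs

def degree_profile(f, n):
    """Fourier mass (sum of squared coefficients) at each degree 0..n."""
    profile = [0.0] * (n + 1)
    for S, c in fourier_coefficients(f, n).items():
        profile[len(S)] += c ** 2
    return profile

# Toy setup (assumption): the first coordinate is always +1 in the training data.
n = 3
target = lambda x: x[0] * x[1] * x[2]   # degree-3 monomial
candidate = lambda x: x[1] * x[2]       # degree-2 monomial

seen = [x for x in itertools.product([-1, 1], repeat=n) if x[0] == 1]
agree_on_seen = all(target(x) == candidate(x) for x in seen)

print("agree on seen inputs:", agree_on_seen)
print("target profile   (deg 0..3):", degree_profile(target, n))
print("candidate profile (deg 0..3):", degree_profile(candidate, n))
```

Both functions interpolate the seen data, but the candidate carries all of its mass at degree 2 while the target carries it at degree 3. In this toy case, a learner with a min-degree bias would therefore be expected to return x2*x3 and to disagree with the target on the unseen inputs where x1 = -1.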

Comments:	extended JMLR version of the original ICML 2023 paper
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2301.13105 [cs.LG]
	(or arXiv:2301.13105v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2301.13105

Submission history

From: Aryo Lotfi
[v1] Mon, 30 Jan 2023 17:44:05 UTC (767 KB)
[v2] Wed, 28 Jun 2023 15:41:49 UTC (1,012 KB)
[v3] Wed, 20 Nov 2024 17:16:01 UTC (494 KB)
