Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction

Nagarajan, Vaishnavh; Wu, Chen Henry; Ding, Charles; Raghunathan, Aditi

arXiv:2504.15266 (cs)

[Submitted on 21 Apr 2025]

Abstract: We design a suite of minimal algorithmic tasks that are a loose abstraction of open-ended real-world tasks. This allows us to cleanly and controllably quantify the creative limits of the present-day language model. Much like real-world tasks that require a creative, far-sighted leap of thought, our tasks require an implicit, open-ended stochastic planning step that either (a) discovers new connections in an abstract knowledge graph (like in wordplay, drawing analogies, or research) or (b) constructs new patterns (like in designing math problems or new proteins). In these tasks, we empirically and conceptually argue how next-token learning is myopic and memorizes excessively; comparatively, multi-token approaches, namely teacherless training and diffusion models, excel in producing diverse and original output. Secondly, in our tasks, we find that to elicit randomness from the Transformer without hurting coherence, it is better to inject noise right at the input layer (via a method we dub hash-conditioning) rather than defer to temperature sampling from the output layer. Thus, our work offers a principled, minimal test-bed for analyzing open-ended creative skills, and offers new arguments for going beyond next-token learning and softmax-based sampling. We make part of the code available under this https URL
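
The contrast the abstract draws between output-layer and input-layer randomness can be illustrated with a small sketch. This is not the authors' code: the abstract does not say how the noise is encoded, so the function names, the form of the prefix (a random alphanumeric string prepended to the prompt), and the toy stand-in for a greedily decoded model are all assumptions made for illustration.

```python
import hashlib
import math
import random
import string


def temperature_sample(logits, temperature, rng):
    """Output-layer randomness: draw an index from softmax(logits / temperature)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def random_hash_prefix(length, rng):
    """Input-layer randomness (assumed form): a random string to prepend to the prompt."""
    alphabet = string.ascii_lowercase + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def toy_greedy_decode(model_input, vocab):
    """Hypothetical stand-in for a trained model decoded greedily.

    It is deterministic given its input, so any variety in its output
    has to come from variety in the input itself.
    """
    digest = hashlib.sha256(model_input.encode("utf-8")).digest()
    return vocab[digest[0] % len(vocab)]


rng = random.Random(0)
vocab = ["alpha", "beta", "gamma", "delta"]
prompt = "generate:"

# Input-side noise: a fresh random prefix per call, then deterministic decoding.
from_prefix = [
    toy_greedy_decode(random_hash_prefix(8, rng) + " " + prompt, vocab)
    for _ in range(50)
]

# Output-side noise: a fixed input, with randomness applied to the logits.
from_temperature = [
    vocab[temperature_sample([2.0, 1.0, 0.5, 0.1], 1.0, rng)] for _ in range(50)
]
```

In the first case the decoder makes no random choices, so each output is fully determined by the prefix it was given; in the second, the randomness enters only when a token is drawn from the softmax. The sketch shows where the noise is injected in each scheme and makes no claim about which produces better outputs.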

Comments:	37 pages
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2504.15266 [cs.LG]
	(or arXiv:2504.15266v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2504.15266

Submission history

From: Chen Henry Wu
[v1] Mon, 21 Apr 2025 17:47:46 UTC (1,024 KB)
