Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion

Chen, Boyuan; Monso, Diego Marti; Du, Yilun; Simchowitz, Max; Tedrake, Russ; Sitzmann, Vincent

arXiv:2407.01392 (cs)

[Submitted on 1 Jul 2024 (v1), last revised 4 Jul 2024 (this version, v3)]

Abstract: This paper presents Diffusion Forcing, a new training paradigm where a diffusion model is trained to denoise a set of tokens with independent per-token noise levels. We apply Diffusion Forcing to sequence generative modeling by training a causal next-token prediction model to generate one or several future tokens without fully diffusing past ones. Our approach is shown to combine the strengths of next-token prediction models, such as variable-length generation, with the strengths of full-sequence diffusion models, such as the ability to guide sampling to desirable trajectories. Our method offers a range of additional capabilities, such as (1) rolling out sequences of continuous tokens, such as video, with lengths past the training horizon, where baselines diverge, and (2) new sampling and guiding schemes that uniquely profit from Diffusion Forcing's variable-horizon and causal architecture, and which lead to marked performance gains in decision-making and planning tasks. In addition to its empirical success, our method is proven to optimize a variational lower bound on the likelihoods of all subsequences of tokens drawn from the true joint distribution. Project website: this https URL
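
To make the phrase "independent per-token noise levels" concrete, the following is a minimal sketch of the noising step only, not the authors' implementation. It assumes scalar tokens, a standard DDPM-style forward process, and a linear beta schedule; the function names `make_alpha_bars` and `noise_tokens_independently`, the schedule endpoints, and the number of levels are all hypothetical choices made for illustration. A denoising model would then be trained to recover the noise (or the clean token) from the noisy sequence and the per-token levels.

```python
import math
import random


def make_alpha_bars(num_levels, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factors for a linear beta schedule.

    The linear schedule and its endpoints are assumptions, not taken from the paper.
    """
    alpha_bars, prod = [], 1.0
    for k in range(num_levels):
        beta = beta_start + (beta_end - beta_start) * k / max(num_levels - 1, 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def noise_tokens_independently(tokens, alpha_bars, rng):
    """Corrupt each token with its own, independently sampled noise level.

    Full-sequence diffusion would draw one level for the whole sequence;
    here every position gets a separate draw.
    """
    levels = [rng.randrange(len(alpha_bars)) for _ in tokens]
    eps = [rng.gauss(0.0, 1.0) for _ in tokens]
    noisy = [
        math.sqrt(alpha_bars[k]) * x + math.sqrt(1.0 - alpha_bars[k]) * e
        for x, k, e in zip(tokens, levels, eps)
    ]
    return noisy, levels, eps


# Hypothetical usage on a toy sequence of four scalar tokens.
rng = random.Random(0)
alpha_bars = make_alpha_bars(1000)
tokens = [0.5, -1.0, 2.0, 0.0]
noisy, levels, eps = noise_tokens_independently(tokens, alpha_bars, rng)
```

Because each position carries its own level, a single training sequence can mix nearly clean tokens with heavily corrupted ones, which is the property the abstract connects to generating future tokens without fully diffusing past ones.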

Comments:	Project website: this https URL Code: this https URL
Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
Cite as:	arXiv:2407.01392 [cs.LG]
	(or arXiv:2407.01392v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2407.01392

Submission history

From: Boyuan Chen [view email]
[v1] Mon, 1 Jul 2024 15:43:25 UTC (22,035 KB)
[v2] Tue, 2 Jul 2024 15:39:29 UTC (22,035 KB)
[v3] Thu, 4 Jul 2024 04:51:10 UTC (22,036 KB)
