Unifying Autoregressive and Diffusion-Based Sequence Generation

Fathi, Nima; Scholak, Torsten; Noël, Pierre-André

arXiv:2504.06416 (cs)

[Submitted on 8 Apr 2025]

Authors: Nima Fathi, Torsten Scholak, Pierre-André Noël

Abstract: We present significant extensions to diffusion-based sequence generation models, blurring the line between them and autoregressive language models. First, we introduce hyperschedules, which assign distinct noise schedules to individual token positions and recover both autoregressive models (e.g., GPT) and conventional diffusion models (e.g., SEDD, MDLM) as special cases. Second, we propose two hybrid token-wise noising processes that interpolate between absorbing and uniform processes, enabling the model to fix past mistakes, and we introduce a novel inference algorithm that leverages this new feature in a simplified context inspired by MDLM. To support efficient training and inference, we design attention masks compatible with KV-caching. Our methods achieve state-of-the-art perplexity and generate diverse, high-quality sequences across standard benchmarks, suggesting a promising path for autoregressive diffusion-based sequence generation.
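To make the hyperschedule idea concrete, here is a minimal sketch of one way to read it, not the authors' implementation. It assumes a hyperschedule can be represented as a matrix whose entry at (step, position) is a noise level in [0, 1], with 1 meaning fully noised and 0 meaning clean. The function names, the linear ramps, and the `width` parameter are all hypothetical; the abstract does not specify a parametrization.

```python
import numpy as np

def diffusion_hyperschedule(num_steps, seq_len):
    """All positions share one schedule (assumed linear), as in conventional diffusion."""
    levels = np.linspace(1.0, 0.0, num_steps + 1)  # noise level at each step
    return np.tile(levels[:, None], (1, seq_len))  # same column for every position

def autoregressive_hyperschedule(seq_len):
    """Position i stays fully noised through step i, then is clean from step i + 1 on."""
    steps = np.arange(seq_len + 1)[:, None]
    positions = np.arange(seq_len)[None, :]
    return (steps <= positions).astype(float)

def sliding_hyperschedule(seq_len, width):
    """Hypothetical in-between case: each position denoises linearly over
    `width` steps, and the ramp is shifted by one step per position."""
    num_steps = seq_len + width - 1
    steps = np.arange(num_steps + 1)[:, None]
    positions = np.arange(seq_len)[None, :]
    return np.clip(1.0 - (steps - positions) / width, 0.0, 1.0)

if __name__ == "__main__":
    print(autoregressive_hyperschedule(3))
    print(sliding_hyperschedule(3, 2))
```

Under these assumptions, `sliding_hyperschedule(seq_len, 1)` coincides with the autoregressive schedule, while larger `width` values let several neighbouring positions be partially noised at the same step, which is the kind of interpolation between the two model families that the abstract describes.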

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2504.06416 [cs.LG]
	(or arXiv:2504.06416v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2504.06416

Submission history

From: Pierre-André Noël
[v1] Tue, 8 Apr 2025 20:32:10 UTC (764 KB)
