Accelerating Image Generation with Sub-path Linear Approximation Model

Xu, Chen; Song, Tianhui; Feng, Weixin; Li, Xubin; Ge, Tiezheng; Zheng, Bo; Wang, Limin

arXiv:2404.13903 (cs)

[Submitted on 22 Apr 2024 (v1), last revised 21 Jul 2024 (this version, v3)]

Abstract: Diffusion models have significantly advanced the state of the art in image, audio, and video generation tasks. However, their applications in practical scenarios are hindered by slow inference speed. Drawing inspiration from the approximation strategies utilized in consistency models, we propose the Sub-path Linear Approximation Model (SLAM), which accelerates diffusion models while maintaining high-quality image generation. SLAM treats the probability flow ODE (PF-ODE) trajectory as a series of PF-ODE sub-paths divided by sampled points, and harnesses sub-path linear (SL) ODEs to form a progressive and continuous error estimation along each individual PF-ODE sub-path. Optimizing over these SL-ODEs allows SLAM to construct denoising mappings with smaller cumulative approximation errors. An efficient distillation method is also developed to facilitate the incorporation of more advanced diffusion models, such as latent diffusion models. Our extensive experimental results demonstrate that SLAM achieves an efficient training regimen, requiring only 6 A100 GPU days to produce a high-quality generative model capable of 2- to 4-step generation. Comprehensive evaluations on the LAION, MS COCO 2014, and MS COCO 2017 datasets also show that SLAM surpasses existing acceleration methods in few-step generation tasks, achieving state-of-the-art performance on both FID and the quality of the generated images.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2404.13903 [cs.CV]
	(or arXiv:2404.13903v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2404.13903

Submission history

From: Tianhui Song
[v1] Mon, 22 Apr 2024 06:25:17 UTC (42,455 KB)
[v2] Tue, 23 Apr 2024 02:33:48 UTC (42,454 KB)
[v3] Sun, 21 Jul 2024 04:57:19 UTC (45,288 KB)
