Decouple-Then-Merge: Towards Better Training for Diffusion Models

Ma, Qianli; Ning, Xuefei; Liu, Dongrui; Niu, Li; Zhang, Linfeng

arXiv:2410.06664 (cs)

[Submitted on 9 Oct 2024]

Abstract: Diffusion models are trained by learning a sequence of models that reverse each step of noise corruption. Typically, the model parameters are fully shared across multiple timesteps to enhance training efficiency. However, since the denoising tasks differ at each timestep, the gradients computed at different timesteps may conflict, potentially degrading the overall performance of image generation. To address this issue, this work proposes a Decouple-then-Merge (DeMe) framework, which begins with a pretrained model and finetunes separate models tailored to specific timesteps. We introduce several improved techniques during the finetuning stage to promote effective knowledge sharing while minimizing training interference across timesteps. Finally, after finetuning, these separate models can be merged into a single model in the parameter space, ensuring efficient and practical inference. Experimental results show significant improvements in generation quality on six benchmarks: Stable Diffusion on COCO30K, ImageNet1K, and PartiPrompts, and DDPM on LSUN Church, LSUN Bedroom, and CIFAR10.
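As a rough illustration of what merging finetuned models "in the parameter space" can look like, the sketch below adds a weighted sum of parameter offsets (finetuned minus pretrained) back onto the pretrained parameters. The abstract does not state the merge rule, so this offset-based rule, the function name `merge_in_parameter_space`, the `weights` argument, and the plain dict-of-lists parameter format are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, List

# Hypothetical parameter format: parameter name -> flat list of floats.
Params = Dict[str, List[float]]


def merge_in_parameter_space(
    pretrained: Params,
    finetuned: List[Params],
    weights: List[float],
) -> Params:
    """Merge several finetuned models into one set of parameters.

    Assumed rule (not taken from the paper):
        merged = pretrained + sum_i weights[i] * (finetuned[i] - pretrained)
    """
    if len(finetuned) != len(weights):
        raise ValueError("need exactly one weight per finetuned model")

    merged: Params = {}
    for name, base in pretrained.items():
        out = list(base)  # start from a copy of the pretrained values
        for model, w in zip(finetuned, weights):
            # Add this model's weighted offset from the pretrained values.
            for i, (theta, theta0) in enumerate(zip(model[name], base)):
                out[i] += w * (theta - theta0)
        merged[name] = out
    return merged


# Toy example: two models, each imagined as finetuned on a different timestep range.
pretrained = {"w": [1.0, 2.0]}
finetuned = [{"w": [1.5, 2.0]}, {"w": [1.0, 3.0]}]
merged = merge_in_parameter_space(pretrained, finetuned, [0.5, 0.5])
print(merged)  # {'w': [1.25, 2.5]}
```

Because the result is a single set of parameters with the same shape as the pretrained model, inference cost is unchanged under this sketch; only the choice of `weights` would need tuning.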

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2410.06664 [cs.CV]
	(or arXiv:2410.06664v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2410.06664

Submission history

From: Qianli Ma
[v1] Wed, 9 Oct 2024 08:19:25 UTC (10,867 KB)
