DreamBlend: Advancing Personalized Fine-tuning of Text-to-Image Diffusion Models

Ram, Shwetha; Neiman, Tal; Feng, Qianli; Stuart, Andrew; Tran, Son; Chilimbi, Trishul

arXiv:2411.19390 (cs)

[Submitted on 28 Nov 2024]

Abstract: Given a small number of images of a subject, personalized image generation techniques can fine-tune large pre-trained text-to-image diffusion models to generate images of the subject in novel contexts, conditioned on text prompts. Doing so involves a trade-off between prompt fidelity, subject fidelity and diversity. As the pre-trained model is fine-tuned, earlier checkpoints synthesize images with low subject fidelity but high prompt fidelity and diversity. In contrast, later checkpoints generate images with low prompt fidelity and diversity but high subject fidelity. This inherent trade-off limits the prompt fidelity, subject fidelity and diversity of generated images. In this work, we propose DreamBlend to combine the prompt fidelity from earlier checkpoints and the subject fidelity from later checkpoints during inference. We perform cross-attention-guided image synthesis from a later checkpoint, guided by an image generated by an earlier checkpoint for the same prompt. This enables generation of images with better subject fidelity, prompt fidelity and diversity on challenging prompts, outperforming state-of-the-art fine-tuning methods.
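To make the idea of cross-attention guidance concrete, the toy sketch below nudges one set of image features so that its cross-attention maps move toward a reference set of maps. It is not the authors' implementation: the abstract does not state the guidance objective, so the squared-difference loss, the finite-difference gradient, the step size, and every name here (`attention_maps`, `guidance_loss`, `guided_step`, `early_features`, `late_features`) are assumptions chosen for illustration. Random vectors stand in for the features that an earlier and a later checkpoint would produce for the same prompt.

```python
import math
import random


def attention_maps(features, token_keys):
    """Softmax cross-attention: one row of token weights per image position."""
    scale = math.sqrt(len(token_keys[0]))
    maps = []
    for query in features:
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in token_keys]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        maps.append([e / total for e in exps])
    return maps


def guidance_loss(features, token_keys, reference_maps):
    """Assumed objective: mean squared gap between current and reference maps."""
    current = attention_maps(features, token_keys)
    diffs = [(c - r) ** 2
             for cur_row, ref_row in zip(current, reference_maps)
             for c, r in zip(cur_row, ref_row)]
    return sum(diffs) / len(diffs)


def guided_step(features, token_keys, reference_maps, step_size=1.0, eps=1e-5):
    """One gradient-descent step on the features (central finite differences)."""
    grads = []
    for row in features:
        grad_row = []
        for j in range(len(row)):
            original = row[j]
            row[j] = original + eps
            loss_plus = guidance_loss(features, token_keys, reference_maps)
            row[j] = original - eps
            loss_minus = guidance_loss(features, token_keys, reference_maps)
            row[j] = original  # restore the perturbed entry
            grad_row.append((loss_plus - loss_minus) / (2 * eps))
        grads.append(grad_row)
    return [[x - step_size * g for x, g in zip(row, grad_row)]
            for row, grad_row in zip(features, grads)]


rng = random.Random(0)


def random_matrix(rows, cols):
    return [[rng.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]


token_keys = random_matrix(4, 8)      # hypothetical prompt-token keys
early_features = random_matrix(6, 8)  # stand-in for an earlier checkpoint
late_features = random_matrix(6, 8)   # stand-in for a later checkpoint

# The earlier checkpoint supplies the reference attention layout.
reference_maps = attention_maps(early_features, token_keys)
initial_loss = guidance_loss(late_features, token_keys, reference_maps)

guided = late_features
for _ in range(20):
    guided = guided_step(guided, token_keys, reference_maps)
final_loss = guidance_loss(guided, token_keys, reference_maps)

print(f"guidance loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

In a real diffusion pipeline the quantity being updated would more plausibly be the latent at each denoising step, with gradients obtained by automatic differentiation rather than finite differences; the sketch only shows how a reference attention layout can steer a second generation.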

Comments:	Accepted to WACV 2025
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
Cite as:	arXiv:2411.19390 [cs.CV]
	(or arXiv:2411.19390v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2411.19390

Submission history

From: Shwetha Ram
[v1] Thu, 28 Nov 2024 21:49:31 UTC (33,784 KB)
