TKG-DM: Training-free Chroma Key Content Generation Diffusion Model

Morita, Ryugo; Frolov, Stanislav; Moser, Brian Bernhard; Shirakawa, Takahiro; Watanabe, Ko; Dengel, Andreas; Zhou, Jinjia

arXiv:2411.15580 (cs)

[Submitted on 23 Nov 2024]

Abstract: Diffusion models have enabled the generation of high-quality images with a strong focus on realism and textual fidelity. Yet, large-scale text-to-image models, such as Stable Diffusion, struggle to generate images where foreground objects are placed over a chroma key background, limiting their ability to separate foreground and background elements without fine-tuning. To address this limitation, we present a novel Training-Free Chroma Key Content Generation Diffusion Model (TKG-DM), which optimizes the initial random noise to produce images with foreground objects on a specifiable color background. Our proposed method is the first to explore the manipulation of the color aspects in initial noise for controlled background generation, enabling precise separation of foreground and background without fine-tuning. Extensive experiments demonstrate that our training-free method outperforms existing methods in both qualitative and quantitative evaluations, matching or surpassing fine-tuned models. Finally, we successfully extend it to other tasks (e.g., consistency models and text-to-video), highlighting its transformative potential across various generative applications where independent control of foreground and background is crucial.
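To illustrate why a uniform, specifiable background color makes foreground and background easy to separate, the sketch below applies a plain chroma key threshold to a toy image. It is not the TKG-DM method and does not touch the diffusion process or the initial noise; the function name `chroma_key_alpha`, the `tolerance` parameter, and the toy image are hypothetical and chosen only for illustration.

```python
import numpy as np


def chroma_key_alpha(image, key_color, tolerance=60.0):
    """Return a binary mask: 1 for foreground pixels, 0 for pixels near key_color.

    image:     H x W x 3 uint8 RGB array (assumed to come from any generator).
    key_color: RGB triple of the background color (hypothetical parameter).
    tolerance: Euclidean RGB distance below which a pixel counts as background.
    """
    img = image.astype(np.float32)
    key = np.asarray(key_color, dtype=np.float32)
    # Per-pixel Euclidean distance to the key color in RGB space.
    distance = np.linalg.norm(img - key, axis=-1)
    return (distance > tolerance).astype(np.uint8)


# Toy 4x4 image: pure green background with a 2x2 reddish "object" in the middle.
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[...] = (0, 255, 0)
image[1:3, 1:3] = (200, 30, 30)

alpha = chroma_key_alpha(image, key_color=(0, 255, 0))

# Attach the mask as an alpha channel to obtain an RGBA cut-out of the foreground.
foreground_rgba = np.dstack([image, alpha * 255])
```

A hard threshold like this only works when the background is close to a single known color, which is the property the abstract says the generated images are meant to have. Real images with soft edges or color spill would need a softer matte.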

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2411.15580 [cs.CV]
	(or arXiv:2411.15580v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2411.15580

Submission history

From: Ryugo Morita
[v1] Sat, 23 Nov 2024 15:07:15 UTC (33,599 KB)
