Stable Flow: Vital Layers for Training-Free Image Editing

Avrahami, Omri; Patashnik, Or; Fried, Ohad; Nemchinov, Egor; Aberman, Kfir; Lischinski, Dani; Cohen-Or, Daniel

arXiv:2411.14430 (cs)

[Submitted on 21 Nov 2024]

Abstract: Diffusion models have revolutionized the field of content synthesis and editing. Recent models have replaced the traditional UNet architecture with the Diffusion Transformer (DiT), and employed flow-matching for improved training and sampling. However, they exhibit limited generation diversity. In this work, we leverage this limitation to perform consistent image edits via selective injection of attention features. The main challenge is that, unlike the UNet-based models, DiT lacks a coarse-to-fine synthesis structure, making it unclear in which layers to perform the injection. Therefore, we propose an automatic method to identify "vital layers" within DiT, crucial for image formation, and demonstrate how these layers facilitate a range of controlled stable edits, from non-rigid modifications to object addition, using the same mechanism. Next, to enable real-image editing, we introduce an improved image inversion method for flow models. Finally, we evaluate our approach through qualitative and quantitative comparisons, along with a user study, and demonstrate its effectiveness across multiple applications. The project page is available at this https URL

Comments:	Project page is available at this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
Cite as:	arXiv:2411.14430 [cs.CV]
	(or arXiv:2411.14430v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2411.14430

Submission history

From: Omri Avrahami
[v1] Thu, 21 Nov 2024 18:59:51 UTC (47,913 KB)