Editable Image Elements for Controllable Synthesis

Mu, Jiteng; Gharbi, Michaël; Zhang, Richard; Shechtman, Eli; Vasconcelos, Nuno; Wang, Xiaolong; Park, Taesung

Computer Science > Computer Vision and Pattern Recognition

arXiv:2404.16029 (cs)

[Submitted on 24 Apr 2024]

Abstract: Diffusion models have made significant advances in text-guided synthesis tasks. However, editing user-provided images remains challenging, as the high-dimensional noise input space of diffusion models is not naturally suited for image inversion or spatial editing. In this work, we propose an image representation that promotes spatial editing of input images using a diffusion model. Concretely, we learn to encode an input into "image elements" that can faithfully reconstruct an input image. These elements can be intuitively edited by a user, and are decoded by a diffusion model into realistic images. We show the effectiveness of our representation on various image editing tasks, such as object resizing, rearrangement, dragging, de-occlusion, removal, variation, and image composition.

Comments:	Project page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2404.16029 [cs.CV]
	(or arXiv:2404.16029v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2404.16029

Submission history

From: Jiteng Mu
[v1] Wed, 24 Apr 2024 17:59:11 UTC (39,490 KB)
