Manipulating Embeddings of Stable Diffusion Prompts

Deckers, Niklas; Peters, Julia; Potthast, Martin

arXiv:2308.12059v1 (cs)

[Submitted on 23 Aug 2023 (this version), latest version 22 Jun 2024 (v2)]

Abstract: Generative text-to-image models such as Stable Diffusion allow users to generate images based on a textual description, the prompt. Changing the prompt is still the primary means for the user to change a generated image as desired. However, changing the image by reformulating the prompt remains a difficult process of trial and error, which has led to the emergence of prompt engineering as a new field of research. We propose and analyze methods to change the embedding of a prompt directly instead of the prompt text. This allows for more fine-grained and targeted control that takes user intentions into account. Our approach treats the generative text-to-image model as a continuous function and passes gradients between the image space and the prompt embedding space. By addressing different user interaction problems, we apply this idea in three scenarios: (1) Optimization of a metric defined in image space that could measure, for example, image style. (2) Assistance of users in creative tasks by enabling them to navigate the image space along a selection of directions of "near" prompt embeddings. (3) Changing the embedding of the prompt to include information that the user has seen in a particular seed but finds difficult to describe in the prompt. Our experiments demonstrate the feasibility of the described methods.
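Scenario (1) can be illustrated with a minimal sketch. It is not the authors' implementation: the generator below is a hypothetical toy stand-in (a fixed `tanh` layer named `generate`) for the Stable Diffusion pipeline, and `metric`, `WEIGHTS`, and `optimize_embedding` are names assumed purely for illustration. The sketch only shows the core idea of treating the generator as a continuous function and following the gradient of an image-space metric back into the embedding space.

```python
import math

# Hypothetical stand-in for a text-to-image model: a fixed, differentiable
# map from a 3-dimensional "prompt embedding" to a 4-pixel "image".
# A real setup would run the Stable Diffusion denoising loop here instead.
WEIGHTS = [
    [0.5, -0.2, 0.1],
    [0.3, 0.8, -0.5],
    [-0.4, 0.1, 0.9],
    [0.2, 0.2, 0.2],
]


def generate(embedding):
    """Toy generator: pixel_i = tanh(sum_j WEIGHTS[i][j] * embedding[j])."""
    return [math.tanh(sum(w * e for w, e in zip(row, embedding))) for row in WEIGHTS]


def metric(image):
    """Toy image-space metric (assumed for illustration): mean pixel value."""
    return sum(image) / len(image)


def metric_gradient(embedding):
    """Gradient of metric(generate(embedding)) w.r.t. the embedding (chain rule)."""
    image = generate(embedding)
    grad = [0.0] * len(embedding)
    for pixel, row in zip(image, WEIGHTS):
        # d(metric)/d(pre-activation) = tanh'(z) / n_pixels = (1 - pixel^2) / n_pixels
        d_pre = (1.0 - pixel * pixel) / len(image)
        for j, w in enumerate(row):
            grad[j] += d_pre * w
    return grad


def optimize_embedding(embedding, steps=200, lr=0.1):
    """Gradient ascent on the embedding itself; the prompt text is never edited."""
    current = list(embedding)
    for _ in range(steps):
        grad = metric_gradient(current)
        current = [x + lr * g for x, g in zip(current, grad)]
    return current


start = [0.1, -0.3, 0.2]  # stands in for the embedding of the original prompt
optimized = optimize_embedding(start)
```

In a real pipeline the hand-written chain rule would be replaced by automatic differentiation through the generator, and the metric would be a learned or hand-crafted image score such as a style measure.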

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:2308.12059 [cs.CV]
	(or arXiv:2308.12059v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2308.12059

Submission history

From: Niklas Deckers
[v1] Wed, 23 Aug 2023 10:59:41 UTC (11,455 KB)
[v2] Sat, 22 Jun 2024 16:58:19 UTC (12,435 KB)
