Optical-Flow Guided Prompt Optimization for Coherent Video Generation

Nam, Hyelin; Kim, Jaemin; Lee, Dohun; Ye, Jong Chul

arXiv:2411.15540 (cs)

[Submitted on 23 Nov 2024]

Abstract: While text-to-video diffusion models have made significant strides, many still struggle to generate temporally consistent videos. Within diffusion frameworks, guidance techniques have proven effective at enhancing output quality during inference; however, applying them to video diffusion models introduces the additional complexity of handling computations across entire sequences. To address this, we propose a novel framework called MotionPrompt that guides the video generation process via optical flow. Specifically, we train a discriminator to distinguish the optical flow between random pairs of frames in real videos from that in generated ones. Given that prompts can influence the entire video, we optimize learnable token embeddings during reverse sampling steps using gradients from the trained discriminator applied to random frame pairs. This approach allows our method to generate visually coherent video sequences that closely reflect natural motion dynamics, without compromising the fidelity of the generated content. We demonstrate the effectiveness of our approach across various models.
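
The optimization loop described in the abstract can be illustrated with a deliberately tiny, pure-Python toy. This is a sketch under strong assumptions, not the authors' implementation: the "video" is a 1-D object position per frame, a single scalar `jitter` stands in for the learnable token embeddings, `toy_flow` stands in for an optical-flow estimator, `toy_discriminator` is a hand-set function rather than a trained network, and finite differences replace backpropagated gradients. All function names are hypothetical.

```python
import math
import random

N_FRAMES = 8  # length of the toy "video"


def toy_video(jitter, n_frames=N_FRAMES):
    """Hypothetical stand-in for a video sampler: one object position per frame.

    `jitter` plays the role of the learnable token embedding. A value of 0
    gives perfectly smooth motion (one unit per frame); larger magnitudes
    add frame-to-frame flicker.
    """
    return [t + jitter * (-1) ** t for t in range(n_frames)]


def toy_flow(frames, i, j):
    """Stand-in for optical flow between frames i and j (object displacement)."""
    return frames[j] - frames[i]


def toy_discriminator(flow, gap):
    """Hand-set stand-in for a trained discriminator.

    Returns a probability that the flow looks "real", where real motion in
    this toy world is exactly one unit per frame, so flow should equal `gap`.
    """
    logit = 2.0 - (flow - gap) ** 2
    return 1.0 / (1.0 + math.exp(-logit))


def loss(jitter, i, j):
    """Negative log-probability that the flow of frame pair (i, j) is real."""
    frames = toy_video(jitter)
    return -math.log(toy_discriminator(toy_flow(frames, i, j), j - i))


def optimize_embedding(jitter, steps=50, lr=0.05, eps=1e-4, seed=0):
    """Update the embedding once per step using a random frame pair.

    Each loop iteration stands in for one reverse sampling step. The gradient
    is a central finite difference (an assumption made to avoid any autodiff
    dependency in this sketch).
    """
    rng = random.Random(seed)
    for _ in range(steps):
        i, j = sorted(rng.sample(range(N_FRAMES), 2))
        grad = (loss(jitter + eps, i, j) - loss(jitter - eps, i, j)) / (2 * eps)
        jitter -= lr * grad
    return jitter


initial = 0.8
final = optimize_embedding(initial)
print(f"jitter before: {initial:.3f}, after: {final:.3f}")
```

In this toy, the discriminator's gradient pushes `jitter` toward zero, which removes the flicker while leaving the underlying motion untouched. Only the embedding is updated, mirroring the abstract's point that a prompt-level variable can influence the entire sequence even though each step inspects only one random frame pair.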

Comments:	project page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
Cite as:	arXiv:2411.15540 [cs.CV]
	(or arXiv:2411.15540v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2411.15540

Submission history

From: Jong Chul Ye
[v1] Sat, 23 Nov 2024 12:26:52 UTC (30,836 KB)
