MyGo: Consistent and Controllable Multi-View Driving Video Generation with Camera Control

Yao, Yining; Guo, Xi; Ding, Chenjing; Wu, Wei

Computer Science > Computer Vision and Pattern Recognition

arXiv:2409.06189 (cs)

[Submitted on 10 Sep 2024 (v1), last revised 11 Sep 2024 (this version, v2)]

Abstract: High-quality driving video generation is crucial for providing training data for autonomous driving models. However, current generative models rarely address camera motion control in multi-view settings, which is essential for driving video generation. We therefore propose MyGo, an end-to-end video generation framework that uses the motion of onboard cameras as a condition to improve camera controllability and multi-view consistency. MyGo uses additional plug-in modules to inject camera parameters into a pre-trained video diffusion model, retaining as much of the pre-trained model's knowledge as possible. Furthermore, we use epipolar constraints and neighbor-view information when generating each view to enhance spatial-temporal consistency. Experimental results show that MyGo achieves state-of-the-art results in both general camera-controlled video generation and multi-view driving video generation, laying the foundation for more accurate environment simulation in autonomous driving.
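The abstract does not say how the epipolar constraints enter the model, so the following is only a minimal NumPy sketch of the underlying geometry, not the authors' method. It assumes two pinhole cameras with known intrinsics and a known relative pose, and checks that a pixel in one view lies on the epipolar line induced by its match in the other view. All function names and numeric values here are hypothetical and chosen for illustration.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v equals np.cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])

def fundamental_matrix(K1, K2, R, t):
    """Fundamental matrix for two pinhole cameras.

    Assumes the convention X2 = R @ X1 + t, mapping camera-1 coordinates
    to camera-2 coordinates, so that x2^T F x1 = 0 for matching pixels.
    """
    E = skew(t) @ R  # essential matrix
    return np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)

def epipolar_distance(F, x1, x2):
    """Pixel distance from x2 (view 2) to the epipolar line of x1 (view 1)."""
    line = F @ np.append(x1, 1.0)  # line coefficients (a, b, c) in view 2
    return abs(line @ np.append(x2, 1.0)) / np.hypot(line[0], line[1])

def project(K, X):
    """Project a 3D point in camera coordinates to pixel coordinates."""
    x = K @ X
    return x[:2] / x[2]

# Illustrative setup: shared intrinsics, a small yaw between the two views.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
yaw = np.deg2rad(5.0)
R = np.array([[np.cos(yaw), 0.0, np.sin(yaw)],
              [0.0, 1.0, 0.0],
              [-np.sin(yaw), 0.0, np.cos(yaw)]])
t = np.array([0.5, 0.0, 0.1])

X1 = np.array([1.0, 0.5, 10.0])   # 3D point in camera-1 coordinates
X2 = R @ X1 + t                   # same point in camera-2 coordinates
F = fundamental_matrix(K, K, R, t)
dist = epipolar_distance(F, project(K, X1), project(K, X2))
print(f"distance of true match to its epipolar line: {dist:.2e} px")
```

A true correspondence sits on its epipolar line up to floating-point error, while an unrelated pixel generally does not. A distance of this kind could in principle be thresholded to restrict which positions in a neighboring view are consulted for a given pixel, although whether MyGo does this is not stated in the abstract.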

Comments:	Project Page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2409.06189 [cs.CV]
	(or arXiv:2409.06189v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2409.06189

Submission history

From: Xi Guo
[v1] Tue, 10 Sep 2024 03:39:08 UTC (47,126 KB)
[v2] Wed, 11 Sep 2024 11:50:27 UTC (47,126 KB)
