Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency

Liu, Tianqi; Huang, Zihao; Chen, Zhaoxi; Wang, Guangcong; Hu, Shoukang; Shen, Liao; Sun, Huiqiang; Cao, Zhiguo; Li, Wei; Liu, Ziwei

Computer Science > Computer Vision and Pattern Recognition

arXiv:2503.20785 (cs)

[Submitted on 26 Mar 2025]

Abstract: We present Free4D, a novel tuning-free framework for 4D scene generation from a single image. Existing methods either focus on object-level generation, making scene-level generation infeasible, or rely on large-scale multi-view video datasets for expensive training, with limited generalization ability due to the scarcity of 4D scene data. In contrast, our key insight is to distill pre-trained foundation models for a consistent 4D scene representation, which offers promising advantages such as efficiency and generalizability. 1) To achieve this, we first animate the input image using image-to-video diffusion models followed by 4D geometric structure initialization. 2) To turn this coarse structure into spatially and temporally consistent multi-view videos, we design an adaptive guidance mechanism with a point-guided denoising strategy for spatial consistency and a novel latent replacement strategy for temporal coherence. 3) To lift these generated observations into a consistent 4D representation, we propose a modulation-based refinement to mitigate inconsistencies while fully leveraging the generated information. The resulting 4D representation enables real-time, controllable rendering, marking a significant advancement in single-image-based 4D scene generation.

Comments:	Project Page: this https URL, Code: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2503.20785 [cs.CV]
	(or arXiv:2503.20785v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2503.20785

Submission history

From: Tianqi Liu [view email]
[v1] Wed, 26 Mar 2025 17:59:44 UTC (5,451 KB)
