Lifting Motion to the 3D World via 2D Diffusion

Li, Jiaman; Liu, C. Karen; Wu, Jiajun

Computer Science > Computer Vision and Pattern Recognition

arXiv:2411.18808 (cs)

[Submitted on 27 Nov 2024]

Abstract: Estimating 3D motion from 2D observations is a long-standing research challenge. Prior work typically requires training on datasets containing ground-truth 3D motions, which limits its applicability to activities that are well represented in existing motion capture data. This dependency particularly hinders generalization to out-of-distribution scenarios or subjects for which collecting 3D ground truth is challenging, such as complex athletic movements or animal motion. We introduce MVLift, a novel approach that predicts global 3D motion, including both joint rotations and root trajectories in the world coordinate system, using only 2D pose sequences for training. Our multi-stage framework leverages 2D motion diffusion models to progressively generate consistent 2D pose sequences across multiple views, a key step in recovering accurate global 3D motion. MVLift generalizes across various domains, including human poses, human-object interactions, and animal poses. Despite not requiring 3D supervision, it outperforms prior work on five datasets, including methods that require 3D supervision.
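
The abstract does not spell out how 3D motion is recovered once consistent multi-view 2D poses are available. As a rough illustration of why cross-view consistency matters, the sketch below triangulates a single joint from rays cast by several cameras with known geometry. This is generic least-squares ray triangulation, not MVLift's procedure; the function names (`triangulate_rays`, `solve3`, `det3`) and the camera setup are assumptions made for the example.

```python
from math import sqrt


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(A, b):
    """Solve the 3x3 system A x = b with Cramer's rule."""
    det = det3(A)
    if abs(det) < 1e-12:
        raise ValueError("rays are parallel; the point is not determined")
    solution = []
    for col in range(3):
        # Replace one column of A with b, as Cramer's rule prescribes.
        m = [[b[r] if k == col else A[r][k] for k in range(3)]
             for r in range(3)]
        solution.append(det3(m) / det)
    return solution


def triangulate_rays(centers, directions):
    """Return the 3D point closest, in least squares, to a set of rays.

    Ray i starts at centers[i] (a camera position) and points along
    directions[i] (the back-projected 2D joint observation). The point x
    minimizes sum_i ||(I - d_i d_i^T)(x - c_i)||^2, which reduces to a
    3x3 linear system.
    """
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for c, d in zip(centers, directions):
        norm = sqrt(sum(v * v for v in d))
        d = [v / norm for v in d]
        for r in range(3):
            for k in range(3):
                # Entry of the projector onto the plane orthogonal to d.
                p = (1.0 if r == k else 0.0) - d[r] * d[k]
                A[r][k] += p
                b[r] += p * c[k]
    return solve3(A, b)


# Hypothetical setup: one joint observed from two camera positions.
joint = triangulate_rays(
    centers=[[0.0, 0.0, 0.0], [4.0, 0.0, 0.0]],
    directions=[[1.0, 2.0, 5.0], [-3.0, 2.0, 5.0]],
)
```

With a single view the system is singular and `solve3` raises `ValueError`, which mirrors the depth ambiguity of a single 2D observation. Two or more views whose rays agree on the same joint make the 3D position well determined.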

Comments:	project page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2411.18808 [cs.CV]
	(or arXiv:2411.18808v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2411.18808

Submission history

From: Jiaman Li
[v1] Wed, 27 Nov 2024 23:26:56 UTC (7,781 KB)
