CLIP-Actor: Text-Driven Recommendation and Stylization for Animating Human Meshes

Youwang, Kim; Ji-Yeon, Kim; Oh, Tae-Hyun

arXiv:2206.04382 (cs)

[Submitted on 9 Jun 2022 (v1), last revised 21 Jul 2022 (this version, v2)]

Title: CLIP-Actor: Text-Driven Recommendation and Stylization for Animating Human Meshes

Authors: Kim Youwang, Kim Ji-Yeon, Tae-Hyun Oh

Abstract: We propose CLIP-Actor, a text-driven motion recommendation and neural mesh stylization system for human mesh animation. CLIP-Actor animates a 3D human mesh to conform to a text prompt by recommending a motion sequence and optimizing mesh style attributes. We build a text-driven human motion recommendation system by leveraging a large-scale human motion dataset with language labels. Given a natural language prompt, CLIP-Actor suggests a text-conforming human motion in a coarse-to-fine manner. Then, our novel zero-shot neural style optimization detailizes and texturizes the recommended mesh sequence to conform to the prompt in a temporally-consistent and pose-agnostic manner. This is distinctive in that prior work fails to generate plausible results when the pose of an artist-designed mesh does not conform to the text from the beginning. We further propose the spatio-temporal view augmentation and mask-weighted embedding attention, which stabilize the optimization process by leveraging multi-frame human motion and rejecting poorly rendered views. We demonstrate that CLIP-Actor produces plausible and human-recognizable style 3D human mesh in motion with detailed geometry and texture solely from a natural language prompt.

Comments:	Accepted at ECCV 2022. Project page and code are linked from the arXiv listing.
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
Cite as:	arXiv:2206.04382 [cs.CV]
	(or arXiv:2206.04382v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2206.04382

Submission history

From: Youwang Kim
[v1] Thu, 9 Jun 2022 09:50:39 UTC (45,825 KB)
[v2] Thu, 21 Jul 2022 07:43:04 UTC (15,581 KB)
