MVPortrait: Text-Guided Motion and Emotion Control for Multi-view Vivid Portrait Animation

Lin, Yukang; Fung, Hokit; Xu, Jianjin; Ren, Zeping; Lau, Adela S. M.; Yin, Guosheng; Li, Xiu

Computer Science > Computer Vision and Pattern Recognition

arXiv:2503.19383 (cs)

[Submitted on 25 Mar 2025]

Authors: Yukang Lin, Hokit Fung, Jianjin Xu, Zeping Ren, Adela S. M. Lau, Guosheng Yin, Xiu Li

Abstract: Recent portrait animation methods have made significant strides in generating realistic lip synchronization. However, they often lack explicit control over head movements and facial expressions and cannot produce videos from multiple viewpoints, which makes the resulting animations less controllable and less expressive. Moreover, text-guided portrait animation remains underexplored despite its user-friendly nature. We present MVPortrait (Multi-view Vivid Portrait), a novel two-stage text-guided framework that generates expressive multi-view portrait animations that faithfully capture the described motion and emotion. MVPortrait is the first to introduce FLAME as an intermediate representation, effectively embedding facial movements, expressions, and view transformations within its parameter space. In the first stage, we separately train FLAME motion and emotion diffusion models conditioned on text input. In the second stage, we train a multi-view video generation model conditioned on a reference portrait image and the multi-view FLAME rendering sequences from the first stage. Experimental results show that MVPortrait outperforms existing methods in motion and emotion control as well as in view consistency. Furthermore, by leveraging FLAME as a bridge, MVPortrait becomes the first controllable portrait animation framework that is compatible with text, speech, and video as driving signals.
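The two-stage data flow described in the abstract can be pictured with a minimal Python sketch. Everything in it is an illustrative assumption rather than the authors' code: the names (`FlameSequence`, `sample_motion`, `sample_emotion`, `render_views`, `stage_two`), the parameter dimensions, and the image size are hypothetical, and each model is replaced by a stub that only returns arrays of the expected shape.

```python
from dataclasses import dataclass

import numpy as np

# NOTE: every name, dimension, and stub body here is a hypothetical
# placeholder used to show the data flow; none of it is the authors' API.


@dataclass
class FlameSequence:
    pose: np.ndarray        # (T, P) per-frame head/jaw pose parameters
    expression: np.ndarray  # (T, E) per-frame expression coefficients


def sample_motion(text: str, frames: int, pose_dim: int = 6) -> np.ndarray:
    """Stub for the text-conditioned FLAME motion diffusion model."""
    return np.zeros((frames, pose_dim))


def sample_emotion(text: str, frames: int, expr_dim: int = 50) -> np.ndarray:
    """Stub for the text-conditioned FLAME emotion diffusion model."""
    return np.zeros((frames, expr_dim))


def stage_one(motion_text: str, emotion_text: str, frames: int) -> FlameSequence:
    """Stage 1: two separately trained models fill one FLAME sequence."""
    return FlameSequence(
        pose=sample_motion(motion_text, frames),
        expression=sample_emotion(emotion_text, frames),
    )


def render_views(seq: FlameSequence, yaw_degrees: list, size: int = 64) -> np.ndarray:
    """Stub renderer: one FLAME rendering sequence per view, shape (V, T, H, W)."""
    frames = seq.pose.shape[0]
    return np.zeros((len(yaw_degrees), frames, size, size))


def stage_two(reference: np.ndarray, renders: np.ndarray) -> np.ndarray:
    """Stub for the multi-view video model: returns (V, T, H, W, 3) frames.

    A real model would be conditioned on both inputs; this stub simply
    repeats the reference image for every view and frame.
    """
    views, frames, height, width = renders.shape
    return np.broadcast_to(reference, (views, frames, height, width, 3)).copy()


seq = stage_one("nods slowly", "looks happy", frames=8)
renders = render_views(seq, yaw_degrees=[-30, 0, 30])
video = stage_two(np.ones((64, 64, 3)), renders)
print(video.shape)
```

The point of the sketch is only the interface between stages: text goes in, FLAME parameters come out of stage one, and stage two sees renderings of those parameters plus a reference image. Because the bridge is a FLAME parameter sequence, any other source of such parameters (for example one driven by speech or video) could in principle feed the same second stage.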

Comments:	CVPR 2025
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2503.19383 [cs.CV]
	(or arXiv:2503.19383v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2503.19383

Submission history

From: Yukang Lin
[v1] Tue, 25 Mar 2025 06:24:37 UTC (5,187 KB)
