FRESA: Feedforward Reconstruction of Personalized Skinned Avatars from Few Images

Wang, Rong; Prada, Fabian; Wang, Ziyan; Jiang, Zhongshi; Yin, Chengxiang; Li, Junxuan; Saito, Shunsuke; Santesteban, Igor; Romero, Javier; Joshi, Rohan; Li, Hongdong; Saragih, Jason; Sheikh, Yaser


arXiv:2503.19207 (cs)

[Submitted on 24 Mar 2025 (v1), last revised 4 Apr 2025 (this version, v2)]

Abstract: We present a novel method for reconstructing personalized 3D human avatars with realistic animation from only a few images. Due to the large variations in body shapes, poses, and clothing types, existing methods mostly require hours of per-subject optimization during inference, which limits their practical applications. In contrast, we learn a universal prior from over a thousand clothed humans to achieve instant feedforward generation and zero-shot generalization. Specifically, instead of rigging the avatar with shared skinning weights, we jointly infer personalized avatar shape, skinning weights, and pose-dependent deformations, which effectively improves overall geometric fidelity and reduces deformation artifacts. Moreover, to normalize pose variations and resolve the coupled ambiguity between canonical shapes and skinning weights, we design a 3D canonicalization process to produce pixel-aligned initial conditions, which helps to reconstruct fine-grained geometric details. We then propose a multi-frame feature aggregation to robustly reduce artifacts introduced in canonicalization and fuse a plausible avatar preserving person-specific identities. Finally, we train the model in an end-to-end framework on a large-scale capture dataset, which contains diverse human subjects paired with high-quality 3D scans. Extensive experiments show that our method generates more authentic reconstruction and animation than state-of-the-art methods, and can be directly generalized to inputs from casually taken phone photos. Project page and code are available at this https URL.
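For readers unfamiliar with the term, the role of skinning weights can be illustrated with generic linear blend skinning, where each posed vertex is a weighted blend of its canonical position under every joint's rigid transform. The sketch below is not the authors' implementation: the function names, the two-joint setup, and both weight tables are hypothetical and chosen only to show that different weights produce different posed geometry for the same pose.

```python
import math

def rot_z(angle):
    """3x3 rotation about the z-axis, as nested lists."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transform(R, t, v):
    """Apply the rigid transform (R, t) to a 3D point v."""
    return [sum(R[i][k] * v[k] for k in range(3)) + t[i] for i in range(3)]

def linear_blend_skinning(vertices, weights, rotations, translations):
    """Pose canonical vertices: v' = sum_j w[j] * (R_j v + t_j).

    weights[i][j] is the influence of joint j on vertex i;
    each row is expected to sum to 1.
    """
    posed = []
    for v, w in zip(vertices, weights):
        out = [0.0, 0.0, 0.0]
        for w_j, R, t in zip(w, rotations, translations):
            p = transform(R, t, v)
            out = [o + w_j * p_k for o, p_k in zip(out, p)]
        posed.append(out)
    return posed

# Hypothetical pose: joint 0 stays fixed, joint 1 rotates 90 degrees about z.
rotations = [rot_z(0.0), rot_z(math.pi / 2)]
translations = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
vertices = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]

# Hypothetical weight tables (illustrative values, not from the paper).
shared = [[0.5, 0.5], [0.5, 0.5]]        # one template reused for every subject
personalized = [[1.0, 0.0], [0.0, 1.0]]  # weights inferred per subject

posed_shared = linear_blend_skinning(vertices, shared, rotations, translations)
posed_personalized = linear_blend_skinning(vertices, personalized, rotations, translations)
```

With the blended weights both vertices land halfway between the two joint transforms, while the per-vertex weights let each vertex follow a single joint rigidly. This is only meant to convey why the choice of weights affects deformation quality.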

Comments:	Published in CVPR 2025
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2503.19207 [cs.CV]
	(or arXiv:2503.19207v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2503.19207

Submission history

From: Rong Wang
[v1] Mon, 24 Mar 2025 23:20:47 UTC (37,759 KB)
[v2] Fri, 4 Apr 2025 08:17:08 UTC (37,760 KB)
