RealVVT: Towards Photorealistic Video Virtual Try-on via Spatio-Temporal Consistency

Li, Siqi; Jiang, Zhengkai; Zhou, Jiawei; Liu, Zhihong; Chi, Xiaowei; Wang, Haoqian

arXiv:2501.08682 (cs)

[Submitted on 15 Jan 2025]

Abstract: Virtual try-on has emerged as a pivotal task at the intersection of computer vision and fashion, aimed at digitally simulating how clothing items fit on the human body. Despite notable progress in single-image virtual try-on (VTO), current methodologies often struggle to preserve a consistent and authentic appearance of clothing across extended video sequences. This challenge arises from the complexities of capturing dynamic human pose and maintaining target clothing characteristics. We leverage pre-existing video foundation models to introduce RealVVT, a photoRealistic Video Virtual Try-on framework tailored to bolster stability and realism within dynamic video contexts. Our methodology encompasses a Clothing & Temporal Consistency strategy, an Agnostic-guided Attention Focus Loss mechanism to ensure spatial consistency, and a Pose-guided Long Video VTO technique adept at handling extended video sequences. Extensive experiments across various datasets confirm that our approach outperforms existing state-of-the-art models in both single-image and video VTO tasks, offering a viable solution for practical applications within the realms of fashion e-commerce and virtual fitting environments.

Comments:	10 pages (8 pages main text, 2 pages references), 5 figures in the main text, and 4 pages supplementary materials with 3 additional figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
MSC classes:	68T99
Cite as:	arXiv:2501.08682 [cs.CV]
	(or arXiv:2501.08682v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2501.08682

Submission history

From: Siqi Li
[v1] Wed, 15 Jan 2025 09:22:38 UTC (6,817 KB)