Deep Learning Technique for Human Parsing: A Survey and Outlook

Yang, Lu; Jia, Wenhe; Li, Shan; Song, Qing

doi:10.1007/s11263-024-02031-9

arXiv:2301.00394 (cs)

[Submitted on 1 Jan 2023 (v1), last revised 14 Mar 2024 (this version, v2)]

Abstract: Human parsing aims to partition humans in images or videos into multiple pixel-level semantic parts. Over the last decade, it has attracted significantly increased interest in the computer vision community and has been used in a broad range of practical applications, including security monitoring, social media, and visual special effects. Although deep learning-based human parsing solutions have made remarkable achievements, many important concepts, existing challenges, and potential research directions remain unclear. In this survey, we comprehensively review three core sub-tasks: single human parsing, multiple human parsing, and video human parsing. For each, we introduce the task setting, background concepts, relevant problems and applications, representative literature, and datasets. We also present quantitative performance comparisons of the reviewed methods on benchmark datasets. Additionally, to promote sustainable development of the community, we put forward a transformer-based human parsing framework, providing a high-performance baseline for follow-up research through universal, concise, and extensible solutions. Finally, we point out a set of under-investigated open issues in this field and suggest new directions for future study. We also provide a regularly updated project page to continuously track recent developments in this fast-advancing field.

Comments:	Accepted for publication in International Journal of Computer Vision (IJCV)
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2301.00394 [cs.CV]
	(or arXiv:2301.00394v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2301.00394
Related DOI:	https://doi.org/10.1007/s11263-024-02031-9

Submission history

From: Lu Yang
[v1] Sun, 1 Jan 2023 12:39:57 UTC (4,015 KB)
[v2] Thu, 14 Mar 2024 02:00:19 UTC (3,767 KB)
