Instruct-ReID: A Multi-purpose Person Re-identification Task with Instructions

He, Weizhen; Deng, Yiheng; Tang, Shixiang; Chen, Qihao; Xie, Qingsong; Wang, Yizhou; Bai, Lei; Zhu, Feng; Zhao, Rui; Ouyang, Wanli; Qi, Donglian; Yan, Yunfeng

arXiv:2306.07520 (cs)

[Submitted on 13 Jun 2023 (v1), last revised 31 Dec 2023 (this version, v4)]

Abstract: Human intelligence can retrieve any person according to both visual and language descriptions. However, the current computer vision community studies specific person re-identification (ReID) tasks in different scenarios separately, which limits their real-world applications. This paper strives to resolve this problem by proposing a new instruct-ReID task that requires the model to retrieve images according to given image or language instructions. Instruct-ReID is a more general ReID setting, in which six existing ReID tasks can be viewed as special cases by designing different instructions. We propose a large-scale OmniReID benchmark and an adaptive triplet loss as a baseline method to facilitate research in this new setting. Experimental results show that the proposed multi-purpose ReID model, trained on our OmniReID benchmark without fine-tuning, achieves mAP gains of +0.5%, +0.6%, and +7.7% on Market1501, MSMT17, and CUHK03 for traditional ReID; +6.4%, +7.1%, and +11.2% on PRCC, VC-Clothes, and LTCC for clothes-changing ReID; +11.7% on COCAS+ real2 for clothes-template-based clothes-changing ReID when using only RGB images; and +24.9% on COCAS+ real2 for our newly defined language-instructed ReID. It also achieves gains of +4.3% on LLCM for visible-infrared ReID and +2.6% on CUHK-PEDES for text-to-image ReID. The datasets, the model, and code will be available at this https URL.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2306.07520 [cs.CV]
	(or arXiv:2306.07520v4 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2306.07520

Submission history

From: Weizhen He
[v1] Tue, 13 Jun 2023 03:25:33 UTC (3,545 KB)
[v2] Tue, 4 Jul 2023 13:59:04 UTC (3,500 KB)
[v3] Fri, 7 Jul 2023 04:57:22 UTC (3,500 KB)
[v4] Sun, 31 Dec 2023 16:54:05 UTC (5,893 KB)
