On the Element-Wise Representation and Reasoning in Zero-Shot Image Recognition: A Systematic Survey

Guo, Jingcai; Rao, Zhijie; Chen, Zhi; Guo, Song; Zhou, Jingren; Tao, Dacheng

arXiv:2408.04879 (cs)

[Submitted on 9 Aug 2024 (v1), last revised 22 Aug 2024 (this version, v2)]

Abstract: Zero-shot image recognition (ZSIR) aims to empower models to recognize and reason in unseen domains by learning generalized knowledge from limited data in the seen domain. The gist of ZSIR is to perform element-wise representation and reasoning from the input visual space to the target semantic space, a bottom-up modeling paradigm inspired by how humans observe the world, i.e., capturing new concepts by learning and combining basic components or shared characteristics. In recent years, element-wise learning techniques have seen significant progress in ZSIR as well as widespread application. However, to the best of our knowledge, a systematic overview of this topic is still lacking. To enrich the literature and provide a sound basis for its future development, this paper presents a broad review of recent advances in element-wise ZSIR. Concretely, we first attempt to integrate the three basic ZSIR tasks of object recognition, compositional recognition, and foundation model-based open-world recognition into a unified element-wise perspective and provide a detailed taxonomy and analysis of the main research approaches. Then, we collect and summarize key information and benchmarks, such as detailed technical implementations and common datasets. Finally, we sketch out the wide range of related applications, discuss vital challenges, and suggest potential future directions.

Comments:	23 pages, 7 figures, and 3 tables
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2408.04879 [cs.CV]
	(or arXiv:2408.04879v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2408.04879

Submission history

From: Jingcai Guo
[v1] Fri, 9 Aug 2024 05:49:21 UTC (4,271 KB)
[v2] Thu, 22 Aug 2024 09:04:29 UTC (4,253 KB)