2nd Place Winning Solution for the CVPR2023 Visual Anomaly and Novelty Detection Challenge: Multimodal Prompting for Data-centric Anomaly Detection

Cao, Yunkang; Xu, Xiaohao; Sun, Chen; Cheng, Yuqi; Gao, Liang; Shen, Weiming

Computer Science > Computer Vision and Pattern Recognition

arXiv:2306.09067 (cs)

[Submitted on 15 Jun 2023 (v1), last revised 5 Sep 2023 (this version, v2)]

Title: 2nd Place Winning Solution for the CVPR2023 Visual Anomaly and Novelty Detection Challenge: Multimodal Prompting for Data-centric Anomaly Detection

Authors: Yunkang Cao, Xiaohao Xu, Chen Sun, Yuqi Cheng, Liang Gao, Weiming Shen

Abstract: This technical report introduces the winning solution of the team Segment Any Anomaly for the CVPR2023 Visual Anomaly and Novelty Detection (VAND) challenge. Going beyond uni-modal prompts, e.g., language prompts, we present a novel framework, i.e., Segment Any Anomaly + (SAA$+$), for zero-shot anomaly segmentation that uses multi-modal prompts to regularize cascaded modern foundation models. Inspired by the strong zero-shot generalization ability of foundation models like Segment Anything, we first explore their assembly (SAA) to leverage diverse multi-modal prior knowledge for anomaly localization. We then introduce multimodal prompts (SAA$+$) derived from domain expert knowledge and target image context to enable the non-parameter adaptation of foundation models to anomaly segmentation. The proposed SAA$+$ model achieves state-of-the-art performance on several anomaly segmentation benchmarks, including VisA and MVTec-AD, in the zero-shot setting. We will release the code of our winning solution for the CVPR2023 VAND challenge.

Comments:	The first two authors contributed equally. CVPR workshop challenge report. arXiv admin note: substantial text overlap with arXiv:2305.10724
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2306.09067 [cs.CV]
	(or arXiv:2306.09067v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2306.09067

Submission history

From: Yunkang Cao
[v1] Thu, 15 Jun 2023 11:49:44 UTC (4,330 KB)
[v2] Tue, 5 Sep 2023 14:44:04 UTC (4,330 KB)
