LogicAD: Explainable Anomaly Detection via VLM-based Text Feature Extraction

Jin, Er; Feng, Qihui; Mou, Yongli; Decker, Stefan; Lakemeyer, Gerhard; Simons, Oliver; Stegmaier, Johannes

arXiv:2501.01767 (cs)

[Submitted on 3 Jan 2025 (v1), last revised 8 Jan 2025 (this version, v2)]

Abstract: Logical image understanding involves interpreting and reasoning about the relationships and consistency within an image's visual content. This capability is essential in applications such as industrial inspection, where logical anomaly detection is critical for maintaining high quality standards and minimizing costly recalls. Previous research in anomaly detection (AD) has relied on prior knowledge for designing algorithms, which often requires extensive manual annotations, significant computing power, and large amounts of training data. Autoregressive, multimodal Vision Language Models (AVLMs) offer a promising alternative due to their strong performance in visual reasoning across various domains. Despite this, their application to logical AD remains unexplored. In this work, we investigate using AVLMs for logical AD and demonstrate that they are well-suited to the task. Combining AVLMs with format embedding and a logic reasoner, we achieve state-of-the-art (SOTA) performance on the public MVTec LOCO AD benchmark, with an AUROC of 86.0% and an F1-max of 83.7%, along with explanations of anomalies. This outperforms the previous SOTA method by a large margin.
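For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how AUROC and F1-max are commonly computed from per-image anomaly scores. It is not taken from the paper: the function names `auroc` and `f1_max`, the toy labels, and the toy scores are all hypothetical, and the paper's own evaluation code may differ in details such as threshold selection.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random anomalous sample scores
    higher than a random normal one (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one normal and one anomalous sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_max(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Best F1 over all thresholds taken from the observed scores.
    A sample is predicted anomalous when its score is >= the threshold."""
    best = 0.0
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < t)
        if tp:
            best = max(best, 2 * tp / (2 * tp + fp + fn))
    return best


# Hypothetical toy data: 1 = anomalous image, 0 = normal image.
labels = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]

print(f"AUROC:  {auroc(labels, scores):.3f}")
print(f"F1-max: {f1_max(labels, scores):.3f}")
```

AUROC is threshold-free, while F1-max reports the F1 score at the single best threshold, so the two numbers answer slightly different questions about the same set of scores.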

Comments:	Accepted for publication at AAAI 2025; project page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2501.01767 [cs.CV]
	(or arXiv:2501.01767v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2501.01767

Submission history

From: Er Jin
[v1] Fri, 3 Jan 2025 11:40:41 UTC (3,474 KB)
[v2] Wed, 8 Jan 2025 12:11:18 UTC (3,474 KB)
