OpenSDI: Spotting Diffusion-Generated Images in the Open World

Wang, Yabin; Huang, Zhiwu; Hong, Xiaopeng

Computer Science > Computer Vision and Pattern Recognition

arXiv:2503.19653 (cs)

[Submitted on 25 Mar 2025 (v1), last revised 30 Mar 2025 (this version, v2)]

Abstract: This paper identifies OpenSDI, a challenge for spotting diffusion-generated images in open-world settings. In response to this challenge, we define a new benchmark, the OpenSDI dataset (OpenSDID), which stands out from existing datasets due to its diverse use of large vision-language models that simulate open-world diffusion-based manipulations. Another outstanding feature of OpenSDID is its inclusion of both detection and localization tasks for images manipulated globally and locally by diffusion models. To address the OpenSDI challenge, we propose a Synergizing Pretrained Models (SPM) scheme to build up a mixture of foundation models. This approach exploits a collaboration mechanism with multiple pretrained foundation models to enhance generalization in the OpenSDI context, moving beyond traditional training by synergizing multiple pretrained models through prompting and attending strategies. Building on this scheme, we introduce MaskCLIP, an SPM-based model that aligns Contrastive Language-Image Pre-Training (CLIP) with Masked Autoencoder (MAE). Extensive evaluations on OpenSDID show that MaskCLIP significantly outperforms current state-of-the-art methods for the OpenSDI challenge, achieving remarkable relative improvements of 14.23% in IoU (14.11% in F1) and 2.05% in accuracy (2.38% in F1) compared to the second-best model in localization and detection tasks, respectively. Our dataset and code are available at this https URL.
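The localization results above are reported as relative improvements in IoU and F1. As a rough illustration of what those quantities measure, the following minimal sketch computes pixel-level IoU and F1 between a predicted and a ground-truth binary mask, plus a relative improvement in percent. It is not the authors' evaluation code: the function names `iou_and_f1` and `relative_improvement`, the empty-mask convention, and the toy masks are all assumptions made for this example, and the abstract does not specify thresholds or how scores are averaged over images.

```python
def iou_and_f1(pred, target):
    """Pixel-level IoU and F1 for two binary masks given as nested lists.

    Hypothetical helper for illustration; not taken from the OpenSDI code.
    """
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1      # manipulated pixel correctly flagged
            elif p:
                fp += 1      # clean pixel wrongly flagged
            elif t:
                fn += 1      # manipulated pixel missed
    union = tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    iou = tp / union if union > 0 else 1.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom > 0 else 1.0
    return iou, f1


def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


# Toy masks (made up for this example): 1 marks a manipulated pixel.
pred = [[1, 1, 0, 0],
        [1, 1, 0, 0]]
target = [[1, 1, 1, 0],
          [0, 0, 0, 0]]

iou, f1 = iou_and_f1(pred, target)
print(f"IoU = {iou:.4f}, F1 = {f1:.4f}")
print(f"relative gain = {relative_improvement(0.6, 0.5):.2f}%")
```

For a single mask pair the two scores are tied by F1 = 2·IoU / (1 + IoU), which is why the IoU and F1 gains quoted for localization move together. A relative improvement of 14.23% means the new score is 1.1423 times the baseline score, not 14.23 percentage points higher.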

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2503.19653 [cs.CV]
	(or arXiv:2503.19653v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2503.19653

Submission history

From: Yabin Wang
[v1] Tue, 25 Mar 2025 13:43:16 UTC (23,854 KB)
[v2] Sun, 30 Mar 2025 11:48:54 UTC (23,852 KB)
