Mask Factory: Towards High-quality Synthetic Data Generation for Dichotomous Image Segmentation

Qian, Haotian; Chen, YD; Lou, Shengtao; Khan, Fahad Shahbaz; Jin, Xiaogang; Fan, Deng-Ping

arXiv:2412.19080 (cs)

[Submitted on 26 Dec 2024]

Abstract: Dichotomous Image Segmentation (DIS) tasks require highly precise annotations, and traditional dataset creation methods are labor-intensive, costly, and dependent on extensive domain expertise. Although synthetic data is a promising solution to these challenges, current generative models and techniques struggle with scene deviations, noise-induced errors, and limited training sample variability. To address these issues, we introduce Mask Factory, a novel approach that provides a scalable solution for generating diverse and precise datasets, markedly reducing preparation time and costs. We first introduce a general mask editing method that combines rigid and non-rigid editing techniques to generate high-quality synthetic masks. Specifically, rigid editing leverages geometric priors from diffusion models to achieve precise viewpoint transformations under zero-shot conditions, while non-rigid editing employs adversarial training and self-attention mechanisms for complex, topologically consistent modifications. Then, we generate pairs of high-resolution images and accurate segmentation masks using a multi-conditional control generation method. Finally, our experiments on the widely used DIS5K benchmark demonstrate superior performance in quality and efficiency compared to existing methods. The code is available at this https URL.
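
The distinction between rigid and non-rigid mask editing drawn in the abstract can be illustrated with a toy geometric sketch. This is not the paper's method: the abstract describes diffusion-based geometric priors for rigid edits and adversarial training with self-attention for non-rigid edits, whereas the sketch below swaps in a plain rotation and a sinusoidal row shift on a small binary mask. All function names and parameters (`make_square_mask`, `warp`, `rigid_edit`, `non_rigid_edit`, `area`) are hypothetical and exist only for illustration.

```python
import math


def make_square_mask(size=32, lo=8, hi=24):
    """Binary mask (list of rows) with a filled square; a toy stand-in for a DIS mask."""
    return [[1 if lo <= y < hi and lo <= x < hi else 0 for x in range(size)]
            for y in range(size)]


def warp(mask, inverse_map):
    """Resample a binary mask with nearest-neighbour lookup.

    inverse_map(y, x) returns the source coordinates for output pixel (y, x).
    Pixels whose source falls outside the mask are set to 0.
    """
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = inverse_map(y, x)
            iy, ix = int(round(sy)), int(round(sx))
            if 0 <= iy < h and 0 <= ix < w:
                out[y][x] = mask[iy][ix]
    return out


def rigid_edit(mask, angle_deg):
    """Rigid edit (assumed stand-in): rotate the mask about its centre."""
    h, w = len(mask), len(mask[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))

    def inverse_map(y, x):
        dy, dx = y - cy, x - cx
        return cy + c * dy - s * dx, cx + s * dy + c * dx

    return warp(mask, inverse_map)


def non_rigid_edit(mask, amplitude=2.0, period=16.0):
    """Non-rigid edit (assumed stand-in): shift each row by a smooth sinusoidal offset."""
    def inverse_map(y, x):
        return y, x + amplitude * math.sin(2 * math.pi * y / period)

    return warp(mask, inverse_map)


def area(mask):
    """Number of foreground pixels in the mask."""
    return sum(map(sum, mask))


if __name__ == "__main__":
    mask = make_square_mask()
    print("original area:", area(mask))
    print("rotated 45 deg area:", area(rigid_edit(mask, 45)))
    print("row-shifted area:", area(non_rigid_edit(mask)))
```

In this sketch the rigid edit moves the whole shape with a single global transform, while the non-rigid edit applies a different displacement to each row and so bends the outline. A learned editor would replace both hand-written coordinate maps.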

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2412.19080 [cs.CV]
	(or arXiv:2412.19080v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2412.19080

Submission history

From: Yinda Chen
[v1] Thu, 26 Dec 2024 06:37:25 UTC (14,882 KB)
