Leveraging Color Channel Independence for Improved Unsupervised Object Detection

Jäckl, Bastian; Metz, Yannick; Schlegel, Udo; Keim, Daniel A.; Fischer, Maximilian T.

arXiv:2412.15150 (cs)

[Submitted on 19 Dec 2024]

Abstract: Object-centric architectures can learn to extract distinct object representations from visual scenes, enabling downstream applications at the object level. Like autoencoder-based image models, object-centric approaches have been trained with an unsupervised reconstruction loss on images encoded in the RGB color space. In our work, we challenge the common assumption that RGB is the optimal color space for unsupervised learning in computer vision. We argue conceptually and show empirically that other color spaces, such as HSV, have characteristics that are essential for object-centric representation learning, such as robustness to lighting conditions. We further show that models improve when they are required to predict additional color channels. Specifically, we propose to transform the predicted targets to the RGB-S space, which extends RGB with HSV's saturation component and leads to markedly better reconstruction and disentanglement on five common evaluation datasets. Composite color spaces can be used with virtually no computational overhead, are agnostic to the model architecture, and are universally applicable across a wide range of visual computing tasks and training types. Our findings encourage further investigation of computer vision tasks beyond object-centric learning.
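
The RGB-S idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `rgb_to_rgbs` and `rgbs_reconstruction_loss`, the use of NumPy, the `eps` guard, and the choice of mean squared error as the reconstruction loss are all assumptions made for illustration. The only part taken from the abstract is that the target is RGB extended with HSV's saturation component, computed here with the standard HSV definition S = (max - min) / max.

```python
import numpy as np


def rgb_to_rgbs(rgb: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Append HSV saturation as a fourth channel (hypothetical helper).

    rgb: float array of shape (..., 3) with values in [0, 1].
    Returns an array of shape (..., 4) ordered as (R, G, B, S).
    """
    c_max = rgb.max(axis=-1)
    c_min = rgb.min(axis=-1)
    # Standard HSV saturation: (max - min) / max, defined as 0 for black pixels.
    sat = np.where(c_max > eps, (c_max - c_min) / np.maximum(c_max, eps), 0.0)
    return np.concatenate([rgb, sat[..., None]], axis=-1)


def rgbs_reconstruction_loss(pred_rgbs: np.ndarray, target_rgb: np.ndarray) -> float:
    """Mean squared error against the RGB-S target (the loss choice is an assumption)."""
    target_rgbs = rgb_to_rgbs(target_rgb)
    return float(np.mean((pred_rgbs - target_rgbs) ** 2))


# Usage: a 1x3 image with a pure red, a mid-gray, and an orange-ish pixel.
image = np.array([[[1.0, 0.0, 0.0], [0.5, 0.5, 0.5], [0.8, 0.4, 0.2]]])
rgbs = rgb_to_rgbs(image)   # shape (1, 3, 4)
dimmed = rgb_to_rgbs(0.5 * image)  # same scene at half brightness
```

In this sketch the saturation channel is unchanged when all three RGB channels are scaled by the same positive factor, so `rgbs[..., 3]` and `dimmed[..., 3]` agree. That is one simple way to see the kind of lighting robustness the abstract attributes to HSV-derived channels. The extra channel costs one max, one min, and one division per pixel, and only the decoder's output width changes from three to four channels.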

Comments:	38 pages incl. references, 16 figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
ACM classes:	I.4.8; I.2.10
Cite as:	arXiv:2412.15150 [cs.CV]
	(or arXiv:2412.15150v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2412.15150

Submission history

From: Bastian Jäckl [view email]
[v1] Thu, 19 Dec 2024 18:28:37 UTC (6,563 KB)
