Static Segmentation by Tracking: A Frustratingly Label-Efficient Approach to Fine-Grained Segmentation

Feng, Zhenyang; Wang, Zihe; Bueno, Saul Ibaven; Frelek, Tomasz; Ramesh, Advikaa; Bai, Jingyan; Wang, Lemeng; Huang, Zanming; Gu, Jianyang; Yoo, Jinsu; Pan, Tai-Yu; Chowdhury, Arpita; Ramirez, Michelle; Campolongo, Elizabeth G.; Thompson, Matthew J.; Lawrence, Christopher G.; Record, Sydne; Rosser, Neil; Karpatne, Anuj; Rubenstein, Daniel; Lapp, Hilmar; Stewart, Charles V.; Berger-Wolf, Tanya; Su, Yu; Chao, Wei-Lun

Abstract: We study image segmentation in the biological domain, particularly trait and part segmentation from specimen images (e.g., butterfly wing stripes or beetle body parts). This is a crucial, fine-grained task that aids in understanding the biology of organisms. The conventional approach involves hand-labeling masks, often for hundreds of images per species, and then training a segmentation model to generalize these labels to other images; the labeling can be exceedingly laborious. We present a label-efficient method named Static Segmentation by Tracking (SST). SST is built on the insight that, while specimens of the same species vary, the traits and parts we aim to segment show up consistently. This motivates us to concatenate specimen images into a "pseudo-video" and reframe trait and part segmentation as a tracking problem. Concretely, SST generates masks for unlabeled images by propagating annotated or predicted masks from the "pseudo-preceding" images. Powered by Segment Anything Model 2 (SAM 2), which was initially developed for video segmentation, we show that SST can achieve high-quality trait and part segmentation with merely one labeled image per species, a breakthrough for analyzing specimen images. We further develop a cycle-consistent loss to fine-tune the model, again using one labeled image. Additionally, we highlight the broader potential of SST, including one-shot instance segmentation on images taken in the wild and trait-based image retrieval.
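The propagation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the SAM 2 API: the names `propagate_masks`, `cycle_inconsistency`, and `toy_step` are hypothetical, the "images" are toy (row, col) offsets standing in for real specimen photos, and the forward-then-backward check is only one plausible reading of "cycle-consistent", since the abstract does not specify the loss.

```python
from typing import Callable, List, Sequence, Set, Tuple

Pixel = Tuple[int, int]
Mask = Set[Pixel]          # a mask as a set of (row, col) pixels
Frame = Tuple[int, int]    # toy "image": the specimen's (row, col) offset

# A step function plays the role of the tracker (an assumption, not SAM 2):
# given the previous frame, its mask, and the next frame, it predicts the
# mask on the next frame.
Step = Callable[[Frame, Mask, Frame], Mask]


def propagate_masks(frames: Sequence[Frame], first_mask: Mask, step: Step) -> List[Mask]:
    """Treat the frames as a pseudo-video and carry the one labeled mask forward."""
    masks = [first_mask]
    for i in range(len(frames) - 1):
        # Each new mask is predicted from the pseudo-preceding frame and its mask.
        masks.append(step(frames[i], masks[-1], frames[i + 1]))
    return masks


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cycle_inconsistency(frames: Sequence[Frame], first_mask: Mask, step: Step) -> float:
    """Propagate forward, then back to the labeled frame, and return 1 - IoU."""
    forward = propagate_masks(frames, first_mask, step)
    backward = propagate_masks(frames[::-1], forward[-1], step)
    return 1.0 - iou(first_mask, backward[-1])


def toy_step(prev_frame: Frame, prev_mask: Mask, next_frame: Frame) -> Mask:
    """Toy tracker: shift the mask by the change in specimen offset."""
    dr = next_frame[0] - prev_frame[0]
    dc = next_frame[1] - prev_frame[1]
    return {(r + dr, c + dc) for r, c in prev_mask}


# One labeled "image" followed by two unlabeled ones.
frames = [(0, 0), (1, 0), (1, 2)]
labeled_mask = {(0, 0), (0, 1)}

masks = propagate_masks(frames, labeled_mask, toy_step)
gap = cycle_inconsistency(frames, labeled_mask, toy_step)
```

Here the toy tracker is exact, so propagating back to the labeled frame reproduces the original mask and `gap` is zero. With a learned tracker the gap would generally be nonzero, and a differentiable version of such a quantity could serve as a training signal that needs only the single labeled image.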

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2501.06749 [cs.CV]
	(or arXiv:2501.06749v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2501.06749
