Upsampling DINOv2 features for unsupervised vision tasks and weakly supervised materials segmentation

Docherty, Ronan; Vamvakeros, Antonis; Cooper, Samuel J.

arXiv:2410.19836 (cs)

[Submitted on 20 Oct 2024]

Abstract: The features of self-supervised vision transformers (ViTs) contain strong semantic and positional information relevant to downstream tasks like object localization and segmentation. Recent works combine these features with traditional methods like clustering, graph partitioning or region correlations to achieve impressive baselines without finetuning or training additional networks. We leverage upsampled features from ViT networks (e.g. DINOv2) in two workflows: in a clustering-based approach for object localization and segmentation, and paired with standard classifiers in weakly supervised materials segmentation. Both show strong performance on benchmarks, especially in weakly supervised segmentation, where the ViT features capture complex relationships inaccessible to classical approaches. We expect the flexibility and generalizability of these features will both speed up and strengthen materials characterization, from segmentation to property prediction.
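
The two workflows named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: a small synthetic array stands in for coarse ViT patch features, plain bilinear interpolation stands in for the upsampling step, a basic k-means stands in for the clustering stage, and a nearest-class-mean rule stands in for the "standard classifiers" trained on sparse labels. The function names are hypothetical.

```python
import numpy as np

def upsample_bilinear(feats, scale):
    """Bilinearly upsample an (h, w, c) feature grid by an integer factor."""
    h, w, _ = feats.shape
    ys = np.linspace(0, h - 1, h * scale)   # fine-pixel rows in coarse coordinates
    xs = np.linspace(0, w - 1, w * scale)   # fine-pixel columns in coarse coordinates
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None, None]
    wx = (xs - x0)[None, :, None]
    top = feats[y0][:, x0] * (1 - wx) + feats[y0][:, x1] * wx
    bot = feats[y1][:, x0] * (1 - wx) + feats[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

def kmeans(x, k, iters=10):
    """Basic k-means on (n, c) vectors with farthest-point initialisation."""
    centers = [x[0]]
    for _ in range(k - 1):
        d = np.min([((x - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(x[d.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        d = ((x[:, None] - centers[None]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(0)
    return labels

def nearest_mean_classify(feats, labelled):
    """Label every pixel by the closest class mean of a few annotated pixels.

    labelled maps (row, col) -> class id; a stand-in for a standard classifier.
    """
    classes = sorted(set(labelled.values()))
    means = np.array([
        np.mean([feats[r, c] for (r, c), v in labelled.items() if v == cls], axis=0)
        for cls in classes
    ])
    d = ((feats[:, :, None, :] - means[None, None]) ** 2).sum(-1)
    return np.array(classes)[d.argmin(-1)]

# Synthetic 8x8 "patch feature" grid with two regions (not real DINOv2 output).
rng = np.random.default_rng(0)
coarse = np.zeros((8, 8, 4))
coarse[:, :4, 0] = 1.0   # left half: hypothetical "phase A"
coarse[:, 4:, 1] = 1.0   # right half: hypothetical "phase B"
coarse += rng.normal(scale=0.05, size=coarse.shape)

dense = upsample_bilinear(coarse, scale=4)                      # per-pixel features
clusters = kmeans(dense.reshape(-1, 4), k=2).reshape(32, 32)    # unsupervised workflow
weak = nearest_mean_classify(dense, {(0, 0): 0, (31, 2): 0, (5, 30): 1})  # sparse labels
```

In both cases the per-pixel decision is made in feature space, so the quality of the segmentation depends on how well the upsampled features separate the regions; the simple interpolation and classifier used here are placeholders and may differ from what the authors use.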

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Materials Science (cond-mat.mtrl-sci); Image and Video Processing (eess.IV)
Cite as:	arXiv:2410.19836 [cs.CV]
	(or arXiv:2410.19836v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2410.19836

Submission history

From: Ronan Docherty
[v1] Sun, 20 Oct 2024 13:01:53 UTC (16,693 KB)
