Describing Sets of Images with Textual-PCA

Hupert, Oded; Schwartz, Idan; Wolf, Lior

arXiv:2210.12112 (cs)

[Submitted on 21 Oct 2022]

Abstract: We seek to semantically describe a set of images, capturing both the attributes of single images and the variations within the set. Our procedure is analogous to Principal Component Analysis, in which the role of projection vectors is replaced with generated phrases. First, a centroid phrase that has the largest average semantic similarity to the images in the set is generated, where both the computation of the similarity and the generation are based on pretrained vision-language models. Then, a second phrase is generated, using the same models, that yields the highest variation among the similarity scores. The next phrase maximizes the variance subject to being orthogonal, in the latent space, to the highest-variance phrase, and the process continues. Our experiments show that our method is able to convincingly capture the essence of image sets and describe the individual elements in a semantically meaningful way within the context of the entire set. Our code is available at: this https URL.
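
The selection logic described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the paper generates phrases with pretrained vision-language models, whereas the sketch below only picks from a fixed pool of candidate phrase embeddings. The function names (`unit_rows`, `select_phrases`), the tolerance parameter `ortho_tol`, the hard near-orthogonality filter, and the toy 3-dimensional embeddings are all assumptions made for illustration.

```python
import numpy as np


def unit_rows(x):
    """Scale every row to unit length so dot products are cosine similarities."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def select_phrases(image_emb, phrase_emb, n_components=2, ortho_tol=0.1):
    """Pick a centroid phrase and high-variance phrases from a candidate pool.

    Hypothetical stand-in for phrase generation: candidates are given as
    embeddings that share a latent space with the image embeddings.
    """
    images = unit_rows(image_emb)
    phrases = unit_rows(phrase_emb)
    sims = phrases @ images.T  # shape: (n_phrases, n_images)

    # Centroid phrase: largest average similarity to the images in the set.
    centroid = int(np.argmax(sims.mean(axis=1)))

    # Component phrases: largest variance of the similarity scores, with each
    # new pick required to be near-orthogonal to all earlier picks.
    variances = sims.var(axis=1)
    allowed = np.ones(len(phrases), dtype=bool)
    components = []
    for _ in range(n_components):
        if not allowed.any():
            break  # no remaining candidate satisfies the orthogonality filter
        best = int(np.argmax(np.where(allowed, variances, -np.inf)))
        components.append(best)
        # Assumed constraint: |cosine| with the chosen phrase must stay small.
        allowed &= np.abs(phrases @ phrases[best]) <= ortho_tol
    return centroid, components


# Toy data (made up): images share the third axis and vary mostly along the
# first axis, less along the second.
image_emb = [[2, 1, 2], [-2, 1, 2], [2, -1, 2], [-2, -1, 2]]
phrase_emb = [
    [1, 1, 1],      # 0: mixes all directions
    [1, 0, 0],      # 1: main direction of variation
    [0, 1, 0],      # 2: secondary direction of variation
    [0, 0, 1],      # 3: direction shared by every image
    [0.9, 0.1, 0],  # 4: nearly parallel to candidate 1
]
centroid, components = select_phrases(image_emb, phrase_emb, n_components=2)
print("centroid phrase index:", centroid)
print("component phrase indices:", components)
```

In this toy setup the candidate aligned with the shared direction plays the role of the centroid phrase, and the near-duplicate candidate 4 is excluded once candidate 1 has been chosen, which is the effect the orthogonality constraint is meant to have. A soft penalty on the cosine could be used in place of the hard filter; which of the two is closer to the paper's formulation is not stated in the abstract.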

Comments:	Accepted to Findings of EMNLP'22
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2210.12112 [cs.CV]
	(or arXiv:2210.12112v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2210.12112

Submission history

From: Oded Hupert
[v1] Fri, 21 Oct 2022 17:10:49 UTC (13,272 KB)
