Learning From Less Data: Diversified Subset Selection and Active Learning in Image Classification Tasks

Kaushal, Vishal; Sahoo, Anurag; Doctor, Khoshrav; Raju, Narasimha; Shetty, Suyash; Singh, Pankaj; Iyer, Rishabh; Ramakrishnan, Ganesh

arXiv:1805.11191 (cs)

[Submitted on 28 May 2018]

Abstract: State-of-the-art computer vision techniques based on supervised machine learning are generally data hungry. This poses two challenges: inadequate computing resources and the high cost of human labeling. Training data subset selection and active learning have been proposed as possible solutions to these two challenges, respectively. A special class of subset selection functions naturally models notions of diversity, coverage and representation; such functions can be used to eliminate redundancy and thus lend themselves well to training data subset selection. They can also improve the efficiency of active learning, further reducing human labeling effort, by selecting a subset of the examples obtained through conventional uncertainty-sampling-based techniques. In this work we empirically demonstrate the effectiveness of two diversity models, namely the Facility-Location and Disparity-Min models, for training data subset selection and for reducing labeling effort. We do this for a variety of computer vision tasks, including Gender Recognition, Scene Recognition and Object Recognition. Our results show that subset selection done in the right way can add 2-3% in accuracy over existing baselines, particularly when training data is scarce. This allows complex machine learning models (such as Convolutional Neural Networks) to be trained with much less training data while incurring minimal performance loss.
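
The abstract names the two diversity models without defining them. As an illustration only, the sketch below uses their commonly stated forms: Facility-Location scores a subset S by the sum, over all points i, of the maximum similarity between i and any member of S, while Disparity-Min scores S by the minimum pairwise distance among its members. The greedy selection routines, the similarity 1 / (1 + distance), the entropy-based uncertainty score, and all function names are assumptions made for this example, not the authors' implementation.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def facility_location_greedy(points, k):
    """Greedily pick k indices to maximize sum_i max_{j in S} sim(i, j)."""
    n = len(points)
    # Assumed similarity: 1 / (1 + distance), so closer points score higher.
    sim = [[1.0 / (1.0 + euclidean(p, q)) for q in points] for p in points]
    selected = []
    best_cover = [0.0] * n  # best similarity of each point to the current subset
    for _ in range(min(k, n)):
        best_gain, best_j = -1.0, None
        for j in range(n):
            if j in selected:
                continue
            # Marginal gain: how much adding j improves each point's coverage.
            gain = sum(max(sim[i][j] - best_cover[i], 0.0) for i in range(n))
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
        best_cover = [max(best_cover[i], sim[i][best_j]) for i in range(n)]
    return selected

def disparity_min_greedy(points, k, start=0):
    """Farthest-first heuristic for a subset with large minimum pairwise distance."""
    n = len(points)
    selected = [start]
    # Distance from every point to its nearest already-selected point.
    min_dist = [euclidean(p, points[start]) for p in points]
    while len(selected) < min(k, n):
        j = max((i for i in range(n) if i not in selected),
                key=lambda i: min_dist[i])
        selected.append(j)
        min_dist = [min(min_dist[i], euclidean(points[i], points[j]))
                    for i in range(n)]
    return selected

def entropy(p):
    """Shannon entropy of a predicted class distribution (assumed uncertainty score)."""
    return -sum(q * math.log(q) for q in p if q > 0)

def uncertain_then_diverse(probs, points, m, k):
    """Keep the m most uncertain examples, then pick k diverse ones among them."""
    ranked = sorted(range(len(probs)), key=lambda i: entropy(probs[i]), reverse=True)
    pool = ranked[:m]
    picked = disparity_min_greedy([points[i] for i in pool], k)
    return [pool[i] for i in picked]

if __name__ == "__main__":
    # Two tight clusters of three points each (toy feature vectors).
    pts = [(0, 0), (0.1, 0), (0, 0.1), (10, 10), (10.1, 10), (10, 10.1)]
    print(facility_location_greedy(pts, 2))
    print(disparity_min_greedy(pts, 2))
```

On the toy data, both routines return one point from each cluster when asked for two, which is the redundancy-removing behavior the abstract describes. `uncertain_then_diverse` mirrors the active learning use case: it first filters by uncertainty and then applies a diversity model to the filtered pool, so near-duplicate uncertain examples are not all sent for labeling.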

Comments:	15 pages, 7 figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1805.11191 [cs.CV]
	(or arXiv:1805.11191v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1805.11191

Submission history

From: Anurag Sahoo
[v1] Mon, 28 May 2018 22:27:29 UTC (659 KB)
