Upper-Confidence-Bound Algorithms for Active Learning in Multi-Armed Bandits

Carpentier, Alexandra; Lazaric, Alessandro; Ghavamzadeh, Mohammad; Munos, Rémi; Auer, Peter; Antos, András

arXiv:1507.04523 (cs)

[Submitted on 16 Jul 2015]

Abstract: In this paper, we study the problem of estimating uniformly well the mean values of several distributions given a finite budget of samples. If the variances of the distributions were known, one could design an optimal sampling strategy by collecting a number of independent samples per distribution that is proportional to its variance. However, in the more realistic case where the distributions are not known in advance, one needs to design adaptive sampling strategies in order to select which distribution to sample from according to the previously observed samples. We describe two strategies based on pulling the distributions a number of times that is proportional to a high-probability upper-confidence-bound on their variance (built from previously observed samples) and report a finite-sample performance analysis on the excess estimation error compared to the optimal allocation. We show that the performance of these allocation strategies depends not only on the variances but also on the full shape of the distributions.
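The allocation idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the function names, the confidence-width formula, and the parameters `delta`, `width_scale`, and `init_pulls` are assumptions chosen for illustration only. The first function splits a budget in proportion to known variances; the second replaces the unknown variance with an upper confidence bound built from the samples seen so far and always pulls the arm whose bound is largest relative to its pull count, which tends to keep the pull counts roughly proportional to the bounds.

```python
import math
import random


def static_allocation(variances, budget):
    """Split `budget` pulls across arms in proportion to known variances."""
    total = sum(variances)
    return [budget * v / total for v in variances]


def ucb_variance_allocation(arms, budget, delta=0.05, width_scale=1.0, init_pulls=2):
    """Adaptive sketch: pull the arm with the largest (variance UCB) / (pull count).

    `arms` is a list of zero-argument callables, each returning one sample.
    The confidence width below is a hypothetical placeholder, not the
    bound analysed in the paper.
    """
    # Pull every arm a few times so the sample variance is defined.
    samples = [[arm() for _ in range(init_pulls)] for arm in arms]
    for _ in range(budget - init_pulls * len(arms)):
        scores = []
        for xs in samples:
            n = len(xs)
            mean = sum(xs) / n
            var_hat = sum((x - mean) ** 2 for x in xs) / (n - 1)
            # Assumed width: shrinks like 1/sqrt(n) as the arm is sampled more.
            ucb = var_hat + width_scale * math.sqrt(math.log(1.0 / delta) / n)
            scores.append(ucb / n)
        k = max(range(len(arms)), key=scores.__getitem__)
        samples[k].append(arms[k]())
    counts = [len(xs) for xs in samples]
    means = [sum(xs) / len(xs) for xs in samples]
    return counts, means


if __name__ == "__main__":
    rng = random.Random(0)
    # Two hypothetical arms: a narrow and a wide uniform distribution.
    arms = [lambda: rng.uniform(0.45, 0.55), lambda: rng.uniform(0.0, 1.0)]
    counts, means = ucb_variance_allocation(arms, budget=500)
    print("pull counts:", counts)
    print("estimated means:", means)
```

With known variances, `static_allocation([1.0, 3.0], 100)` assigns 25 and 75 pulls. In the adaptive sketch the wider arm should receive the larger share of the budget, although the exact split depends on the assumed width and on the random draws.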

Comments:	30 pages, 2 Postscript figures, uses this http URL; an earlier, shorter version was published in Proceedings of the 22nd International Conference, Algorithmic Learning Theory
Subjects:	Machine Learning (cs.LG)
ACM classes:	G.3
Cite as:	arXiv:1507.04523 [cs.LG]
	(or arXiv:1507.04523v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1507.04523

Submission history

From: András Antos [view email]
[v1] Thu, 16 Jul 2015 11:02:13 UTC (48 KB)
