Structure Adaptive Algorithms for Stochastic Bandits

Degenne, Rémy; Shao, Han; Koolen, Wouter M.

arXiv:2007.00969 (stat)

[Submitted on 2 Jul 2020]

Abstract: We study reward maximisation in a wide class of structured stochastic multi-armed bandit problems, where the mean rewards of arms satisfy some given structural constraints, e.g. linear, unimodal, sparse, etc. Our aim is to develop methods that are flexible (in that they easily adapt to different structures), powerful (in that they perform well empirically and/or provably match instance-dependent lower bounds) and efficient (in that the per-round computational burden is small).
We develop asymptotically optimal algorithms from instance-dependent lower bounds using iterative saddle-point solvers. Our approach generalises recent iterative methods for pure exploration to reward maximisation, where a major challenge arises from the estimation of the sub-optimality gaps and their reciprocals. Still, we manage to achieve all the above desiderata. Notably, our technique avoids the computational cost of the full-blown saddle-point oracle employed by previous work, while at the same time enabling finite-time regret bounds.
Our experiments reveal that our method successfully leverages the structural assumptions, while its regret is at worst comparable to that of vanilla UCB.
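
For readers unfamiliar with the baseline named above, the following is a minimal sketch of a vanilla UCB strategy. The abstract does not say which UCB variant or reward model is used, so this assumes the classical UCB1 index (empirical mean plus sqrt(2 log t / n)) and Bernoulli rewards. The function name `ucb1` and its parameters are hypothetical and are not taken from the paper. The sketch ignores any structure among the arms, which is exactly the information a structure-adaptive method would exploit.

```python
import math
import random


def ucb1(means, horizon, seed=0):
    """Run UCB1 on a Bernoulli bandit (assumed reward model).

    Returns the per-arm pull counts and the cumulative pseudo-regret,
    i.e. the sum over rounds of (best mean - mean of the pulled arm).
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k   # number of pulls of each arm
    sums = [0.0] * k   # total reward collected from each arm
    best = max(means)
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once to initialise the estimates
        else:
            # Pick the arm with the largest upper confidence bound.
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]  # sub-optimality gap of the pulled arm
    return counts, regret


if __name__ == "__main__":
    counts, regret = ucb1([0.2, 0.5, 0.8], horizon=2000)
    print("pull counts:", counts)
    print("pseudo-regret:", round(regret, 2))
```

The pseudo-regret here is computed from the true means, which a learner never observes. In practice the sub-optimality gaps must be estimated from data, which is the difficulty the abstract points to.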

Comments:	10+18 pages. To be published in the proceedings of ICML 2020
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:2007.00969 [stat.ML]
	(or arXiv:2007.00969v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2007.00969

Submission history

From: Rémy Degenne
[v1] Thu, 2 Jul 2020 08:59:54 UTC (5,871 KB)
