Statistically Efficient, Polynomial Time Algorithms for Combinatorial Semi Bandits

Cuvelier, Thibaut; Combes, Richard; Gourdin, Eric

arXiv:2002.07258v1 (stat)

[Submitted on 17 Feb 2020 (this version), latest version 13 Jan 2021 (v2)]

Abstract: We consider combinatorial semi-bandits over a set of arms ${\cal X} \subset \{0,1\}^d$ where rewards are uncorrelated across items. For this problem, the algorithm ESCB yields the smallest known regret bound $R(T) = {\cal O}\Big( {d (\ln m)^2 (\ln T) \over \Delta_{\min} }\Big)$, but it has computational complexity ${\cal O}(|{\cal X}|)$, which is typically exponential in $d$, so it cannot be used in large dimensions. We propose the first algorithm that is both computationally and statistically efficient for this problem, with regret $R(T) = {\cal O} \Big({d (\ln m)^2 (\ln T)\over \Delta_{\min} }\Big)$ and computational complexity ${\cal O}(T {\bf poly}(d))$. Our approach involves carefully designing an approximate version of ESCB with the same regret guarantees, showing that this approximate algorithm can be implemented in time ${\cal O}(T {\bf poly}(d))$ by repeatedly maximizing a linear function over ${\cal X}$ subject to a linear budget constraint, and showing how to solve these maximization problems efficiently.
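
As a rough illustration of why exact index maximization costs ${\cal O}(|{\cal X}|)$ per round, the sketch below scores every arm with an optimistic index of the form (estimated reward) + (square root of a confidence term) and picks the best by full enumeration. The index form, the function names, the choice of ${\cal X}$ as all subsets of exactly $m$ items, and the numbers are illustrative assumptions, not details taken from the abstract.

```python
import itertools
import math


def m_sets(d, m):
    """All binary vectors of length d with exactly m ones (an assumed choice of X)."""
    arms = []
    for support in itertools.combinations(range(d), m):
        arms.append(tuple(1 if i in support else 0 for i in range(d)))
    return arms


def brute_force_index_argmax(arms, theta_hat, counts, f_t):
    """Enumerate every arm and return the one with the largest optimistic index.

    Assumed index: x . theta_hat + sqrt(f_t / 2 * sum_i x_i / counts_i).
    The cost is one index evaluation per arm, i.e. O(|X|) per round.
    """
    def index(x):
        mean = sum(xi * th for xi, th in zip(x, theta_hat))
        bonus = math.sqrt(0.5 * f_t * sum(xi / n for xi, n in zip(x, counts)))
        return mean + bonus

    return max(arms, key=index)


# Hypothetical instance: 6 items, arms are pairs of items.
d, m = 6, 2
arms = m_sets(d, m)
theta_hat = [0.9, 0.8, 0.5, 0.4, 0.3, 0.2]  # made-up empirical means
counts = [50, 50, 50, 50, 2, 2]             # made-up observation counts

best_greedy = brute_force_index_argmax(arms, theta_hat, counts, f_t=0.0)
best_optimistic = brute_force_index_argmax(arms, theta_hat, counts, f_t=5.0)
print(len(arms), best_greedy, best_optimistic)
```

With no exploration bonus the two items with the highest estimated means are chosen; with the bonus, a rarely observed item enters the selected arm. Because the bonus is the square root of a sum over items, the index is not a linear function of the arm, so a plain linear maximization over ${\cal X}$ does not apply directly. The number of arms here is $\binom{d}{m}$, which grows quickly with $d$.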

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Optimization and Control (math.OC)
Cite as:	arXiv:2002.07258 [stat.ML]
	(or arXiv:2002.07258v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2002.07258

Submission history

From: Richard Combes
[v1] Mon, 17 Feb 2020 21:32:04 UTC (77 KB)
[v2] Wed, 13 Jan 2021 17:12:58 UTC (407 KB)
