Stochastic bandits with arm-dependent delays

Manegueu, Anne Gael; Vernade, Claire; Carpentier, Alexandra; Valko, Michal

arXiv:2006.10459 (stat)

[Submitted on 18 Jun 2020]

Abstract: Significant work has recently been dedicated to the stochastic delayed bandit setting because of its relevance in applications. The applicability of existing algorithms is, however, restricted by the strong assumptions often made on the delay distributions, such as full observability, restrictive shape constraints, or uniformity over arms. In this work, we weaken these assumptions significantly and only assume that there is a bound on the tail of the delay. In particular, we cover the important case where the delay distributions vary across arms, and the case where the delays are heavy-tailed. To address these difficulties, we propose a simple but efficient UCB-based algorithm called PatientBandits. We provide both problem-dependent and problem-independent bounds on the regret, as well as performance lower bounds.
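
To make the setting in the abstract concrete, here is a minimal simulation sketch of a bandit whose rewards arrive after arm-dependent, heavy-tailed delays. It is not the PatientBandits algorithm: the learner below is plain UCB1 computed from the rewards that have arrived so far, and the Bernoulli rewards, Pareto-distributed delays, function name `delayed_ucb`, and all parameter values are assumptions chosen purely for illustration.

```python
import heapq
import math
import random


def delayed_ucb(means, tail_indices, horizon, seed=0):
    """Run plain UCB1 using only the rewards that have arrived so far.

    means[i]        -- Bernoulli reward mean of arm i (assumed for the demo)
    tail_indices[i] -- Pareto shape of arm i's delay; smaller = heavier tail
    Returns (pull_counts, observed_counts), one entry per arm.
    """
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k         # how often each arm was played
    observed = [0] * k      # how many of its rewards have arrived
    reward_sum = [0.0] * k  # sum of the arrived rewards
    pending = []            # min-heap of (arrival_round, arm, reward)

    for t in range(1, horizon + 1):
        # Deliver every reward whose delay has elapsed by round t.
        while pending and pending[0][0] <= t:
            _, arm, reward = heapq.heappop(pending)
            observed[arm] += 1
            reward_sum[arm] += reward

        def index(i):
            # Arms with no feedback yet get priority.
            if observed[i] == 0:
                return math.inf
            bonus = math.sqrt(2.0 * math.log(t) / observed[i])
            return reward_sum[i] / observed[i] + bonus

        arm = max(range(k), key=index)
        pulls[arm] += 1
        reward = 1.0 if rng.random() < means[arm] else 0.0
        # Arm-dependent delay of at least one round, heavy-tailed.
        delay = int(rng.paretovariate(tail_indices[arm]))
        heapq.heappush(pending, (t + delay, arm, reward))

    return pulls, observed


if __name__ == "__main__":
    pulls, observed = delayed_ucb([0.3, 0.7], [0.8, 2.0], horizon=2000)
    print("pulls:", pulls, "observed:", observed)
```

The gap between `pulls` and `observed` for each arm shows how much feedback is still outstanding at the end of the run; an arm with a smaller tail index tends to leave more rewards pending, which is the kind of arm-dependent censoring the abstract refers to.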

Comments:	19 Pages, 4 figures
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
MSC classes:	62L10
Cite as:	arXiv:2006.10459 [stat.ML]
	(or arXiv:2006.10459v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2006.10459

Submission history

From: Anne Gael Manegueu
[v1] Thu, 18 Jun 2020 12:13:58 UTC (966 KB)
