Choosing Answers in $\varepsilon$-Best-Answer Identification for Linear Bandits

Jourdan, Marc; Degenne, Rémy

Statistics > Machine Learning

arXiv:2206.04456 (stat)

[Submitted on 9 Jun 2022]

Abstract: In pure-exploration problems, information is gathered sequentially to answer a question on the stochastic environment. While best-arm identification for linear bandits has been extensively studied in recent years, few works have been dedicated to identifying one arm that is $\varepsilon$-close to the best one (and not exactly the best one). In this problem with several correct answers, an identification algorithm should focus on one candidate among those answers and verify that it is correct. We demonstrate that picking the answer with highest mean does not allow an algorithm to reach asymptotic optimality in terms of expected sample complexity. Instead, a \textit{furthest answer} should be identified. Using that insight to choose the candidate answer carefully, we develop a simple procedure to adapt best-arm identification algorithms to tackle $\varepsilon$-best-answer identification in transductive linear stochastic bandits. Finally, we propose an asymptotically optimal algorithm for this setting, which is shown to achieve competitive empirical performance against existing modified best-arm identification algorithms.
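To make "several correct answers" concrete, here is a minimal sketch, not taken from the paper. It assumes an additive notion of $\varepsilon$-closeness (an answer is correct when its mean is within $\varepsilon$ of the best mean) and a linear model in which the mean of answer $z$ is $\langle z, \theta \rangle$. The function name, the parameter vector, and the toy answer set are all hypothetical. The furthest answer mentioned in the abstract is not computed here, since its definition is not given on this page.

```python
def epsilon_optimal_answers(theta, answers, eps):
    """Return indices of answers whose mean is within eps of the best mean.

    Assumption: additive eps-optimality, with mean(z) = <z, theta>.
    """
    means = [sum(z_i * t_i for z_i, t_i in zip(z, theta)) for z in answers]
    best = max(means)
    return [i for i, m in enumerate(means) if m >= best - eps]


# Hypothetical 2-dimensional instance (illustrative values only).
theta = [1.0, 0.0]                              # parameter, unknown in practice
answers = [[1.0, 0.0], [0.95, 0.3], [0.0, 1.0]]  # candidate answer vectors

correct = epsilon_optimal_answers(theta, answers, eps=0.1)

# The "greedy" candidate is the answer with the highest mean.
means = [sum(z_i * t_i for z_i, t_i in zip(z, theta)) for z in answers]
greedy = max(range(len(answers)), key=lambda i: means[i])
```

In this toy instance two of the three answers are correct, so an algorithm has a genuine choice of which candidate to verify. The abstract's point is that defaulting to the greedy candidate can be suboptimal in terms of expected sample complexity.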

Comments:	47 pages, 10 figures, 8 tables. To be published in the 39th International Conference on Machine Learning, Baltimore, Maryland, USA, PMLR 162, 2022
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:2206.04456 [stat.ML]
	(or arXiv:2206.04456v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2206.04456

Submission history

From: Marc Jourdan
[v1] Thu, 9 Jun 2022 12:27:51 UTC (605 KB)
