Kernel $\epsilon$-Greedy for Contextual Bandits

Arya, Sakshi; Sriperumbudur, Bharath K.

arXiv:2306.17329 (stat)

[Submitted on 29 Jun 2023]

Abstract: We consider a kernelized version of the $\epsilon$-greedy strategy for contextual bandits. More precisely, in a setting with finitely many arms, we assume that the mean reward functions lie in a reproducing kernel Hilbert space (RKHS). We propose an online weighted kernel ridge regression estimator for the reward functions. Under some conditions on the exploration probability sequence, $\{\epsilon_t\}_t$, and the choice of the regularization parameter sequence, $\{\lambda_t\}_t$, we show that the proposed estimator is consistent. We also show that for any choice of kernel and the corresponding RKHS, we achieve a sub-linear regret rate depending on the intrinsic dimensionality of the RKHS. Furthermore, we achieve the optimal regret rate of $\sqrt{T}$ under a margin condition for finite-dimensional RKHSs.
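The strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' estimator: the abstract does not specify the weighting scheme, so the sketch below uses plain (unweighted) kernel ridge regression per arm, a Gaussian kernel, and the schedules $\epsilon_t = t^{-1/3}$ and $\lambda_t = 1/t$, all of which are assumptions chosen only for illustration. The helper names (`rbf_kernel`, `krr_predict`, `kernel_eps_greedy`) and the toy reward functions are likewise hypothetical.

```python
import numpy as np

def rbf_kernel(X, Y, bandwidth=0.5):
    """Gaussian kernel matrix between the rows of X and the rows of Y."""
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def krr_predict(X_train, y_train, x_new, lam, bandwidth=0.5):
    """Unweighted kernel ridge regression prediction at x_new.

    Assumption: the paper's estimator is a *weighted* variant; the weights
    are not given in the abstract, so none are used here.
    """
    n = X_train.shape[0]
    if n == 0:
        return 0.0  # no data for this arm yet: fall back to a zero estimate
    K = rbf_kernel(X_train, X_train, bandwidth)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y_train)
    k_new = rbf_kernel(x_new[None, :], X_train, bandwidth)
    return float(k_new[0] @ alpha)

def kernel_eps_greedy(contexts, reward_fn, n_arms, eps_fn, lam_fn, seed=0):
    """Run epsilon-greedy with one kernel ridge estimate per arm."""
    rng = np.random.default_rng(seed)
    X = [[] for _ in range(n_arms)]  # contexts observed for each arm
    y = [[] for _ in range(n_arms)]  # rewards observed for each arm
    chosen = []
    for t, x in enumerate(contexts, start=1):
        if rng.random() < eps_fn(t):
            # Explore: pick an arm uniformly at random.
            arm = int(rng.integers(n_arms))
        else:
            # Exploit: pick the arm with the largest estimated mean reward.
            est = [
                krr_predict(np.array(X[a]).reshape(-1, x.size),
                            np.array(y[a]), x, lam_fn(t))
                for a in range(n_arms)
            ]
            arm = int(np.argmax(est))
        reward = reward_fn(arm, x, rng)
        X[arm].append(x)
        y[arm].append(reward)
        chosen.append(arm)
    return np.array(chosen)

def toy_reward(arm, x, rng):
    """Hypothetical two-arm problem: means x and 1 - x, Gaussian noise."""
    mean = x[0] if arm == 0 else 1.0 - x[0]
    return mean + 0.1 * rng.normal()

# Illustrative schedules (assumptions, not the conditions from the paper).
eps_schedule = lambda t: t ** (-1.0 / 3.0)
lam_schedule = lambda t: 1.0 / t

demo_contexts = np.random.default_rng(1).uniform(0.0, 1.0, size=(200, 1))
demo_arms = kernel_eps_greedy(demo_contexts, toy_reward, 2,
                              eps_schedule, lam_schedule)
```

Refitting each arm from scratch at every round keeps the sketch short but costs a linear solve per arm per round; an online implementation would update the estimate incrementally.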

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST)
MSC classes:	62L10, 62G05, 68T05
Cite as:	arXiv:2306.17329 [stat.ML]
	(or arXiv:2306.17329v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2306.17329

Submission history

From: Sakshi Arya
[v1] Thu, 29 Jun 2023 22:48:34 UTC (1,415 KB)
