Stochastic Linear Bandits with Protected Subspace

Parulekar, Advait; Basu, Soumya; Gopalan, Aditya; Shanmugam, Karthikeyan; Shakkottai, Sanjay

arXiv:2011.01016v1 (cs)

[Submitted on 2 Nov 2020 (this version), latest version 1 Mar 2021 (v2)]

Abstract: We study a variant of the stochastic linear bandit problem in which the learner optimizes a linear objective function, but rewards accrue only in directions orthogonal to an unknown subspace, which we interpret as a protected space. The learner has only zero-order stochastic oracle access to both the objective and the protected subspace. In particular, at each round the learner must choose an action and also choose whether to query the objective or the protected subspace. Our algorithm, derived from the OFUL principle, uses some of the queries to estimate the protected space and, in almost all rounds, plays optimistically with respect to a confidence set for this space. We provide a $\tilde{O}(sd\sqrt{T})$ regret upper bound for the case where the action space is the complete unit ball in $\mathbb{R}^d$, where $s < d$ is the dimension of the protected subspace and $T$ is the time horizon. Moreover, we show that a discrete action space can lead to linear regret under an optimistic algorithm, reinforcing the sub-optimality of optimism in certain settings. We also show that, in certain settings, protection constraints imply that no consistent algorithm can have regret smaller than $\Omega(T^{3/4})$. Finally, we empirically validate our results on synthetic and real datasets.
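
To make the reward structure concrete, the following is a minimal sketch under one reading of the abstract: the expected reward of an action is taken to be the inner product of the objective vector with the component of the action orthogonal to the protected subspace. That reward formula, the function names, and the example vectors are all assumptions made for illustration and are not taken from the paper. The sketch also assumes the objective and the subspace are known, whereas in the bandit setting the learner must estimate both from noisy queries.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(vectors):
    """Gram-Schmidt: return an orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > 1e-12:  # skip vectors already inside the span
            basis.append([wi / norm for wi in w])
    return basis


def project_out(x, basis):
    """Return the component of x orthogonal to the span of an orthonormal basis."""
    out = list(x)
    for b in basis:
        c = dot(x, b)
        out = [oi - c * bi for oi, bi in zip(out, b)]
    return out


def expected_reward(action, theta, basis):
    """Assumed reward model: only the unprotected part of the action counts."""
    return dot(theta, project_out(action, basis))


def best_action_in_unit_ball(theta, basis):
    """Maximizer of the assumed reward over the unit ball, with known theta and basis."""
    residual = project_out(theta, basis)
    norm = math.sqrt(dot(residual, residual))
    if norm == 0.0:
        return [0.0] * len(theta)  # every action earns zero reward
    return [r / norm for r in residual]


# Hypothetical instance with d = 3 and a protected subspace of dimension s = 1.
theta = [1.0, 1.0, 0.0]
protected_basis = orthonormalize([[2.0, 0.0, 0.0]])

best = best_action_in_unit_ball(theta, protected_basis)
naive = [t / math.sqrt(dot(theta, theta)) for t in theta]  # ignores protection

best_reward = expected_reward(best, theta, protected_basis)
naive_reward = expected_reward(naive, theta, protected_basis)
```

In this toy instance the protection-aware action earns an expected reward of 1, while the action that simply follows the objective direction earns about 0.707, because part of its effort lies inside the protected subspace and accrues nothing.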

Subjects: Machine Learning (cs.LG)
Cite as: arXiv:2011.01016 [cs.LG] (or arXiv:2011.01016v1 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2011.01016

Submission history

From: Advait Parulekar
[v1] Mon, 2 Nov 2020 14:59:39 UTC (199 KB)
[v2] Mon, 1 Mar 2021 21:40:26 UTC (224 KB)
