Analysis of Thompson Sampling for Graphical Bandits Without the Graphs

Liu, Fang; Zheng, Zizhan; Shroff, Ness

arXiv:1805.08930 (stat)

[Submitted on 23 May 2018]

Abstract: We study multi-armed bandit problems with graph feedback, in which the decision maker is allowed to observe the neighboring actions of the chosen action, in a setting where the graph may vary over time and is never fully revealed to the decision maker. We show that when the feedback graphs are undirected, the original Thompson Sampling achieves the optimal (within logarithmic factors) regret $\tilde{O}\left(\sqrt{\beta_0(G)T}\right)$ over time horizon $T$, where $\beta_0(G)$ is the average independence number of the latent graphs. To the best of our knowledge, this is the first result showing that the original Thompson Sampling is optimal for graphical bandits in the undirected setting. A slightly weaker regret bound for Thompson Sampling in the directed setting is also presented. To fill this gap, we propose a variant of Thompson Sampling that attains the optimal regret in the directed setting within a logarithmic factor. Both algorithms can be implemented efficiently and do not require knowledge of the feedback graphs at any time.
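The setting in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Bernoulli rewards with independent Beta(1, 1) priors, and the names `thompson_sampling_graph_feedback`, `means`, and `neighbors_at` are hypothetical. The point it shows is that the learner runs ordinary Thompson Sampling and never reads the graph. Only the simulated environment consults the latent graph to decide which side observations to reveal.

```python
import random


def thompson_sampling_graph_feedback(means, neighbors_at, horizon, seed=0):
    """Beta-Bernoulli Thompson Sampling with side observations (illustrative sketch).

    means        -- true Bernoulli means, hidden from the learner (assumption)
    neighbors_at -- hypothetical environment callback: t -> {arm: set of neighbors};
                    it models the latent, possibly time-varying feedback graph
    horizon      -- number of rounds T
    Returns (cumulative pseudo-regret, alpha, beta).
    """
    rng = random.Random(seed)
    k = len(means)
    alpha = [1.0] * k  # Beta posterior parameters, one pair per arm
    beta = [1.0] * k
    best = max(means)
    regret = 0.0

    for t in range(horizon):
        # Learner side: sample from each posterior and play the argmax.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        regret += best - means[arm]

        # Environment side: reveal the chosen arm plus its neighbors
        # in the latent graph for this round.
        observed = {arm} | set(neighbors_at(t)[arm])

        # Learner side: update the posterior of every arm whose reward was seen.
        for j in observed:
            reward = 1 if rng.random() < means[j] else 0
            alpha[j] += reward
            beta[j] += 1 - reward

    return regret, alpha, beta
```

As a usage example, passing `lambda t: {i: {j for j in range(3) if j != i} for i in range(3)}` as `neighbors_at` simulates a complete undirected graph on three arms (full-information feedback), while `lambda t: {i: set() for i in range(3)}` recovers the standard bandit with no side observations.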

Comments: Accepted by UAI 2018
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as: arXiv:1805.08930 [stat.ML]
(or arXiv:1805.08930v1 [stat.ML] for this version)
https://doi.org/10.48550/arXiv.1805.08930

Submission history

From: Fang Liu
[v1] Wed, 23 May 2018 01:47:56 UTC (62 KB)
