Decentralized Cooperative Stochastic Multi-armed Bandits

Martínez-Rubio, David; Kanade, Varun; Rebeschini, Patrick

arXiv:1810.04468v1 (cs)

[Submitted on 10 Oct 2018 (this version), latest version 24 Oct 2019 (v2)]

Abstract: We study a decentralized cooperative stochastic multi-armed bandit problem with $K$ arms on a network of $N$ agents. In our model, the reward distribution of each arm is the same for every agent. At each iteration, every agent chooses one arm to play and then communicates with her neighbors. The aim is to minimize the total regret of the network. We design a fully decentralized algorithm that uses a running consensus procedure to compute, with some delay, accurate estimates of the average reward obtained by all the agents for each arm, and then applies an upper confidence bound algorithm that accounts for the delay and error of these estimates. We analyze the algorithm and show that, up to a constant factor, our regret bounds are better for all networks than those of other algorithms designed to solve the same problem. For some graphs, our regret bounds are significantly better.
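
The general idea of pairing a running consensus with an upper confidence bound rule can be illustrated with a small simulation. The sketch below is not the algorithm analyzed in the paper: it uses plain, unaccelerated gossip on a ring, ignores the delay handling the abstract mentions, and uses a textbook UCB1-style bonus. The ring topology, the 1/3 mixing weights, the Bernoulli rewards, and the names `ring_weights`, `mix`, and `run` are all assumptions made for illustration.

```python
import math
import random


def ring_weights(n):
    """Doubly stochastic mixing matrix for a ring (assumed topology):
    each agent averages itself and its two neighbors with weight 1/3."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def mix(W, X):
    """One gossip step: agent i replaces its row by a W-weighted average."""
    n, k = len(X), len(X[0])
    return [[sum(W[i][j] * X[j][a] for j in range(n)) for a in range(k)]
            for i in range(n)]


def run(means, n_agents=5, horizon=2000, seed=0):
    """Simulate agents playing Bernoulli arms with the given means."""
    rng = random.Random(seed)
    k = len(means)
    W = ring_weights(n_agents)
    # s[i][a], c[i][a]: agent i's running-consensus estimates of the
    # network-wide reward sum and pull count of arm a, divided by n_agents.
    s = [[0.0] * k for _ in range(n_agents)]
    c = [[0.0] * k for _ in range(n_agents)]
    pulls = [0] * k  # true network-wide pull counts, kept only for checking
    for t in range(1, horizon + 1):
        new_s = [[0.0] * k for _ in range(n_agents)]
        new_c = [[0.0] * k for _ in range(n_agents)]
        for i in range(n_agents):
            if t <= k:
                arm = t - 1  # round-robin so every count starts positive
            else:
                # Heuristic UCB1-style index on the consensus estimates;
                # n_agents * c[i][a] approximates the network pull count.
                log_term = 2.0 * math.log(t * n_agents)
                arm = max(range(k), key=lambda a: s[i][a] / c[i][a]
                          + math.sqrt(log_term / (n_agents * c[i][a])))
            reward = 1.0 if rng.random() < means[arm] else 0.0
            new_s[i][arm] = reward
            new_c[i][arm] = 1.0
            pulls[arm] += 1
        # Running consensus: mix the old estimates, then add new observations.
        s_mixed, c_mixed = mix(W, s), mix(W, c)
        s = [[s_mixed[i][a] + new_s[i][a] for a in range(k)]
             for i in range(n_agents)]
        c = [[c_mixed[i][a] + new_c[i][a] for a in range(k)]
             for i in range(n_agents)]
    return pulls, s, c


pulls, s, c = run([0.2, 0.5, 0.8])
print(pulls)
```

Because the mixing matrix is doubly stochastic, the per-arm sums of `c` over agents always equal the true network pull counts, so each agent's estimate tracks the network average even though it only hears from its two neighbors.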

Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:1810.04468 [cs.LG] (or arXiv:1810.04468v1 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.1810.04468

Submission history

From: David Martínez-Rubio
[v1] Wed, 10 Oct 2018 11:46:20 UTC (26 KB)
[v2] Thu, 24 Oct 2019 13:19:01 UTC (315 KB)
