Improved Algorithms for Bandit with Graph Feedback via Regret Decomposition

He, Yuchen; Zhang, Chihao

arXiv:2205.15076 (cs)

[Submitted on 30 May 2022 (v1), last revised 4 Aug 2023 (this version, v2)]

Abstract: The problem of bandit with graph feedback generalizes both the multi-armed bandit (MAB) problem and the learning with expert advice problem by encoding in a directed graph how the loss vector can be observed in each round of the game. The minimax regret is closely related to the structure of the feedback graph, and their connection is far from being fully understood. We propose a new algorithmic framework for the problem based on a partition of the feedback graph. Our analysis reveals the interplay between various parts of the graph by decomposing the regret into the sum of the regret caused by small parts and the regret caused by their interaction. As a result, our algorithm can be viewed as an interpolation and generalization of the optimal algorithms for MAB and learning with expert advice. Our framework unifies previous algorithms for both strongly observable graphs and weakly observable graphs, resulting in improved and optimal regret bounds on a wide range of graph families, including graphs of bounded degree and strongly observable graphs with a few corrupted arms.
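
To make the feedback model concrete, the following minimal Python sketch shows how a directed graph can encode which losses are observed in a round, and how MAB and expert advice arise as the two extreme graphs. It is not the paper's algorithm. The function names (`observed_losses`, `classify`, `bandit_graph`, `experts_graph`) are hypothetical, and the definitions of strongly and weakly observable graphs are the standard ones from the graph-feedback literature, assumed here because the abstract does not state them.

```python
from typing import Dict, List, Set

# Assumed encoding: graph[i] is the set of arms whose loss is revealed
# when arm i is played (the out-neighbors of i in the feedback graph).
Graph = Dict[int, Set[int]]


def observed_losses(graph: Graph, played: int, losses: List[float]) -> Dict[int, float]:
    """Return the losses revealed to the learner after playing `played`."""
    return {j: losses[j] for j in graph[played]}


def classify(graph: Graph) -> str:
    """Classify a feedback graph under the standard (assumed) definitions.

    An arm is strongly observable if it has a self-loop or is observed
    by every other arm. A graph is strongly observable if every arm is;
    weakly observable if every arm has at least one in-neighbor but some
    arm is not strongly observable; unobservable otherwise.
    """
    arms = set(graph)
    in_nbrs = {j: {i for i in arms if j in graph[i]} for j in arms}
    if any(not in_nbrs[j] for j in arms):
        return "unobservable"
    strongly = all(j in in_nbrs[j] or arms - {j} <= in_nbrs[j] for j in arms)
    return "strongly observable" if strongly else "weakly observable"


def bandit_graph(k: int) -> Graph:
    """MAB: only self-loops, so playing an arm reveals only its own loss."""
    return {i: {i} for i in range(k)}


def experts_graph(k: int) -> Graph:
    """Expert advice: complete graph with self-loops, every loss is revealed."""
    return {i: set(range(k)) for i in range(k)}


if __name__ == "__main__":
    losses = [0.2, 0.5, 0.9]
    print(observed_losses(bandit_graph(3), 1, losses))
    print(observed_losses(experts_graph(3), 1, losses))
    # Arm 1 has no self-loop and is not observed by arm 2.
    print(classify({0: {1, 2}, 1: {0}, 2: {0}}))
```

Representing the graph as out-neighbor sets keeps the observation step a single lookup; the classification recomputes in-neighbors, which is fine for the small illustrative graphs used here.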

Subjects:	Machine Learning (cs.LG); Data Structures and Algorithms (cs.DS); Machine Learning (stat.ML)
Cite as:	arXiv:2205.15076 [cs.LG]
	(or arXiv:2205.15076v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2205.15076

Submission history

From: Yuchen He [view email]
[v1] Mon, 30 May 2022 13:07:42 UTC (36 KB)
[v2] Fri, 4 Aug 2023 05:13:42 UTC (36 KB)
