Efficient Contextual Bandits with Uninformed Feedback Graphs

Zhang, Mengxiao; Zhang, Yuheng; Luo, Haipeng; Mineiro, Paul

arXiv:2402.08127 (cs)

[Submitted on 12 Feb 2024]

Abstract: Bandits with feedback graphs are powerful online learning models that interpolate between the full information and classic bandit problems, capturing many real-life applications. A recent work by Zhang et al. (2023) studies the contextual version of this problem and proposes an efficient and optimal algorithm via a reduction to online regression. However, their algorithm crucially relies on seeing the feedback graph before making each decision, while in many applications, the feedback graph is uninformed, meaning that it is either only revealed after the learner makes her decision or even never fully revealed at all. This work develops the first contextual algorithm for such uninformed settings, via an efficient reduction to online regression over both the losses and the graphs. Importantly, we show that it is critical to learn the graphs using log loss instead of squared loss to obtain favorable regret guarantees. We also demonstrate the empirical effectiveness of our algorithm on a bidding application using both synthetic and real-world data.
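
The abstract's contrast between log loss and squared loss for learning the graph can be illustrated with a small numerical sketch. This is not the paper's algorithm or analysis; it only shows one general property of the two losses, under the assumption that each edge of the feedback graph is observed as a Bernoulli outcome with some true probability. The names `expected_excess`, `p_true`, and `p_hat` are hypothetical and chosen for illustration. The sketch computes how much expected loss a predictor incurs, beyond the best possible, when it badly underestimates a rare edge.

```python
import math


def squared_loss(p, y):
    """Squared loss of predicting probability p for a binary outcome y."""
    return (p - y) ** 2


def log_loss(p, y, eps=1e-12):
    """Log loss of predicting probability p for a binary outcome y."""
    p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def expected_excess(loss, p_hat, p_true):
    """Expected loss of p_hat minus that of p_true when y ~ Bernoulli(p_true)."""

    def expected_loss(p):
        return p_true * loss(p, 1) + (1 - p_true) * loss(p, 0)

    return expected_loss(p_hat) - expected_loss(p_true)


# Hypothetical numbers: an edge that is present 1% of the time,
# predicted to be present only 0.01% of the time.
p_true, p_hat = 0.01, 0.0001

sq_gap = expected_excess(squared_loss, p_hat, p_true)   # equals (p_hat - p_true)^2
log_gap = expected_excess(log_loss, p_hat, p_true)      # equals KL(p_true || p_hat)

print(f"squared-loss excess: {sq_gap:.6g}")
print(f"log-loss excess:     {log_gap:.6g}")
```

Under these assumed numbers the squared-loss excess is tiny while the log-loss excess is orders of magnitude larger, so a regression oracle with small log-loss regret cannot be far off in relative terms on rare edges, whereas one with small squared-loss regret can. Whether this is the mechanism behind the paper's regret guarantees is not stated in the abstract.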

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2402.08127 [cs.LG]
	(or arXiv:2402.08127v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2402.08127

Submission history

From: Mengxiao Zhang
[v1] Mon, 12 Feb 2024 23:50:47 UTC (3,504 KB)