Multi-Action Restless Bandits with Weakly Coupled Constraints: Simultaneous Learning and Control

Fu, Jing; Moran, Bill; Niño-Mora, José

Mathematics > Optimization and Control

arXiv:2412.03326 (math)

[Submitted on 4 Dec 2024]

Abstract: We study a system with finitely many groups of multi-action bandit processes, each of which is a Markov decision process (MDP) with finite state and action spaces and potentially different transition matrices under different actions. Bandit processes in the same group share the same state and action spaces and, for any given action, the same transition matrix. All bandit processes across the groups are subject to multiple weakly coupled constraints over their state and action variables. Unlike past studies that focused on the offline case, we consider the online case, in which the transition matrices and reward functions are not fully known a priori, and propose an effective scheme that enables simultaneous learning and control. We prove that the relevant processes converge both over time and as the number of bandit processes grows, which we refer to as convergence in the time and magnitude dimensions, respectively. Moreover, we prove that the relevant processes converge exponentially fast in the magnitude dimension, so that the performance deviation between the proposed online algorithms and offline optimality diminishes exponentially.
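To make the model in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It builds groups of bandit processes that share per-action transition matrices, evaluates the left-hand side of one linear coupled constraint, and estimates the unknown transition matrices from observed transitions. All names (`make_group`, `coupled_cost`, `step`, `estimate_P`), the group sizes, the random cost table, and the uniformly random action choice are assumptions made for illustration only; the paper's actual control scheme is not reproduced here.

```python
import numpy as np

def make_group(n_states, n_actions, n_procs, rng):
    """One group: all processes share P[a] (a transition matrix per action)."""
    P = rng.random((n_actions, n_states, n_states))
    P /= P.sum(axis=2, keepdims=True)  # each row becomes a distribution
    return {
        "P": P,                                            # hidden from the learner
        "cost": rng.random((n_states, n_actions)),         # hypothetical resource usage
        "states": rng.integers(n_states, size=n_procs),    # current state of each process
        "counts": np.zeros((n_actions, n_states, n_states)),  # observed transitions
    }

def coupled_cost(groups, actions):
    """Left-hand side of one coupled constraint, summed over all processes."""
    return float(sum(g["cost"][g["states"], a].sum() for g, a in zip(groups, actions)))

def step(group, actions, rng):
    """Advance every process in the group and record the observed transition."""
    n_states = group["P"].shape[1]
    for i, (s, a) in enumerate(zip(group["states"], actions)):
        s_next = rng.choice(n_states, p=group["P"][a, s])
        group["counts"][a, s, s_next] += 1
        group["states"][i] = s_next

def estimate_P(group):
    """Empirical transition matrices; unvisited (action, state) rows stay uniform."""
    counts = group["counts"]
    totals = counts.sum(axis=2, keepdims=True)
    uniform = np.full_like(counts, 1.0 / counts.shape[2])
    return np.divide(counts, totals, out=uniform, where=totals > 0)

rng = np.random.default_rng(0)
groups = [make_group(3, 2, 5, rng), make_group(2, 3, 4, rng)]
for _ in range(200):
    # Placeholder policy: uniformly random actions (an assumption, not the paper's scheme).
    actions = [rng.integers(g["P"].shape[0], size=len(g["states"])) for g in groups]
    lhs = coupled_cost(groups, actions)  # would be compared against a budget
    for g, a in zip(groups, actions):
        step(g, a, rng)
P_hat = estimate_P(groups[0])
```

In this sketch the constraint couples the processes only through a single sum over their state-action pairs, which is the sense in which such constraints are usually called weakly coupled; a real controller would choose actions so that `lhs` respects a budget rather than sampling them at random.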

Comments:	70 pages, 0 figures
Subjects:	Optimization and Control (math.OC); Machine Learning (cs.LG); Probability (math.PR)
MSC classes:	90B36 (Primary) 90B15, 90B22 (Secondary)
Cite as:	arXiv:2412.03326 [math.OC]
	(or arXiv:2412.03326v1 [math.OC] for this version)
	https://doi.org/10.48550/arXiv.2412.03326

Submission history

From: Jing Fu
[v1] Wed, 4 Dec 2024 13:57:20 UTC (126 KB)
