Operator Splitting for Convex Constrained Markov Decision Processes

Grontas, Panagiotis D.; Tsiamis, Anastasios; Lygeros, John

arXiv:2412.14002 (math)

[Submitted on 18 Dec 2024]

Abstract: We consider finite Markov decision processes (MDPs) with convex constraints and known dynamics. In principle, this problem is amenable to off-the-shelf convex optimization solvers, but this approach typically scales poorly. In this work, we develop a first-order algorithm, based on Douglas-Rachford splitting, that allows us to decouple the dynamics from the constraints. Thanks to this decoupling, we can incorporate a wide variety of convex constraints. Our scheme consists of simple and easy-to-implement updates that alternate between solving a regularized MDP and computing a projection. The inherent presence of regularized updates ensures last-iterate convergence and numerical stability and, contrary to existing approaches, does not require us to regularize the problem explicitly. If the constraints are not attainable, we exploit salient properties of the Douglas-Rachford algorithm to detect infeasibility and compute a policy that minimally violates the constraints. We demonstrate the performance of our algorithm on two benchmark problems and show that it compares favorably to competing approaches.
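
To make the alternating structure concrete, the following is a minimal sketch of generic Douglas-Rachford splitting on a toy problem. It is not the algorithm from the paper. As an assumption for illustration, a closed-form quadratic proximal step stands in for the regularized MDP update, and a box projection stands in for the projection onto the constraint set. All function names and parameters (`prox_quadratic`, `project_box`, `douglas_rachford`, `gamma`, `iters`) are hypothetical.

```python
# Toy problem: minimize 0.5 * ||x - c||^2 subject to lo <= x_i <= hi.
# The exact solution is c clipped to the box, which lets us check the result.

def prox_quadratic(z, c, gamma):
    """Proximal operator of gamma * 0.5 * ||x - c||^2 evaluated at z.

    Illustrative stand-in for a regularized subproblem with a closed form.
    """
    return [(zi + gamma * ci) / (1.0 + gamma) for zi, ci in zip(z, c)]


def project_box(v, lo, hi):
    """Euclidean projection of v onto the box [lo, hi]^n."""
    return [min(max(vi, lo), hi) for vi in v]


def douglas_rachford(c, lo, hi, gamma=1.0, iters=200):
    """Run a fixed number of Douglas-Rachford iterations on the toy problem."""
    z = [0.0] * len(c)  # auxiliary (governing) sequence
    y = project_box(z, lo, hi)
    for _ in range(iters):
        # Step 1: proximal (regularized) update on the smooth term.
        x = prox_quadratic(z, c, gamma)
        # Step 2: project the reflected point 2x - z onto the constraint set.
        y = project_box([2.0 * xi - zi for xi, zi in zip(x, z)], lo, hi)
        # Step 3: update the auxiliary sequence with the residual y - x.
        z = [zi + yi - xi for zi, yi, xi in zip(z, y, x)]
    return y  # feasible by construction, since it is a projection output


solution = douglas_rachford([2.0, -0.5, 0.3], lo=0.0, hi=1.0)
```

In this sketch the two steps never need to know about each other's internals, which is the kind of decoupling the abstract refers to. Swapping in a different convex set only changes `project_box`.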

Comments:	Submitted to IEEE Transactions on Automatic Control
Subjects:	Optimization and Control (math.OC); Systems and Control (eess.SY)
Cite as:	arXiv:2412.14002 [math.OC]
	(or arXiv:2412.14002v1 [math.OC] for this version)
	https://doi.org/10.48550/arXiv.2412.14002

Submission history

From: Panagiotis Grontas
[v1] Wed, 18 Dec 2024 16:17:25 UTC (939 KB)
