Gaining efficiency in deep policy gradient method for continuous-time optimal control problems

Fahim, Arash; Rahman, Md. Arafatur

arXiv:2502.14141 (math)

[Submitted on 19 Feb 2025]

Abstract: In this paper, we propose an efficient implementation of the deep policy gradient method (PGM) for optimal control problems in continuous time. The proposed method can manage the allocation of computational resources, the number of trajectories, and the complexity of the neural network architecture. This is particularly important for continuous-time problems that require a fine time discretization. Each step of the method focuses on a different time scale and learns a policy, modeled by a neural network, for a discretized optimal control problem. The first step uses the coarsest time discretization, and the discretization becomes finer at each subsequent step. The policy trained in each step is also used to provide data for the next step. We accompany the multi-scale deep PGM with a theoretical result on the allocation of computational resources needed to reach a targeted efficiency, and we test our method on the linear-quadratic stochastic optimal control problem.
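The coarse-to-fine idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation, and it makes several simplifying assumptions that are not in the paper. A piecewise-constant linear feedback gain stands in for the neural network policy. The expected cost of a scalar linear-quadratic problem is computed exactly from the second-moment recursion of the Euler scheme instead of from simulated trajectories. Central finite differences stand in for backpropagation. "Providing data for the next step" is read here as a warm start that copies each coarse gain onto its two finer sub-intervals. All function names and parameter values (`cost`, `grad`, `train`, `refine`, `multiscale`, `a`, `b`, `sigma`, `q`, `r`, `g`, `x0`) are hypothetical.

```python
def cost(gains, T=1.0, a=0.0, b=1.0, sigma=0.5, q=1.0, r=1.0, g=1.0, x0=1.0):
    """Exact expected cost of the Euler-discretized scalar LQ problem
    dX = (a X + b u) dt + sigma dW under the feedback u_n = -k_n X_n."""
    dt = T / len(gains)
    m2 = x0 * x0                      # second moment E[X_n^2]
    total = 0.0
    for k in gains:
        total += (q + r * k * k) * m2 * dt           # running cost on this step
        m2 = (1.0 + (a - b * k) * dt) ** 2 * m2 + sigma * sigma * dt
    return total + g * m2                            # add terminal cost


def grad(gains, eps=1e-6):
    """Central finite-difference gradient of the cost (assumption: this
    replaces the backpropagated gradient of a deep policy)."""
    out = []
    for i in range(len(gains)):
        up, dn = list(gains), list(gains)
        up[i] += eps
        dn[i] -= eps
        out.append((cost(up) - cost(dn)) / (2.0 * eps))
    return out


def train(gains, iters=200, lr0=0.1):
    """Gradient descent that only accepts steps which lower the cost."""
    gains = list(gains)
    lr = lr0 * len(gains)             # gradient entries shrink like dt = T/n
    best = cost(gains)
    for _ in range(iters):
        d = grad(gains)
        trial = [k - lr * di for k, di in zip(gains, d)]
        c = cost(trial)
        if c < best:
            gains, best = trial, c
        else:
            lr *= 0.5                 # step too large: shrink and retry
    return gains, best


def refine(gains):
    """Halve the time step: copy each coarse gain onto two fine sub-intervals."""
    return [k for k in gains for _ in range(2)]


def multiscale(levels=4, n0=2, iters=200):
    """Train on the coarsest grid first, then warm-start each finer grid."""
    gains = [0.0] * n0
    history = []                      # (time steps, cost before, cost after)
    for level in range(levels):
        if level > 0:
            gains = refine(gains)
        start = cost(gains)
        gains, end = train(gains, iters)
        history.append((len(gains), start, end))
    return gains, history


if __name__ == "__main__":
    _, history = multiscale()
    for n, start, end in history:
        print(n, start, end)
```

In this toy setting each finer level starts from the refined coarse policy rather than from scratch, so most of the optimization work is done while the grid is still coarse and each cost evaluation is cheap. How the paper actually splits computation across levels is governed by its theoretical result, which this sketch does not reproduce.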

Comments:	20 pages, 4 figures
Subjects:	Optimization and Control (math.OC); Computational Finance (q-fin.CP)
MSC classes:	49M25, 90-08
Cite as:	arXiv:2502.14141 [math.OC]
	(or arXiv:2502.14141v1 [math.OC] for this version)
	https://doi.org/10.48550/arXiv.2502.14141

Submission history

From: Arash Fahim
[v1] Wed, 19 Feb 2025 22:56:44 UTC (939 KB)
