On the Nonsmooth Geometry and Neural Approximation of the Optimal Value Function of Infinite-Horizon Pendulum Swing-up

Han, Haoyu; Yang, Heng

arXiv:2312.17467 (math)

[Submitted on 29 Dec 2023 (v1), last revised 2 Aug 2024 (this version, v3)]

Abstract: We revisit the inverted pendulum problem with the goal of understanding and computing the true optimal value function. We start with the observation that the true optimal value function must be nonsmooth (i.e., not globally $C^1$) due to the symmetry of the problem. We then give a result that can certify the optimality of a candidate $\textit{piece-wise}$ $C^1$ value function. Further, for a candidate value function obtained via numerical approximation, we provide a suboptimality bound based on its Hamilton-Jacobi-Bellman (HJB) equation residuals. Inspired by Holzhuter (2004), we then design an algorithm that solves the ODE of Pontryagin's minimum principle (PMP) backward in time, starting from terminal conditions provided by the locally optimal LQR value function. This numerical procedure yields a piece-wise $C^1$ value function whose nonsmooth region contains periodic $\textit{spiral lines}$ and whose smooth regions attain HJB residuals of about $10^{-4}$; it is hence certified to be the optimal value function up to minor numerical inaccuracies. This optimal value function demonstrates the power of optimality: (i) it sits above a polynomial lower bound; (ii) its induced controller globally swings up and stabilizes the pendulum; and (iii) it attains lower trajectory cost than baseline methods such as energy shaping, model predictive control (MPC), and proximal policy optimization (with MPC attaining almost the same cost). We conclude by distilling the optimal value function into a simple neural network. Our code is available at this https URL.
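As a rough illustration of the HJB residual mentioned in the abstract, the following sketch evaluates the residual of the LQR value function on a nonlinear pendulum. It is not the authors' code. Everything specific here is an assumption made for illustration: unit mass and length, no damping, the angle convention (theta = 0 is upright), the quadratic running cost x^T Q x + R u^2, and the helper names `f0`, `lqr_cost_matrix`, and `hjb_residual`. The Riccati equation is solved with a standard Hamiltonian eigenvector method rather than whatever solver the paper uses.

```python
import numpy as np

# Hypothetical parameters, not taken from the paper.
G = 9.81                   # gravity over length, with unit mass and length
Q = np.diag([1.0, 1.0])    # assumed state weight in the cost x^T Q x + R u^2
R = 1.0                    # assumed control weight
B = np.array([0.0, 1.0])   # torque enters the angular acceleration only


def f0(x):
    """Drift of the undamped pendulum; theta = 0 is the upright equilibrium."""
    theta, omega = x
    return np.array([omega, G * np.sin(theta)])


def lqr_cost_matrix():
    """Solve A^T P + P A - P B R^-1 B^T P + Q = 0 for the upright linearization.

    Uses the stable invariant subspace of the Hamiltonian matrix.
    """
    A = np.array([[0.0, 1.0], [G, 0.0]])
    S = np.outer(B, B) / R
    H = np.block([[A, -S], [-Q, -A.T]])
    eigvals, eigvecs = np.linalg.eig(H)
    stable = eigvecs[:, eigvals.real < 0]   # two stable directions out of four
    X1, X2 = stable[:2], stable[2:]
    return np.real(X2 @ np.linalg.inv(X1))


def hjb_residual(x, grad_v):
    """Residual of min_u [x^T Q x + R u^2 + grad_v . (f0 + B u)].

    The minimizing control is u = -(B . grad_v) / (2 R), substituted in closed form.
    """
    p = grad_v(x)
    return x @ Q @ x + p @ f0(x) - (B @ p) ** 2 / (4.0 * R)


P = lqr_cost_matrix()


def grad_v_lqr(x):
    """Gradient of the quadratic candidate V(x) = x^T P x."""
    return 2.0 * P @ x


near = hjb_residual(np.array([1e-3, 0.0]), grad_v_lqr)
far = hjb_residual(np.array([1.0, 0.0]), grad_v_lqr)
print(f"residual near upright: {near:.3e}, one radian away: {far:.3e}")
```

Under these assumptions the residual is negligible close to the upright equilibrium and grows once the linearization stops being accurate, which is consistent with using the LQR value function only as a local terminal condition.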

Comments:	Published at the 6th Learning for Dynamics and Control Conference (L4DC)
Subjects:	Optimization and Control (math.OC)
Cite as:	arXiv:2312.17467 [math.OC]
	(or arXiv:2312.17467v3 [math.OC] for this version)
	https://doi.org/10.48550/arXiv.2312.17467

Submission history

From: Haoyu Han
[v1] Fri, 29 Dec 2023 04:32:20 UTC (43,735 KB)
[v2] Sat, 25 May 2024 11:34:54 UTC (43,735 KB)
[v3] Fri, 2 Aug 2024 12:05:57 UTC (43,735 KB)
