Mathematics > Optimization and Control
[Submitted on 26 Jan 2023 (v1), last revised 18 Dec 2024 (this version, v3)]
Title:Another Look at Partially Observed Optimal Stochastic Control: Existence, Ergodicity, and Approximations without Belief-Reduction
View PDF HTML (experimental)Abstract:We present an alternative view for the study of optimal control of partially observed Markov Decision Processes (POMDPs). We first revisit the traditional (and by now standard) separated-design method of reducing the problem to fully observed MDPs (belief-MDPs), and present conditions for the existence of optimal policies. Then, rather than working with this standard method, we define a Markov chain taking values in an infinite dimensional product space with the history process serving as the controlled state process and a further refinement in which the control actions and the state process are causally conditionally independent given the measurement/information process. We provide new sufficient conditions for the existence of optimal control policies under the discounted cost and average cost infinite horizon criteria. For the discounted cost setup, we establish near optimality of finite window policies via a direct argument involving near optimality of quantized approximations for MDPs under weak Feller continuity, where finite truncations of memory can be viewed as quantizations of infinite memory with a uniform diameter in each finite window restriction under the product metric. For the average cost setup, we provide new existence conditions and also a general approach on how to initialize the randomness which we show to establish convergence to optimal cost. In the control-free case, our analysis leads to new and weak conditions for the existence and uniqueness of invariant probability measures for nonlinear filter processes.
Submission history
From: Serdar Yüksel [view email][v1] Thu, 26 Jan 2023 17:26:42 UTC (73 KB)
[v2] Tue, 6 Feb 2024 15:48:07 UTC (70 KB)
[v3] Wed, 18 Dec 2024 23:22:40 UTC (82 KB)
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Connected Papers (What is Connected Papers?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
alphaXiv (What is alphaXiv?)
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Hugging Face (What is Huggingface?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.