Theoretical Hardness and Tractability of POMDPs in RL with Partial Hindsight State Information

Shi, Ming; Liang, Yingbin; Shroff, Ness

arXiv:2306.08762v1 (cs)

[Submitted on 14 Jun 2023 (this version), latest version 11 Mar 2024 (v3)]

Abstract: Partially observable Markov decision processes (POMDPs) have been widely applied to capture many real-world applications. However, existing theoretical results have shown that learning in general POMDPs could be intractable, where the main challenge lies in the lack of latent state information. A key fundamental question here is how much hindsight state information (HSI) is sufficient to achieve tractability. In this paper, we establish a lower bound that reveals a surprising hardness result: unless we have full HSI, we need an exponentially scaling sample complexity to obtain an $\epsilon$-optimal policy solution for POMDPs. Nonetheless, from the key insights in our lower-bound construction, we find that there exist important tractable classes of POMDPs even with partial HSI. In particular, for two novel classes of POMDPs with partial HSI, we provide new algorithms that are shown to be near-optimal by establishing new upper and lower bounds.

Comments:	Submitted for publication
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2306.08762 [cs.LG]
	(or arXiv:2306.08762v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2306.08762

Submission history

From: Ming Shi
[v1] Wed, 14 Jun 2023 22:20:46 UTC (101 KB)
[v2] Thu, 12 Oct 2023 00:07:07 UTC (108 KB)
[v3] Mon, 11 Mar 2024 19:13:04 UTC (110 KB)
