Theoretical Hardness and Tractability of POMDPs in RL with Partial Online State Information

Shi, Ming; Liang, Yingbin; Shroff, Ness

arXiv:2306.08762 (cs)

[Submitted on 14 Jun 2023 (v1), last revised 11 Mar 2024 (this version, v3)]

Abstract: Partially observable Markov decision processes (POMDPs) have been widely applied in various real-world applications. However, existing theoretical results have shown that learning in POMDPs is intractable in the worst case, where the main challenge lies in the lack of latent state information. A key fundamental question here is: how much online state information (OSI) is sufficient to achieve tractability? In this paper, we establish a lower bound that reveals a surprising hardness result: unless we have full OSI, we need an exponentially scaling sample complexity to obtain an $\epsilon$-optimal policy solution for POMDPs. Nonetheless, inspired by the insights in our lower-bound design, we identify important tractable subclasses of POMDPs, even with only partial OSI. In particular, for two subclasses of POMDPs with partial OSI, we provide new algorithms that are proved to be near-optimal by establishing new regret upper and lower bounds. Both our algorithm design and regret analysis involve non-trivial developments for joint OSI query and action control.

Comments:	Submitted for publication
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2306.08762 [cs.LG]
	(or arXiv:2306.08762v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2306.08762

Submission history

From: Ming Shi
[v1] Wed, 14 Jun 2023 22:20:46 UTC (101 KB)
[v2] Thu, 12 Oct 2023 00:07:07 UTC (108 KB)
[v3] Mon, 11 Mar 2024 19:13:04 UTC (110 KB)
