J-P: MDP. FP. PP.: Characterizing Total Expected Rewards in Markov Decision Processes as Least Fixed Points with an Application to Operational Semantics of Probabilistic Programs (Technical Report)

Batz, Kevin; Kaminski, Benjamin Lucien; Matheja, Christoph; Winkler, Tobias

doi:10.1007/978-3-031-75783-9_11

arXiv:2411.16564 (cs)

[Submitted on 25 Nov 2024]

Abstract: Markov decision processes (MDPs) with rewards are a widespread and well-studied model for systems that make both probabilistic and nondeterministic choices. A fundamental result about MDPs is that their minimal and maximal expected rewards satisfy Bellman's optimality equations. For various classes of MDPs (notably finite-state MDPs, positive bounded models, and negative models), expected rewards are known to be the least solution of those equations. However, these classes of MDPs are too restrictive for probabilistic program verification. In particular, they assume that all rewards are finite. This is already not the case for the expected runtime of a simple probabilistic program modeling a one-dimensional random walk.
In this paper, we develop a generalized least fixed point characterization of expected rewards in MDPs without those restrictions. Furthermore, we demonstrate how said characterization can be leveraged to prove weakest-preexpectation-style calculi sound with respect to an operational MDP model.
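
To illustrate why the abstract speaks of the least solution of the Bellman equations, the following is a minimal Python sketch on a finite toy MDP. It is not taken from the paper: the MDP, the encoding of actions as (reward, distribution) pairs, and the names `TOY_MDP`, `bellman`, and `kleene_iterate` are all assumptions made for illustration. The sketch iterates the Bellman operator starting from the all-zero valuation, which in this finite, nonnegative-reward example approaches the least fixed point; it does not cover the infinite-state or infinite-reward setting the paper targets.

```python
from typing import Callable, Dict, Iterable, List, Tuple

State = str
# One action: (immediate reward, [(probability, successor state), ...]).
# This encoding is an assumption of the sketch, not the paper's definition.
Action = Tuple[float, List[Tuple[float, State]]]
MDP = Dict[State, List[Action]]

# Hypothetical toy MDP: in "s" one may either pay reward 1 and reach the
# absorbing state "t" with probability 0.5, or pay reward 3 and reach "t"
# directly. State "t" loops forever with reward 0.
TOY_MDP: MDP = {
    "s": [
        (1.0, [(0.5, "s"), (0.5, "t")]),
        (3.0, [(1.0, "t")]),
    ],
    "t": [(0.0, [(1.0, "t")])],
}


def bellman(
    mdp: MDP,
    values: Dict[State, float],
    opt: Callable[[Iterable[float]], float],
) -> Dict[State, float]:
    """Apply the Bellman operator once; opt is min or max over actions."""
    return {
        state: opt(
            reward + sum(p * values[succ] for p, succ in dist)
            for reward, dist in actions
        )
        for state, actions in mdp.items()
    }


def kleene_iterate(
    mdp: MDP,
    opt: Callable[[Iterable[float]], float],
    steps: int = 100,
) -> Dict[State, float]:
    """Iterate the Bellman operator from the all-zero valuation."""
    values = {state: 0.0 for state in mdp}
    for _ in range(steps):
        values = bellman(mdp, values, opt)
    return values


min_values = kleene_iterate(TOY_MDP, min)  # approaches {"s": 2, "t": 0}
max_values = kleene_iterate(TOY_MDP, max)  # reaches {"s": 3, "t": 0}
```

In this toy example the Bellman equations have more than one solution: the valuation `{"s": 8.0, "t": 5.0}` is also a fixed point of the maximizing operator, because the zero-reward self-loop in `t` leaves its value unconstrained. Only the least solution, with `t` at 0, matches the expected rewards.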

Subjects:	Logic in Computer Science (cs.LO); Programming Languages (cs.PL)
Cite as:	arXiv:2411.16564 [cs.LO]
	(or arXiv:2411.16564v1 [cs.LO] for this version)
	https://doi.org/10.48550/arXiv.2411.16564
Related DOI:	https://doi.org/10.1007/978-3-031-75783-9_11

Submission history

From: Kevin Batz
[v1] Mon, 25 Nov 2024 16:49:10 UTC (290 KB)
