Bounded Risk-Sensitive Markov Game and Its Inverse Reward Learning Problem

Tian, Ran; Sun, Liting; Tomizuka, Masayoshi

arXiv:2009.01495v1 (cs)

[Submitted on 3 Sep 2020 (this version), latest version 21 Mar 2021 (v7)]

Abstract: Classical game-theoretic approaches for multi-agent systems, in both the forward policy learning/design problem and the inverse reward learning problem, often make strong rationality assumptions: agents are perfectly rational expected-utility maximizers. Specifically, the agents are risk-neutral to all uncertainties, maximize their expected rewards, and have unlimited computation resources to explore such policies. Such assumptions, however, are substantially at odds with many observed human behaviors, such as satisficing with sub-optimal policies and making risk-seeking and loss-averse decisions. In this paper, we investigate the bounded risk-sensitive Markov Game (BRSMG) and its inverse reward learning problem. Instead of assuming unlimited computation resources, we consider the influence of bounded intelligence by exploiting iterative reasoning models in BRSMG. Instead of assuming agents maximize their expected utilities (a risk-neutral measure), we consider the impact of risk-sensitive measures such as cumulative prospect theory. Convergence analyses of BRSMG for both forward policy learning and inverse reward learning are established. The proposed forward policy learning and inverse reward learning algorithms in BRSMG are validated through a navigation scenario. Simulation results show that the behaviors of agents in BRSMG exhibit both risk-averse and risk-seeking phenomena, which are consistent with observations from humans. Moreover, in the inverse reward learning task, the proposed bounded risk-sensitive inverse learning algorithm outperforms the baseline risk-neutral inverse learning algorithm.

Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:2009.01495 [cs.LG] (or arXiv:2009.01495v1 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2009.01495

Submission history

From: Ran Tian
[v1] Thu, 3 Sep 2020 07:32:32 UTC (2,420 KB)
[v2] Sat, 5 Sep 2020 07:23:16 UTC (2,446 KB)
[v3] Sun, 8 Nov 2020 07:32:00 UTC (2,451 KB)
[v4] Sun, 15 Nov 2020 19:04:59 UTC (2,451 KB)
[v5] Sat, 19 Dec 2020 04:55:32 UTC (2,445 KB)
[v6] Sat, 13 Feb 2021 04:01:25 UTC (2,448 KB)
[v7] Sun, 21 Mar 2021 02:10:20 UTC (1,884 KB)
