Toward negotiable reinforcement learning: shifting priorities in Pareto optimal sequential decision-making

Critch, Andrew


arXiv:1701.01302 (cs)

[Submitted on 5 Jan 2017 (v1), last revised 13 May 2017 (this version, v3)]


Abstract: Existing multi-objective reinforcement learning (MORL) algorithms do not account for objectives that arise from players with differing beliefs. Concretely, consider two players with different beliefs and utility functions who may cooperate to build a machine that takes actions on their behalf. A representation is needed for how much the machine's policy will prioritize each player's interests over time. Assuming the players have reached common knowledge of their situation, this paper derives a recursion that any Pareto optimal policy must satisfy. Two qualitative observations can be made from the recursion: the machine must (1) use each player's own beliefs in evaluating how well an action will serve that player's utility function, and (2) shift the relative priority it assigns to each player's expected utilities over time, by a factor proportional to how well that player's beliefs predict the machine's inputs. Observation (2) represents a substantial divergence from naïve linear utility aggregation (as in Harsanyi's utilitarian theorem, and existing MORL algorithms), which is shown here to be inadequate for Pareto optimal sequential decision-making on behalf of players with different beliefs.
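Observation (2) can be illustrated with a minimal sketch. It is not the paper's recursion or notation: it assumes a hypothetical setting in which each player's beliefs are summarized by the probability that player assigned to the observation the machine actually received. The function name `shift_priorities`, its parameters, and the normalization step are all illustrative assumptions. Each player's priority weight is multiplied by that likelihood, so the player whose beliefs predict the machine's inputs better gains relative priority.

```python
from typing import List, Sequence


def shift_priorities(weights: Sequence[float],
                     likelihoods: Sequence[float]) -> List[float]:
    """Rescale each player's priority by how well that player's beliefs
    predicted the latest observation, then normalize to sum to 1.

    weights[i]:     current priority given to player i (hypothetical name)
    likelihoods[i]: probability player i's beliefs assigned to the
                    observation the machine actually received
    Normalization is an assumption made here for readability; it does not
    change the relative priorities.
    """
    if len(weights) != len(likelihoods):
        raise ValueError("one likelihood is needed per player")
    scaled = [w * p for w, p in zip(weights, likelihoods)]
    total = sum(scaled)
    if total == 0:
        raise ValueError("no player gave the observation positive probability")
    return [s / total for s in scaled]


# Two players start with equal priority. On each of two steps, player 1's
# beliefs gave the observation probability 0.8 and player 2's gave it 0.2.
weights = [0.5, 0.5]
for _ in range(2):
    weights = shift_priorities(weights, [0.8, 0.2])
```

Under these assumed numbers, player 1's share of the priority grows at every step, whereas a fixed linear aggregation would keep the weights at their initial values regardless of whose predictions were borne out.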

Subjects: Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Machine Learning (cs.LG)
Cite as: arXiv:1701.01302 [cs.AI] (or arXiv:1701.01302v3 [cs.AI] for this version)
DOI: https://doi.org/10.48550/arXiv.1701.01302

Submission history

From: Andrew Critch PhD
[v1] Thu, 5 Jan 2017 13:00:05 UTC (20 KB)
[v2] Tue, 10 Jan 2017 16:06:30 UTC (21 KB)
[v3] Sat, 13 May 2017 08:33:46 UTC (21 KB)
