Safe Exploration in Finite Markov Decision Processes with Gaussian Processes

Turchetta, Matteo; Berkenkamp, Felix; Krause, Andreas


arXiv:1606.04753 (cs)

[Submitted on 15 Jun 2016 (v1), last revised 15 Nov 2016 (this version, v2)]


Abstract: In classical reinforcement learning, agents exploring an environment accept arbitrary short-term loss for long-term gain. This is infeasible for safety-critical applications, such as robotics, where even a single unsafe action may cause system failure. In this paper, we address the problem of safely exploring finite Markov decision processes (MDPs). We define safety in terms of an a priori unknown safety constraint that depends on states and actions. We aim to explore the MDP under this constraint, assuming that the unknown function satisfies regularity conditions expressed via a Gaussian process prior. We develop a novel algorithm for this task and prove that it is able to completely explore the safely reachable part of the MDP without violating the safety constraint. To achieve this, it cautiously explores safe states and actions in order to gain statistical confidence about the safety of unvisited state-action pairs from noisy observations collected while navigating the environment. Moreover, the algorithm explicitly considers reachability when exploring the MDP, ensuring that it does not get stuck in any state with no safe way out. We demonstrate our method on digital terrain models for the task of exploring an unknown map with a rover.

Comments:	15 pages, extended version with proofs
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO); Machine Learning (stat.ML)
Cite as:	arXiv:1606.04753 [cs.LG]
	(or arXiv:1606.04753v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1606.04753
Journal reference:	Proc. of Advances in Neural Information Processing Systems (NIPS), 2016, pp. 4305-4313

Submission history

From: Matteo Turchetta
[v1] Wed, 15 Jun 2016 13:18:30 UTC (505 KB)
[v2] Tue, 15 Nov 2016 14:00:11 UTC (506 KB)
