Safe end-to-end imitation learning for model predictive control

Lee, Keuntaek; Saigol, Kamil; Theodorou, Evangelos A.

arXiv:1803.10231 (cs)

[Submitted on 27 Mar 2018 (v1), last revised 15 Feb 2019 (this version, v3)]

Abstract: We propose the use of Bayesian networks, which provide both a mean value and an uncertainty estimate as output, to enhance the safety of learned control policies under circumstances in which a test-time input differs significantly from the training set. Our algorithm combines reinforcement learning and end-to-end imitation learning to simultaneously learn a control policy as well as a threshold over the predictive uncertainty of the learned model, with no hand-tuning required. Corrective action, such as a return of control to the model predictive controller or human expert, is taken when the uncertainty threshold is exceeded. We validate our method on fully-observable and vision-based partially-observable systems in cart-pole and autonomous driving simulations, using deep convolutional Bayesian neural networks. We demonstrate that our method is robust to uncertainty resulting from varying system dynamics as well as from partial state observability.
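
The hand-off rule described in the abstract (act on the learned policy's mean output unless its predictive uncertainty exceeds a threshold, in which case control returns to the expert) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the Bayesian network is approximated by a small set of sampled policies, that uncertainty is measured as the standard deviation of their outputs, and that the threshold is a fixed number, whereas the paper learns it. The names `gated_action`, `sampled_policies`, `expert`, and `threshold` are hypothetical.

```python
import statistics
from typing import Callable, Sequence, Tuple

Policy = Callable[[Sequence[float]], float]


def gated_action(
    state: Sequence[float],
    sampled_policies: Sequence[Policy],
    expert: Policy,
    threshold: float,
) -> Tuple[float, bool]:
    """Return (action, used_expert) for one control step.

    Assumption: each element of sampled_policies stands in for one
    stochastic forward pass of a Bayesian network.
    """
    samples = [policy(state) for policy in sampled_policies]
    mean = statistics.fmean(samples)          # predictive mean
    uncertainty = statistics.pstdev(samples)  # predictive spread
    if uncertainty > threshold:
        # Corrective action: return control to the expert
        # (an MPC controller or a human in the paper's setting).
        return expert(state), True
    return mean, False


# Toy stand-ins (hypothetical): linear feedback laws with slightly
# different gains, so disagreement grows with distance from the origin.
policies = [lambda s, k=k: -k * s[0] for k in (0.9, 1.0, 1.1)]
expert = lambda s: -2.0 * s[0]

near_action, near_used_expert = gated_action([0.1], policies, expert, threshold=0.1)
far_action, far_used_expert = gated_action([5.0], policies, expert, threshold=0.1)
```

In this toy setup the state near the origin produces little disagreement among the sampled policies, so the mean action is used, while the state far from the origin produces a spread above the threshold and control is handed to the expert.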

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1803.10231 [cs.LG]
	(or arXiv:1803.10231v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1803.10231

Submission history

From: Kamil Saigol
[v1] Tue, 27 Mar 2018 15:47:29 UTC (4,422 KB)
[v2] Sun, 14 Oct 2018 00:46:27 UTC (4,422 KB)
[v3] Fri, 15 Feb 2019 03:37:16 UTC (4,423 KB)
