Imitation Learning by Reinforcement Learning

Ciosek, Kamil

arXiv:2108.04763 (stat)

[Submitted on 10 Aug 2021 (v1), last revised 15 Mar 2022 (this version, v2)]

Abstract: Imitation learning algorithms learn a policy from demonstrations of expert behavior. We show that, for deterministic experts, imitation learning can be done by reduction to reinforcement learning with a stationary reward. Our theoretical analysis both certifies the recovery of expert reward and bounds the total variation distance between the expert and the imitation learner, showing a link to adversarial imitation learning. We conduct experiments which confirm that our reduction works well in practice for continuous control tasks.
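
The abstract does not spell out the form of the stationary reward. As an illustrative assumption only, the sketch below uses an indicator reward for a small discrete problem: 1 for state-action pairs that appear in the expert demonstrations and 0 otherwise. The names `make_indicator_reward` and `expert_demos` are hypothetical and the data is a toy example; continuous control tasks would need a relaxed, distance-based variant, which is not shown here.

```python
from typing import Callable, Hashable, Iterable, Set, Tuple

StateAction = Tuple[Hashable, Hashable]


def make_indicator_reward(
    expert_pairs: Iterable[StateAction],
) -> Callable[[Hashable, Hashable], float]:
    """Build a fixed (stationary) reward from expert demonstrations.

    Assumption for illustration: the reward is 1.0 on state-action pairs
    seen in the demonstrations and 0.0 everywhere else.
    """
    support: Set[StateAction] = set(expert_pairs)

    def reward(state: Hashable, action: Hashable) -> float:
        # The reward depends only on (state, action), never on training time,
        # so it stays the same throughout reinforcement learning.
        return 1.0 if (state, action) in support else 0.0

    return reward


# Toy demonstrations from a hypothetical deterministic expert.
expert_demos = [(0, "right"), (1, "right"), (2, "up")]
reward = make_indicator_reward(expert_demos)
```

Because the reward is fixed before training starts, any standard reinforcement learning algorithm could in principle be run against it unchanged, in contrast to adversarial schemes where the reward signal shifts as a discriminator is updated.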

Comments:	Published in ICLR 2022
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:2108.04763 [stat.ML]
	(or arXiv:2108.04763v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2108.04763

Submission history

From: Kamil Ciosek
[v1] Tue, 10 Aug 2021 16:14:41 UTC (191 KB)
[v2] Tue, 15 Mar 2022 14:39:22 UTC (401 KB)
