Policy Transfer with Strategy Optimization

Yu, Wenhao; Liu, C. Karen; Turk, Greg


arXiv:1810.05751 (cs)

[Submitted on 12 Oct 2018 (v1), last revised 4 Dec 2018 (this version, v2)]



Abstract: Computer simulation provides an automatic and safe way for training robotic control policies to achieve complex tasks such as locomotion. However, a policy trained in simulation usually does not transfer directly to real hardware due to the differences between the two environments. Transfer learning using domain randomization is a promising approach, but it usually assumes that the target environment is close to the distribution of the training environments, thus relying heavily on accurate system identification. In this paper, we present a different approach that leverages domain randomization for transferring control policies to unknown environments. The key idea is that, instead of learning a single policy in the simulation, we simultaneously learn a family of policies that exhibit different behaviors. When tested in the target environment, we directly search for the best policy in the family based on the task performance, without the need to identify the dynamic parameters. We evaluate our method on five simulated robotic control problems with different discrepancies between the training and testing environments and demonstrate that our method can overcome larger modeling errors than training a robust policy or an adaptive policy.
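The search step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the "family of policies" is reduced to a linear feedback law indexed by a scalar `strategy`, the target environment is a toy 1-D system with an unknown parameter `env_param`, and the optimizer is a simple cross-entropy-style search chosen as a stand-in, since the abstract does not name one. The names `rollout_return` and `search_strategy` are hypothetical. The point it shows is that the search uses only task return measured in the target environment and never estimates `env_param`.

```python
import numpy as np

def rollout_return(strategy, env_param, steps=50):
    """Task return of one rollout in a toy 1-D environment (illustrative only).

    Dynamics: x <- x + env_param * u, with policy u = -strategy * x.
    Reward per step is -x^2, so the best return is 0.
    """
    x, total = 1.0, 0.0
    for _ in range(steps):
        u = -strategy * x          # policy selected from the family by `strategy`
        x = x + env_param * u      # target dynamics, unknown to the search
        total -= x * x
    return total

def search_strategy(evaluate, low, high, iters=20, pop=16, elite=4, seed=0):
    """Cross-entropy-style search over the strategy using task return only."""
    rng = np.random.default_rng(seed)
    mean, std = 0.5 * (low + high), 0.5 * (high - low)
    best, best_ret = float(mean), float(evaluate(mean))
    for _ in range(iters):
        cands = np.clip(rng.normal(mean, std, size=pop), low, high)
        rets = np.array([evaluate(c) for c in cands])
        order = np.argsort(rets)[::-1]             # highest return first
        if rets[order[0]] > best_ret:
            best, best_ret = float(cands[order[0]]), float(rets[order[0]])
        elites = cands[order[:elite]]
        mean, std = elites.mean(), elites.std() + 1e-3  # refit sampling distribution
    return best, best_ret

# Hypothetical target environment: its parameter is hidden inside `evaluate`.
hidden_param = 2.5
best, best_ret = search_strategy(lambda s: rollout_return(s, hidden_param), 0.0, 2.0)
print(f"selected strategy: {best:.3f}, return: {best_ret:.4f}")
```

In this toy setting the ideal strategy is `1 / env_param`, but the search recovers a value near it purely from observed returns, which mirrors the abstract's claim that no identification of the dynamic parameters is needed.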

Subjects:	Machine Learning (cs.LG); Robotics (cs.RO); Machine Learning (stat.ML)
Cite as:	arXiv:1810.05751 [cs.LG]
	(or arXiv:1810.05751v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1810.05751

Submission history

From: Wenhao Yu
[v1] Fri, 12 Oct 2018 22:53:30 UTC (4,307 KB)
[v2] Tue, 4 Dec 2018 16:36:47 UTC (2,560 KB)
