Learning When to Switch: Composing Controllers to Traverse a Sequence of Terrain Artifacts

Tidd, Brendan; Hudson, Nicolas; Cosgun, Akansel; Leitner, Jurgen

arXiv:2011.00440 (cs)

[Submitted on 1 Nov 2020 (v1), last revised 29 Sep 2021 (this version, v2)]

Abstract: Legged robots often use separate control policies that are highly engineered for traversing difficult terrain such as stairs, gaps, and steps, where switching between policies is only possible when the robot is in a region that is common to adjacent controllers. Deep Reinforcement Learning (DRL) is a promising alternative to hand-crafted control design, though it typically requires the full set of test conditions to be known before training. DRL policies can result in complex (often unrealistic) behaviours that have few or no overlapping regions between adjacent policies, making it difficult to switch behaviours. In this work we develop multiple DRL policies with Curriculum Learning (CL), each of which can traverse a single respective terrain condition, while ensuring an overlap between policies. We then train a network for each destination policy that estimates the likelihood of successfully switching from any other policy. We evaluate our switching method on a previously unseen combination of terrain artifacts and show that it performs better than heuristic methods. While our method is trained on individual terrain types, it performs comparably to a Deep Q Network trained on the full set of terrain conditions. This approach allows the development of separate policies in constrained conditions with embedded prior knowledge about each behaviour, that is scalable to any number of behaviours, and prepares DRL methods for applications in the real world.
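
The switching idea in the abstract (a per-destination network that estimates the likelihood of a successful switch) can be illustrated with a minimal sketch. Everything named here is hypothetical and is not taken from the paper: the `SwitchEstimator` type, the `toy_estimator` stand-in for a trained network, the state layout, and the 0.9 threshold are all assumptions chosen only to show the decision rule of switching once the estimated success likelihood is high enough.

```python
from typing import Callable, Optional, Sequence

# Hypothetical type: maps a robot state vector to an estimated
# probability (0..1) that switching into the destination policy succeeds.
SwitchEstimator = Callable[[Sequence[float]], float]


def toy_estimator(state: Sequence[float]) -> float:
    """Stand-in for a trained network (assumption, not the paper's model).

    Treats state[0] as a distance from an assumed overlap region and
    returns a higher value the closer the robot is to that region.
    """
    return 1.0 - min(1.0, abs(state[0]))


def should_switch(estimator: SwitchEstimator,
                  state: Sequence[float],
                  threshold: float = 0.9) -> bool:
    """Return True when the estimated success likelihood meets the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return estimator(state) >= threshold


def first_switch_step(states: Sequence[Sequence[float]],
                      estimator: SwitchEstimator,
                      threshold: float = 0.9) -> Optional[int]:
    """Return the index of the first state at which a switch is triggered.

    Returns None if the estimate never reaches the threshold.
    """
    for t, state in enumerate(states):
        if should_switch(estimator, state, threshold):
            return t
    return None


if __name__ == "__main__":
    # Three hypothetical states approaching the assumed overlap region.
    trajectory = [[0.8], [0.4], [0.05]]
    print(first_switch_step(trajectory, toy_estimator))
```

In this sketch the current policy keeps running until the destination policy's estimator reports a sufficiently high likelihood, which mirrors the abstract's description only at the level of the decision rule; how the estimators are trained and which inputs they receive are described in the paper itself.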

Subjects:	Robotics (cs.RO); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2011.00440 [cs.RO]
	(or arXiv:2011.00440v2 [cs.RO] for this version)
	https://doi.org/10.48550/arXiv.2011.00440
Journal reference:	In proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2021

Submission history

From: Brendan Tidd
[v1] Sun, 1 Nov 2020 06:34:42 UTC (2,693 KB)
[v2] Wed, 29 Sep 2021 13:34:16 UTC (2,710 KB)
