Analysing Deep Reinforcement Learning Agents Trained with Domain Randomisation

Dai, Tianhong; Arulkumaran, Kai; Gerbert, Tamara; Tukra, Samyakh; Behbahani, Feryal; Bharath, Anil Anthony

doi:10.1016/j.neucom.2022.04.005

Computer Science > Machine Learning

arXiv:1912.08324 (cs)

[Submitted on 18 Dec 2019 (v1), last revised 17 Feb 2020 (this version, v2)]

Title: Analysing Deep Reinforcement Learning Agents Trained with Domain Randomisation

Authors: Tianhong Dai, Kai Arulkumaran, Tamara Gerbert, Samyakh Tukra, Feryal Behbahani, Anil Anthony Bharath


Abstract: Deep reinforcement learning has the potential to train robots to perform complex tasks in the real world without requiring accurate models of the robot or its environment. A practical approach is to train agents in simulation, and then transfer them to the real world. One popular method for achieving transferability is to use domain randomisation, which involves randomly perturbing various aspects of a simulated environment in order to make trained agents robust to the reality gap. However, less work has gone into understanding such agents, which are deployed in the real world, beyond task performance. In this work we examine such agents through qualitative and quantitative comparisons between agents trained with and without visual domain randomisation. We train agents for Fetch and Jaco robots on a visuomotor control task and evaluate how well they generalise using different testing conditions. Finally, we investigate the internals of the trained agents by using a suite of interpretability techniques. Our results show that the primary outcome of domain randomisation is more robust, entangled representations, accompanied by larger weights with greater spatial structure; moreover, the types of changes are heavily influenced by the task setup and the presence of additional proprioceptive inputs. Additionally, we demonstrate that our domain randomised agents have higher sample complexity, can overfit, and rely more heavily on recurrent processing. Furthermore, even with an improved saliency method introduced in this work, we show that qualitative studies may not always correspond with quantitative measures, necessitating the combination of inspection tools in order to provide sufficient insights into the behaviour of trained agents.
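
As a rough illustration of the visual domain randomisation described in the abstract, the following minimal sketch samples a fresh set of scene visuals for each training episode. It is not the authors' implementation: the `VisualParams` fields, the `sample_visual_params` helper, and all numeric ranges are hypothetical and chosen only for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class VisualParams:
    """Hypothetical visual properties of a simulated scene (illustrative only)."""
    table_rgb: tuple
    light_intensity: float
    camera_offset: tuple


def sample_visual_params(rng: random.Random, randomise: bool = True) -> VisualParams:
    """Return fixed default visuals, or a random perturbation of them."""
    if not randomise:
        # Baseline condition: every episode sees the same scene appearance.
        return VisualParams((0.5, 0.5, 0.5), 1.0, (0.0, 0.0, 0.0))
    # Randomised condition: colours, lighting and camera pose are resampled.
    # The ranges below are assumptions, not values from the paper.
    return VisualParams(
        table_rgb=tuple(rng.random() for _ in range(3)),
        light_intensity=rng.uniform(0.5, 1.5),
        camera_offset=tuple(rng.uniform(-0.05, 0.05) for _ in range(3)),
    )


rng = random.Random(0)
# One set of visuals per training episode.
episodes = [sample_visual_params(rng) for _ in range(3)]
baseline = sample_visual_params(rng, randomise=False)
```

In a setup like this, an agent trained with `randomise=True` never sees the same scene appearance twice, whereas the baseline agent always sees the default visuals; comparing the two conditions mirrors the with/without comparison the abstract describes.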

Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Robotics (cs.RO)
Cite as:	arXiv:1912.08324 [cs.LG]
	(or arXiv:1912.08324v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1912.08324
Related DOI:	https://doi.org/10.1016/j.neucom.2022.04.005

Submission history

From: Kai Arulkumaran
[v1] Wed, 18 Dec 2019 00:18:17 UTC (10,492 KB)
[v2] Mon, 17 Feb 2020 20:53:30 UTC (13,691 KB)
