Adaptive action supervision in reinforcement learning from real-world multi-agent demonstrations

Fujii, Keisuke; Tsutsui, Kazushi; Scott, Atom; Nakahara, Hiroshi; Takeishi, Naoya; Kawahara, Yoshinobu

arXiv:2305.13030 (cs)

[Submitted on 22 May 2023 (v1), last revised 19 Dec 2023 (this version, v4)]


Abstract: Modeling real-world biological multi-agents is a fundamental problem in various scientific and engineering fields. Reinforcement learning (RL) is a powerful framework for generating flexible and diverse behaviors in cyberspace; however, when modeling real-world biological multi-agents, there is a domain gap between behaviors in the source (i.e., real-world data) and the target (i.e., cyberspace for RL), and the source environment parameters are usually unknown. In this paper, we propose a method for adaptive action supervision in RL from real-world demonstrations in multi-agent scenarios. We combine RL and supervised learning by selecting demonstration actions during RL based on the minimum dynamic time warping distance, which lets us utilize information about the unknown source dynamics. This approach can be easily applied to many existing neural network architectures and provides an RL model that balances reproducibility as imitation with the generalization ability to obtain rewards in cyberspace. In experiments on chase-and-escape and football tasks with different dynamics between the unknown source and target environments, we show that our approach achieves a balance between reproducibility and generalization ability compared with the baselines. In particular, we use tracking data of professional football players as expert demonstrations in football and show successful performance despite a larger gap between behaviors in the source and target environments than in the chase-and-escape task.
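The selection step named in the abstract (choosing demonstration actions by minimum dynamic time warping distance) can be illustrated with a minimal sketch. The abstract does not say which quantities are compared, which local cost is used, or how the selected actions enter the training loss, so everything below is an assumption: trajectories are lists of 2-D state points, the local cost is Euclidean, and the names `dtw_distance`, `select_demonstration`, `demos`, and `rollout` are hypothetical and not taken from the paper's code.

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two trajectories.

    a and b are sequences of equal-dimension points; the local cost is the
    Euclidean distance (an assumption, not stated in the abstract).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def select_demonstration(rollout_states, demos):
    """Return (index, distances) for the demonstration closest to the rollout.

    Each demo is a dict with hypothetical keys "states" and "actions"; the
    actions of the selected demo would serve as the supervision targets.
    """
    distances = [dtw_distance(rollout_states, demo["states"]) for demo in demos]
    best = min(range(len(demos)), key=distances.__getitem__)
    return best, distances


# Toy data: one demo moves along the x-axis, the other along the y-axis.
demos = [
    {"states": [(0, 0), (1, 0), (2, 0)], "actions": [0, 0, 0]},
    {"states": [(0, 0), (0, 1), (0, 2)], "actions": [1, 1, 1]},
]
rollout = [(0, 0), (0.5, 0), (1, 0), (2, 0)]  # agent trajectory in the RL environment

best, distances = select_demonstration(rollout, demos)
supervised_actions = demos[best]["actions"]
```

Because dynamic time warping aligns sequences of different lengths and speeds, the four-step rollout can still be matched against three-step demonstrations; in this toy case the x-axis demonstration is selected.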

Comments:	14 pages, 5 figures, accepted at ICAART 2024 (oral)
Subjects:	Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2305.13030 [cs.AI]
	(or arXiv:2305.13030v4 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2305.13030

Submission history

From: Keisuke Fujii
[v1] Mon, 22 May 2023 13:33:37 UTC (7,257 KB)
[v2] Sat, 27 May 2023 01:56:14 UTC (7,257 KB)
[v3] Fri, 15 Dec 2023 09:20:57 UTC (7,257 KB)
[v4] Tue, 19 Dec 2023 13:29:33 UTC (7,256 KB)
