Fast and Data Efficient Reinforcement Learning from Pixels via Non-Parametric Value Approximation

Long, Alexander; Blair, Alan; van Hoof, Herke

arXiv:2203.03078 (cs)

[Submitted on 7 Mar 2022]

Abstract: We present Nonparametric Approximation of Inter-Trace returns (NAIT), a reinforcement learning algorithm for discrete-action, pixel-based environments that is both highly sample-efficient and computationally efficient. NAIT is a lazy-learning approach whose update is equivalent to episodic Monte Carlo on episode completion, but which allows rewards to be incorporated stably while an episode is ongoing. We use a fixed, domain-agnostic representation, simple distance-based exploration, and a proximity-graph-based lookup to enable extremely fast execution. We empirically evaluate NAIT on both the 26- and 57-game variants of ATARI100k, where, despite its simplicity, it achieves competitive performance in the online setting with a greater than 100x speedup in wall-clock time.
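The abstract's central idea, a lazy learner that stores experience and estimates values from nearby stored points using episodic Monte Carlo returns, can be illustrated with a minimal sketch. This is not the authors' implementation: the names `discounted_returns` and `KNNValueTable` are hypothetical, the brute-force neighbour search stands in for the proximity-graph lookup, embeddings are assumed to come from some fixed encoder that is not shown, and both distance-based exploration and the incorporation of rewards during an ongoing episode are omitted.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Monte Carlo returns G_t = r_t + gamma * G_{t+1}, computed backwards."""
    returns = [0.0] * len(rewards)
    g = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        g = rewards[t] + gamma * g
        returns[t] = g
    return returns


class KNNValueTable:
    """Hypothetical per-action memory of (embedding, return) pairs.

    Values are estimated lazily at query time by averaging the returns
    of the k nearest stored embeddings (brute-force Euclidean search).
    """

    def __init__(self, num_actions, k=3):
        self.k = k
        self.memory = {a: [] for a in range(num_actions)}

    def add_episode(self, embeddings, actions, rewards, gamma=0.99):
        # Episodic Monte Carlo update: store each visited point with its
        # full discounted return once the episode has finished.
        returns = discounted_returns(rewards, gamma)
        for z, a, g in zip(embeddings, actions, returns):
            self.memory[a].append((tuple(z), g))

    def value(self, z, action):
        entries = self.memory[action]
        if not entries:
            return 0.0  # assumption: unseen actions default to zero value
        nearest = sorted(entries, key=lambda e: math.dist(e[0], z))[: self.k]
        return sum(g for _, g in nearest) / len(nearest)

    def greedy_action(self, z):
        return max(self.memory, key=lambda a: self.value(z, a))


table = KNNValueTable(num_actions=2, k=1)
table.add_episode(
    embeddings=[(0.0, 0.0), (1.0, 1.0)],
    actions=[0, 1],
    rewards=[0.0, 1.0],
    gamma=0.5,
)
print(table.value((0.0, 0.0), 0), table.greedy_action((0.0, 0.0)))
```

Because nothing is fitted, adding an episode costs only a few appends; all the work happens at query time, which is why a fast neighbour lookup matters for this family of methods.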

Comments: AAAI 2022
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as: arXiv:2203.03078 [cs.LG] (or arXiv:2203.03078v1 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2203.03078

Submission history

From: Alexander Long
[v1] Mon, 7 Mar 2022 00:31:31 UTC (1,486 KB)
