Finite-Time Analysis of Temporal Difference Learning with Experience Replay

Lim, Han-Dong; Lee, Donghwan

arXiv:2306.09746 (cs)

[Submitted on 16 Jun 2023 (v1), last revised 15 Apr 2025 (this version, v2)]

Abstract: Temporal-difference (TD) learning is widely regarded as one of the most popular algorithms in reinforcement learning (RL). Despite its widespread use, researchers have only recently begun to actively study its finite-time behavior, including finite-time bounds on the mean squared error and sample complexity. On the empirical side, experience replay has been a key ingredient in the success of deep RL algorithms, but its theoretical effects on RL have yet to be fully understood. In this paper, we present a simple decomposition of the Markovian noise terms and provide finite-time error bounds for TD-learning with experience replay. Specifically, under the Markovian observation model, we demonstrate that for both the averaged-iterate and final-iterate cases, the error term induced by a constant step-size can be effectively controlled by the size of the replay buffer and the size of the mini-batch sampled from it.
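
For readers unfamiliar with the setting, the following is a minimal sketch of TD(0) with linear function approximation, a fixed-size replay buffer, and a constant step-size. It is not the authors' code: the function names (`td_with_replay`, `true_values`), the three-state toy chain, the one-hot features, and the choice of uniform sampling with replacement are all assumptions made for illustration only.

```python
import random
from collections import deque

# Hypothetical 3-state Markov reward process (not taken from the paper).
P = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]  # transition matrix
R = [0.0, 0.0, 1.0]                                         # reward per state
PHI = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # one-hot features


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def true_values(P, R, gamma, iters=2000):
    """Reference solution: iterate V <- R + gamma * P V to its fixed point."""
    v = [0.0] * len(R)
    for _ in range(iters):
        v = [R[s] + gamma * dot(P[s], v) for s in range(len(R))]
    return v


def td_with_replay(P, R, phi, gamma=0.9, alpha=0.05,
                   buffer_size=200, batch_size=16, steps=20000, seed=0):
    """TD(0) with a replay buffer and a constant step-size alpha.

    Transitions come from one Markovian trajectory. Each step stores the
    newest transition, then averages the TD update over a mini-batch drawn
    uniformly with replacement from the buffer (an assumed sampling scheme).
    """
    rng = random.Random(seed)
    n_states = len(P)
    theta = [0.0] * len(phi[0])
    buffer = deque(maxlen=buffer_size)  # oldest transitions are evicted
    s = 0
    for _ in range(steps):
        s_next = rng.choices(range(n_states), weights=P[s])[0]
        buffer.append((s, R[s], s_next))
        s = s_next

        grad = [0.0] * len(theta)
        for _ in range(batch_size):
            si, ri, sj = rng.choice(buffer)
            # TD error for the sampled transition under the current theta.
            delta = ri + gamma * dot(phi[sj], theta) - dot(phi[si], theta)
            for k, f in enumerate(phi[si]):
                grad[k] += delta * f
        theta = [t + alpha * g / batch_size for t, g in zip(theta, grad)]
    return theta


if __name__ == "__main__":
    print("TD estimate:", td_with_replay(P, R, PHI))
    print("reference  :", true_values(P, R, 0.9))
```

Varying `buffer_size` and `batch_size` in this sketch while holding `alpha` fixed gives an informal way to observe the kind of effect the abstract describes; it does not reproduce the paper's bounds or experiments.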

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2306.09746 [cs.LG]
	(or arXiv:2306.09746v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2306.09746

Submission history

From: Han-Dong Lim
[v1] Fri, 16 Jun 2023 10:25:43 UTC (185 KB)
[v2] Tue, 15 Apr 2025 04:59:42 UTC (199 KB)
