From Past to Future: Rethinking Eligibility Traces

Gupta, Dhawal; Jordan, Scott M.; Chaudhari, Shreyas; Liu, Bo; Thomas, Philip S.; da Silva, Bruno Castro

arXiv:2312.12972 (cs)

[Submitted on 20 Dec 2023]

Abstract: In this paper, we introduce a fresh perspective on the challenges of credit assignment and policy evaluation. First, we delve into the nuances of eligibility traces and explore instances where their updates may result in unexpected credit assignment to preceding states. From this investigation emerges the concept of a novel value function, which we refer to as the \emph{bidirectional value function}. Unlike traditional state value functions, bidirectional value functions account for both future expected returns (rewards anticipated from the current state onward) and past expected returns (cumulative rewards from the episode's start to the present). We derive principled update equations to learn this value function and, through experimentation, demonstrate its efficacy in enhancing the process of policy evaluation. In particular, our results indicate that the proposed learning approach can, in certain challenging contexts, perform policy evaluation more rapidly than TD($\lambda$) -- a method that learns forward value functions, $v^\pi$, \emph{directly}. Overall, our findings present a new perspective on eligibility traces and potential advantages associated with the novel value function it inspires, especially for policy evaluation.
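
As a rough illustration of the two quantities the abstract contrasts, the sketch below computes, for a single recorded episode, the rewards accumulated before each time step (a sample of the "past" return) and the rewards from that step onward (a sample of the "future" return). It is not the paper's algorithm or its learned value function. The function name, the reward-indexing convention, and the absence of discounting are all assumptions made for this example.

```python
from typing import List, Tuple


def past_and_future_returns(rewards: List[float]) -> List[Tuple[float, float]]:
    """Return (past, future) reward sums for each time step of one episode.

    Assumed convention (hypothetical, not taken from the paper):
    rewards[t] is the reward received after acting at time step t,
    and no discounting is applied.
    """
    total = sum(rewards)
    pairs = []
    past = 0.0  # rewards collected strictly before the current step
    for r in rewards:
        # Without discounting, the future return is whatever is left of the total.
        pairs.append((past, total - past))
        past += r
    return pairs


if __name__ == "__main__":
    for t, (past, future) in enumerate(past_and_future_returns([1.0, 0.0, 2.0])):
        print(f"t={t}: past={past}, future={future}")
```

In this undiscounted setting the two sums at every step add up to the total episode return. That is why a quantity combining both directions carries information about the whole trajectory rather than only its remainder.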

Comments:	Accepted in The 38th Annual AAAI Conference on Artificial Intelligence
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2312.12972 [cs.LG]
	(or arXiv:2312.12972v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2312.12972

Submission history

From: Dhawal Gupta
[v1] Wed, 20 Dec 2023 12:23:30 UTC (18,423 KB)
