Rate of Convergence and Error Bounds for LSTD($\lambda$)

Tagorti, Manel; Scherrer, Bruno

arXiv:1405.3229 (cs)

[Submitted on 13 May 2014]

Authors:Manel Tagorti (INRIA Nancy - Grand Est / LORIA), Bruno Scherrer (INRIA Nancy - Grand Est / LORIA)

Abstract: We consider LSTD($\lambda$), the least-squares temporal-difference algorithm with eligibility traces proposed by Boyan (2002). It computes a linear approximation of the value function of a fixed policy in a large Markov Decision Process. Under a $\beta$-mixing assumption, we derive, for any value of $\lambda \in (0,1)$, a high-probability estimate of the rate of convergence of this algorithm to its limit. From this we deduce a high-probability bound on the error of the algorithm, which extends (and slightly improves) the bound derived by Lazaric et al. (2012) in the specific case where $\lambda=0$. In particular, our analysis sheds some light on how to choose $\lambda$ as a function of the quality of the chosen linear space and the number of samples, in a way that agrees with simulations.
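For readers unfamiliar with the algorithm named in the abstract, the following is a minimal sketch of LSTD($\lambda$) as it is commonly presented, not the authors' code. It accumulates a matrix `A` and a vector `b` along a single trajectory using an eligibility trace `z`, then solves `A theta = b` for the weights of the linear value estimate. The function names (`solve`, `lstd_lambda`), the two-state deterministic chain, and the tabular features are all assumptions made for illustration; the paper itself studies the general case under a $\beta$-mixing assumption.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def lstd_lambda(features, rewards, gamma, lam):
    """Illustrative LSTD(lambda) on one trajectory.

    features: feature vectors phi(x_0), ..., phi(x_n) (n + 1 entries)
    rewards:  rewards r_0, ..., r_{n-1} (n entries)
    Returns theta such that phi(x) . theta approximates the value of x.
    """
    d = len(features[0])
    A = [[0.0] * d for _ in range(d)]
    b = [0.0] * d
    z = list(features[0])  # eligibility trace, z_0 = phi(x_0)
    for t, r in enumerate(rewards):
        phi, phi_next = features[t], features[t + 1]
        diff = [phi[j] - gamma * phi_next[j] for j in range(d)]
        for i in range(d):
            b[i] += z[i] * r
            for j in range(d):
                A[i][j] += z[i] * diff[j]
        # Decay the trace and add the features of the next state.
        z = [lam * gamma * z[i] + phi_next[i] for i in range(d)]
    return solve(A, b)


# Hypothetical two-state chain 0 -> 1 -> 0 -> ... with reward 1 on leaving state 0.
states = [0, 1, 0, 1, 0]
features = [[1.0, 0.0] if s == 0 else [0.0, 1.0] for s in states]
rewards = [1.0, 0.0, 1.0, 0.0]
theta = lstd_lambda(features, rewards, gamma=0.5, lam=0.5)
```

In this toy chain the transitions are deterministic and the features are tabular, so the estimate matches the true values $V(0)=4/3$ and $V(1)=2/3$ up to floating-point error for any $\lambda$. With stochastic transitions or a restricted feature space, the estimate depends on $\lambda$ and on the number of samples, which is the trade-off the paper analyzes.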

Comments:	(2014)
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Optimization and Control (math.OC); Statistics Theory (math.ST)
Cite as:	arXiv:1405.3229 [cs.LG]
	(or arXiv:1405.3229v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1405.3229

Submission history

From: Bruno Scherrer [via CCSD proxy]
[v1] Tue, 13 May 2014 16:51:54 UTC (72 KB)
