Improving Sample Complexity Bounds for Actor-Critic Algorithms

Xu, Tengyu; Wang, Zhe; Liang, Yingbin

arXiv:2004.12956v1 (cs)

[Submitted on 27 Apr 2020 (this version), latest version 12 Feb 2021 (v4)]

Abstract: The actor-critic (AC) algorithm is a popular method for finding an optimal policy in reinforcement learning. Finite-sample convergence rates for the AC and natural actor-critic (NAC) algorithms have been established recently, but only under independent and identically distributed (i.i.d.) sampling and with a single-sample update at each iteration. In contrast, this paper characterizes the convergence rate and sample complexity of AC and NAC under Markovian sampling, with mini-batch data at each iteration, and with the actor using a general policy class approximation. We show that the overall sample complexity for mini-batch AC to attain an $\epsilon$-accurate stationary point improves the best known sample complexity of AC by an order of $\mathcal{O}(\frac{1}{\epsilon}\log(\frac{1}{\epsilon}))$. We also show that the overall sample complexity for mini-batch NAC to attain an $\epsilon$-accurate globally optimal point improves the known sample complexity of natural policy gradient (NPG) by $\mathcal{O}(\frac{1}{\epsilon}/\log(\frac{1}{\epsilon}))$. Our study develops several novel techniques for the finite-sample analysis of RL algorithms, including handling the bias error due to mini-batch Markovian sampling and exploiting the self variance reduction property to improve the convergence analysis of NAC.
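To make the setting in the abstract concrete, the following is a minimal sketch of a mini-batch actor-critic loop under Markovian sampling. It is not the authors' algorithm or analysis: the two-state MDP, the tabular softmax actor, the tabular TD(0) critic, the step sizes, and every function and variable name are assumptions chosen for illustration. The point it shows is that each mini-batch is drawn consecutively from one ongoing trajectory (so samples are correlated rather than i.i.d.), and that both the critic and the actor take one averaged update per mini-batch.

```python
import math
import random

# Hypothetical toy MDP (illustration only): P[s][a] is the probability of
# moving to state 1; the reward is 1.0 on entering state 1, otherwise 0.0.
P = [[0.1, 0.9], [0.2, 0.8]]
GAMMA = 0.9


def softmax_policy(theta, s):
    """Action probabilities of a tabular softmax policy in state s."""
    m = max(theta[s])
    exps = [math.exp(x - m) for x in theta[s]]
    z = sum(exps)
    return [e / z for e in exps]


def step(rng, s, a):
    """Sample the next state and reward from the toy MDP."""
    s_next = 1 if rng.random() < P[s][a] else 0
    return s_next, float(s_next)


def minibatch_actor_critic(iterations=200, batch_size=32,
                           alpha=0.5, beta=0.1, seed=0):
    rng = random.Random(seed)
    theta = [[0.0, 0.0], [0.0, 0.0]]  # actor parameters
    v = [0.0, 0.0]                    # critic: tabular value estimates
    s = 0                             # single trajectory, never reset

    for _ in range(iterations):
        # Markovian sampling: the mini-batch is a run of consecutive
        # transitions from the same chain, so its samples are correlated.
        batch = []
        for _ in range(batch_size):
            probs = softmax_policy(theta, s)
            a = 0 if rng.random() < probs[0] else 1
            s_next, r = step(rng, s, a)
            batch.append((s, a, r, s_next))
            s = s_next

        # TD errors under the current critic, one per transition.
        td = [r + GAMMA * v[s2] - v[s1] for (s1, _, r, s2) in batch]

        # Critic: one TD(0) step averaged over the mini-batch.
        dv = [0.0, 0.0]
        for (s1, _, _, _), d in zip(batch, td):
            dv[s1] += d / batch_size
        v = [v[i] + beta * dv[i] for i in range(2)]

        # Actor: averaged score function weighted by the TD error.
        grad = [[0.0, 0.0], [0.0, 0.0]]
        for (s1, a, _, _), d in zip(batch, td):
            probs = softmax_policy(theta, s1)
            for b in range(2):
                indicator = 1.0 if b == a else 0.0
                grad[s1][b] += d * (indicator - probs[b]) / batch_size
        for i in range(2):
            for b in range(2):
                theta[i][b] += alpha * grad[i][b]

    return theta, v
```

Calling `minibatch_actor_critic()` returns the learned actor parameters and value estimates. Averaging over a mini-batch reduces the variance of each update, while the correlation between consecutive samples introduces the kind of bias that the abstract says the analysis must handle.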

Comments:	30 pages, 0 figure
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2004.12956 [cs.LG]
	(or arXiv:2004.12956v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2004.12956

Submission history

From: Tengyu Xu
[v1] Mon, 27 Apr 2020 17:11:06 UTC (123 KB)
[v2] Tue, 28 Apr 2020 15:20:38 UTC (123 KB)
[v3] Wed, 24 Jun 2020 17:23:59 UTC (49 KB)
[v4] Fri, 12 Feb 2021 01:00:43 UTC (49 KB)
