An Improved Finite-time Analysis of Temporal Difference Learning with Deep Neural Networks

Ke, Zhifa; Wen, Zaiwen; Zhang, Junyu

(Submitted on 7 May 2024)

Abstract: Temporal difference (TD) learning algorithms with neural network function parameterization have well-established empirical success in many practical large-scale reinforcement learning tasks. However, theoretical understanding of these algorithms remains challenging due to the nonlinearity of the action-value approximation. In this paper, we develop an improved non-asymptotic analysis of the neural TD method with a general $L$-layer neural network. We develop new proof techniques and derive an improved $\tilde{\mathcal{O}}(\epsilon^{-1})$ sample complexity. To the best of our knowledge, this is the first finite-time analysis of neural TD that achieves an $\tilde{\mathcal{O}}(\epsilon^{-1})$ complexity under Markovian sampling, as opposed to the best known $\tilde{\mathcal{O}}(\epsilon^{-2})$ complexity in the existing literature.
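
For orientation, the kind of update a neural TD method performs can be sketched as below. This is a minimal illustration under stated assumptions, not the algorithm analyzed in the paper: it uses a two-layer ReLU network rather than a general $L$-layer one, a constant step size, and a small finite chain of state-action pairs under a fixed policy. All function names, the transition matrix `P`, and the hyperparameters are hypothetical. Samples come from a single trajectory, which is what Markovian sampling refers to.

```python
import math
import random

def init_params(d, m, rng):
    """Hidden weights ~ N(0, 1); output weights are random signs (assumed init)."""
    W1 = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]
    w2 = [rng.choice([-1.0, 1.0]) for _ in range(m)]
    return W1, w2

def forward(params, x):
    """Return Q(x) and the hidden pre-activations of a two-layer ReLU network."""
    W1, w2 = params
    pre = [sum(w * xi for w, xi in zip(row, x)) for row in W1]
    q = sum(b * max(p, 0.0) for b, p in zip(w2, pre)) / math.sqrt(len(w2))
    return q, pre

def q_grad(params, x):
    """Gradient of Q(x) with respect to (W1, w2)."""
    W1, w2 = params
    scale = 1.0 / math.sqrt(len(w2))
    _, pre = forward(params, x)
    gW1 = [[scale * b * xi if p > 0 else 0.0 for xi in x] for b, p in zip(w2, pre)]
    gw2 = [scale * max(p, 0.0) for p in pre]
    return gW1, gw2

def neural_td(features, P, rewards, gamma=0.9, alpha=0.05, steps=2000, m=32, seed=0):
    """Semi-gradient TD(0) along one trajectory of the chain with transition rows P[i]."""
    rng = random.Random(seed)
    n, d = len(features), len(features[0])
    W1, w2 = init_params(d, m, rng)
    i = rng.randrange(n)
    for _ in range(steps):
        # Markovian sampling: the next index depends on the current one.
        j = rng.choices(range(n), weights=P[i])[0]
        q_i, _ = forward((W1, w2), features[i])
        q_j, _ = forward((W1, w2), features[j])
        delta = rewards[i] + gamma * q_j - q_i  # TD error
        gW1, gw2 = q_grad((W1, w2), features[i])
        # Semi-gradient step: the bootstrapped target is treated as a constant.
        W1 = [[w + alpha * delta * g for w, g in zip(wr, gr)] for wr, gr in zip(W1, gW1)]
        w2 = [b + alpha * delta * g for b, g in zip(w2, gw2)]
        i = j
    return W1, w2
```

A call such as `neural_td(features, P, rewards)` returns the trained weights, and `forward((W1, w2), x)[0]` then gives the approximate action value for feature vector `x`. The sample-complexity bounds quoted in the abstract concern how many such trajectory steps are needed to reach accuracy $\epsilon$; this sketch makes no claim about matching those rates.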

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Optimization and Control (math.OC)
Cite as:	arXiv:2405.04017 [cs.LG]
	(or arXiv:2405.04017v1 [cs.LG] for this version)

Submission history

From: Zhifa Ke
[v1] Tue, 7 May 2024 05:29:55 GMT (395kb,D)
