Unified ODE Analysis of Smooth Q-Learning Algorithms

Lee, Donghwan

(Submitted on 20 Apr 2024 (v1), last revised 24 Apr 2024 (this version, v2))

Abstract: The convergence of Q-learning has been the focus of extensive research over the past several decades. Recently, an asymptotic convergence analysis for Q-learning was introduced using a switching system framework. This approach applies the so-called ordinary differential equation (ODE) method to prove the convergence of asynchronous Q-learning modeled as a continuous-time switching system, using notions from switching system theory to establish asymptotic stability without explicit Lyapunov arguments. However, proving stability this way requires the underlying switching system to satisfy restrictive conditions, such as quasi-monotonicity, which makes the analysis hard to generalize to other reinforcement learning algorithms, such as smooth Q-learning variants. In this paper, we present a more general and unified convergence analysis that improves upon the switching system approach and can analyze both Q-learning and its smooth variants. The proposed analysis is motivated by previous work on the convergence of synchronous Q-learning that uses the $p$-norm as a Lyapunov function. Unlike that work, the proposed analysis addresses more general ODE models that cover both asynchronous Q-learning and its smooth versions within a simpler framework.
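To make "smooth Q-learning variant" concrete, the following is a minimal sketch of a single asynchronous tabular update in which the hard max over next-state action values can be swapped for a smooth surrogate. The abstract does not say which smoothing is meant; log-sum-exp is assumed here as one common choice, and the names `log_sum_exp`, `q_update`, and the parameter `beta` are hypothetical, not taken from the paper.

```python
import math


def log_sum_exp(values, beta):
    """Smooth surrogate for max(values) (assumed choice, not from the abstract).

    Satisfies max(values) <= result <= max(values) + log(len(values)) / beta,
    so it approaches the hard max as beta grows.
    """
    m = max(values)  # shift by the max for numerical stability
    return m + math.log(sum(math.exp(beta * (v - m)) for v in values)) / beta


def q_update(q, s, a, r, s_next, alpha, gamma, beta=None):
    """One asynchronous update: only the entry q[s][a] changes.

    beta=None uses the hard max (standard Q-learning);
    a positive beta uses the log-sum-exp surrogate instead.
    """
    if beta is None:
        next_value = max(q[s_next])
    else:
        next_value = log_sum_exp(q[s_next], beta)
    q[s][a] += alpha * (r + gamma * next_value - q[s][a])
    return q


if __name__ == "__main__":
    # Hypothetical two-state, two-action table used only for illustration.
    q = [[0.0, 0.0], [1.0, 2.0]]
    q_update(q, s=0, a=1, r=1.0, s_next=1, alpha=0.5, gamma=0.9)
    print(q[0][1])
```

The update touches a single state-action pair per step, which is what "asynchronous" refers to; the ODE analysis described in the abstract concerns the averaged continuous-time behavior of such updates, which this sketch does not attempt to reproduce.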

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2404.14442 [cs.LG]
	(or arXiv:2404.14442v2 [cs.LG] for this version)

Submission history

From: Donghwan Lee
[v1] Sat, 20 Apr 2024 01:16:27 GMT (18kb)
[v2] Wed, 24 Apr 2024 04:22:51 GMT (18kb)
