Computer Science > Machine Learning
Title: Unified ODE Analysis of Smooth Q-Learning Algorithms
(Submitted on 20 Apr 2024 (v1), last revised 24 Apr 2024 (this version, v2))
Abstract: Convergence of Q-learning has been the focus of extensive research over the past several decades. Recently, an asymptotic convergence analysis for Q-learning was introduced using a switching system framework. This approach applies the so-called ordinary differential equation (ODE) approach to prove the convergence of asynchronous Q-learning modeled as a continuous-time switching system, where notions from switching system theory are used to prove its asymptotic stability without explicit Lyapunov arguments. However, proving stability requires the underlying switching system to satisfy restrictive conditions, such as quasi-monotonicity, which makes it hard to generalize the analysis to other reinforcement learning algorithms, such as the smooth Q-learning variants. In this paper, we present a more general and unified convergence analysis that improves upon the switching system approach and can analyze Q-learning and its smooth variants. The proposed analysis is motivated by previous work on the convergence of synchronous Q-learning based on the $p$-norm serving as a Lyapunov function. However, the proposed analysis addresses more general ODE models that can cover both asynchronous Q-learning and its smooth versions within a simpler framework.
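For readers unfamiliar with the term, the following is a minimal sketch of what one smooth Q-learning variant might look like. It is an illustration only: the abstract does not say which smooth operators the paper covers, so the log-sum-exp operator used here, the function names `smooth_max` and `smooth_q_learning`, the uniform random sampling of state-action pairs, and the 1/n step size are all assumptions made for this example.

```python
import numpy as np


def smooth_max(q, beta):
    """Log-sum-exp approximation of max(q) (assumed smooth operator).

    Satisfies max(q) <= smooth_max(q) <= max(q) + log(len(q)) / beta.
    """
    m = np.max(q)
    # Subtract the maximum before exponentiating for numerical stability.
    return m + np.log(np.sum(np.exp(beta * (q - m)))) / beta


def smooth_q_learning(P, R, gamma=0.9, beta=20.0, steps=20000, seed=0):
    """Asynchronous tabular Q-learning with a smoothed max in the target.

    P: transition probabilities, shape (n_states, n_actions, n_states).
    R: expected rewards, shape (n_states, n_actions).
    """
    rng = np.random.default_rng(seed)
    n_states, n_actions = R.shape
    Q = np.zeros((n_states, n_actions))
    counts = np.zeros((n_states, n_actions))
    for _ in range(steps):
        # Asynchronous update: only one (state, action) entry changes per step.
        s = rng.integers(n_states)
        a = rng.integers(n_actions)
        s_next = rng.choice(n_states, p=P[s, a])
        counts[s, a] += 1
        alpha = 1.0 / counts[s, a]  # diminishing step size (assumption)
        target = R[s, a] + gamma * smooth_max(Q[s_next], beta)
        Q[s, a] += alpha * (target - Q[s, a])
    return Q
```

Replacing `smooth_max` with `np.max` gives the standard Q-learning update, which is why a single analysis covering both cases is plausible; larger values of `beta` bring the smoothed target closer to the hard maximum.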
Submission history
From: Donghwan Lee
[v1] Sat, 20 Apr 2024 01:16:27 GMT (18kb)
[v2] Wed, 24 Apr 2024 04:22:51 GMT (18kb)