Q-learning for POMDP: An application to learning locomotion gaits

Wang, Tixian; Taghvaei, Amirhossein; Mehta, Prashant G.



(Submitted on 30 Sep 2019)

Abstract: This paper presents a Q-learning framework for learning optimal locomotion gaits in robotic systems modeled as coupled rigid bodies. Inspired by the prevalence of periodic gaits in bio-locomotion, an open-loop periodic input is assumed to (say) affect a nominal gait. The learning problem is to learn a new (modified) gait using only partial, noisy measurements of the state. The objective of learning is to maximize a given reward, modeled as an objective function in optimal control settings. The proposed control architecture has three main components: (i) phase modeling of the dynamics by a single phase variable; (ii) a coupled oscillator feedback particle filter to represent the posterior distribution of the phase conditioned on the sensory measurements; and (iii) a Q-learning algorithm to learn the approximate optimal control law. The architecture is illustrated with the aid of a planar two-body system. The performance of the learning is demonstrated in a simulation environment.
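
To make the Q-learning component concrete, here is a minimal tabular sketch in Python. It is not the paper's algorithm. It assumes the phase is observed directly, so the feedback particle filter is omitted, and it uses a lookup table over discretized phase bins where the paper refers to an approximate control law. The dynamics, reward, constants, and function names (`phase_bin`, `step`, `q_update`, `train`) are all hypothetical and chosen only for illustration.

```python
import math
import random

# All constants below are illustrative assumptions, not values from the paper.
N_BINS = 16                  # number of bins for the phase in [0, 2*pi)
ACTIONS = (-1.0, 0.0, 1.0)   # hypothetical set of control inputs
OMEGA = 1.0                  # assumed nominal phase frequency
DT = 0.1                     # time step
GAMMA = 0.95                 # discount factor
ALPHA = 0.1                  # learning rate
EPSILON = 0.1                # exploration probability


def phase_bin(theta):
    """Map a phase angle to a discrete bin index in 0..N_BINS-1."""
    frac = (theta % (2 * math.pi)) / (2 * math.pi)
    return int(frac * N_BINS) % N_BINS


def step(theta, u):
    """Toy phase dynamics and reward (both are assumptions for this sketch)."""
    theta_next = (theta + (OMEGA + 0.5 * u * math.sin(theta)) * DT) % (2 * math.pi)
    reward = u * math.sin(theta) - 0.1 * u * u
    return theta_next, reward


def q_update(q, s, a, r, s_next):
    """Standard Q-learning update: move Q(s, a) toward the TD target."""
    td_target = r + GAMMA * max(q[s_next])
    q[s][a] += ALPHA * (td_target - q[s][a])


def train(n_steps=20000, seed=0):
    """Run epsilon-greedy Q-learning on the toy phase model; return the Q-table."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_BINS)]
    theta = 0.0
    for _ in range(n_steps):
        s = phase_bin(theta)
        if rng.random() < EPSILON:
            a = rng.randrange(len(ACTIONS))                       # explore
        else:
            a = max(range(len(ACTIONS)), key=lambda i: q[s][i])   # exploit
        theta, r = step(theta, ACTIONS[a])
        q_update(q, s, a, r, phase_bin(theta))
    return q


if __name__ == "__main__":
    q_table = train()
    greedy = [ACTIONS[max(range(len(ACTIONS)), key=lambda i: row[i])] for row in q_table]
    print(greedy)  # greedy control input for each phase bin
```

In the partially observed setting described in the abstract, the state fed to the learner would be derived from the filter's posterior over the phase, not from the true phase as assumed here.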

Comments:	8 pages, 6 figures, 58th IEEE Conference on Decision and Control
Subjects:	Systems and Control (eess.SY)
Cite as:	arXiv:1910.00107 [eess.SY]
	(or arXiv:1910.00107v1 [eess.SY] for this version)

Submission history

From: Tixian Wang
[v1] Mon, 30 Sep 2019 21:06:46 GMT (1468kb,D)
