IEEE Transactions on Automatic Control, Vol. 66, No. 1, 121-136, 2021
On Passivity, Reinforcement Learning, and Higher Order Learning in Multiagent Finite Games
In this article, we propose a passivity-based methodology for the analysis and design of reinforcement learning dynamics and algorithms in multiagent finite games. Starting from a known, first-order reinforcement learning scheme, we show that convergence to a Nash distribution can be attained in a broader class of games than previously considered in the literature, namely, in games characterized by the monotonicity property of their (negative) payoff vectors. We further exploit passivity techniques to design a class of higher order learning schemes that preserve the convergence properties of their first-order counterparts. Moreover, we show that the higher order schemes improve upon the rate of convergence and can even achieve convergence where the first-order scheme fails. We demonstrate these properties through numerical simulations for several representative games.
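As a rough illustration of what a first-order reinforcement learning scheme of this kind can look like, the sketch below simulates score dynamics in which each player's score vector tracks its payoff vector with exponential discounting, and mixed strategies are obtained through a softmax (logit) choice map. The abstract does not state the scheme's equations, so the specific form used here, the Euler discretization, the matching-pennies payoff matrix `A`, and the parameters `eps`, `dt`, and `steps` are all assumptions made for illustration, not the authors' implementation. Matching pennies is zero-sum, which is a standard example of a game whose negative payoff vector is monotone.

```python
import math

def softmax(z, eps=1.0):
    """Logit choice map: turns a score vector into a mixed strategy."""
    m = max(z)  # subtract the max for numerical stability
    w = [math.exp((v - m) / eps) for v in z]
    s = sum(w)
    return [v / s for v in w]

# Assumed example game: matching pennies (two players, zero-sum).
# A[i][j] is player 1's payoff; player 2 receives -A[i][j].
A = [[1.0, -1.0], [-1.0, 1.0]]

def payoffs(x1, x2):
    """Expected payoff of each pure action against the opponent's mixed strategy."""
    u1 = [sum(A[i][j] * x2[j] for j in range(2)) for i in range(2)]
    u2 = [-sum(A[i][j] * x1[i] for i in range(2)) for j in range(2)]
    return u1, u2

def simulate(z1, z2, dt=0.01, steps=3000, eps=1.0):
    """Forward-Euler integration of the assumed dynamics dz/dt = u(x) - z, x = softmax(z)."""
    z1, z2 = list(z1), list(z2)
    for _ in range(steps):
        x1, x2 = softmax(z1, eps), softmax(z2, eps)
        u1, u2 = payoffs(x1, x2)
        z1 = [z + dt * (u - z) for z, u in zip(z1, u1)]
        z2 = [z + dt * (u - z) for z, u in zip(z2, u2)]
    return softmax(z1, eps), softmax(z2, eps)

# Hypothetical initial scores, chosen away from the rest point.
x1, x2 = simulate([1.0, 0.0], [0.0, 0.5])
print(x1, x2)
```

In this symmetric example the rest point of the assumed dynamics is the uniform mixed strategy for both players, so the printed strategies should settle near equal probabilities on each action.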
Keywords: Games; Convergence; Learning (artificial intelligence); Heuristic algorithms; Sociology; Statistics; Radio frequency; Agents and autonomous systems; game theory; nonlinear systems; passivity; reinforcement learning