Industrial & Engineering Chemistry Research, Vol. 59, No. 40, pp. 17987-17999, 2020
Q-Learning-Based Model Predictive Control for Nonlinear Continuous-Time Systems
In this paper, a Q-learning-based model predictive control scheme using the Lyapunov technique (Q-LMPC) is proposed for a class of continuous-time nonlinear systems with complicated dynamics whose accurate mathematical models are difficult or costly to obtain. The proposed method learns from data an approximate control policy of Lyapunov-based model predictive control (MPC); the learned policy is represented by a neural network, called the actor network, which keeps the online evaluation of the control law computationally efficient regardless of the complexity of the system dynamics. Within the proposed MPC framework, a finite-horizon iterative reinforcement learning (RL) algorithm is developed to obtain closed-loop optimal or suboptimal solutions of a nonlinear optimization problem with Lyapunov constraints during the idle time of the controller. Meanwhile, the critic and actor neural networks used to implement the MPC are updated iteratively according to these solutions. The convergence of the iterative Q-LMPC method and the Lyapunov stability of the closed-loop control system are analyzed, and simulation results demonstrate the effectiveness of the proposed Q-LMPC method.
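The abstract does not give implementation details, so the following is only a minimal, hypothetical sketch of the kind of actor-critic Q-learning loop with a Lyapunov decrease check that the described Q-LMPC scheme suggests. The plant f, Lyapunov candidate V, feature map phi, fallback stabilizing law, and all gains below are illustrative assumptions introduced for this sketch, not the paper's actual formulation.

```python
# Illustrative sketch only: not the paper's Q-LMPC algorithm. The dynamics f,
# Lyapunov function V, critic features phi, and actor parameterization are all
# hypothetical stand-ins for a generic actor-critic loop with a Lyapunov check.
import numpy as np

rng = np.random.default_rng(0)

def f(x, u, dt=0.05):
    """Hypothetical continuous-time nonlinear plant, Euler-discretized."""
    return x + dt * np.array([x[1], -np.sin(x[0]) - 0.5 * x[1] + u])

def stage_cost(x, u):
    return x @ x + 0.1 * u**2

def V(x):
    """Candidate Lyapunov function (assumed given, as in Lyapunov-based MPC)."""
    return x @ x

def phi(x, u):
    """Quadratic features for the critic Q(x, u): upper triangle of zz^T + bias."""
    z = np.array([x[0], x[1], u])
    return np.concatenate([np.outer(z, z)[np.triu_indices(3)], [1.0]])

w = np.zeros(7)                    # critic weights
K = np.zeros(2)                    # linear actor u = K @ x (actor-network stand-in)
gamma, alpha, N = 0.98, 1e-3, 20   # discount, step size, prediction horizon

for it in range(200):              # iterative learning in the controller's idle time
    x = rng.uniform(-1, 1, 2)
    for k in range(N):             # finite-horizon rollout
        u = K @ x + 0.1 * rng.standard_normal()   # exploratory control input
        xn = f(x, u)
        # Lyapunov (contractive) constraint: reject inputs that increase V
        if V(xn) >= V(x):
            u = -1.5 * x[0] - 1.0 * x[1]          # fall back to a stabilizing law
            xn = f(x, u)
        # Critic update: one-step Q-learning on the Bellman residual
        target = stage_cost(x, u) + gamma * (phi(xn, K @ xn) @ w)
        w += alpha * (target - phi(x, u) @ w) * phi(x, u)
        # Actor update: descend a finite-difference estimate of dQ/du
        eps = 1e-2
        dQdu = (phi(x, u + eps) @ w - phi(x, u - eps) @ w) / (2 * eps)
        K -= alpha * dQdu * x
        x = xn
```

In this toy version the critic and actor are linear-in-parameters for brevity; the paper's method uses neural networks for both, and the Lyapunov constraint appears inside the MPC optimization rather than as a simple accept/reject test.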