Continuous-time Markov decision processes with nth-bias optimality criteria

Zhang JY; Cao XR

Automatica, Vol.45, No.7, 1628-1638, 2009

DOI: 10.1016/j.automatica.2009.03.009

In this paper, we study the nth-bias optimality problem for finite continuous-time Markov decision processes (MDPs) with a multichain structure. We first provide nth-bias difference formulas for two policies and present some interesting characterizations of an nth-bias optimal policy by using these difference formulas. Then, we prove the existence of an nth-bias optimal policy by using nth-bias optimal policy iteration algorithms, and show that such an nth-bias optimal policy can be obtained in a finite number of policy iterations. (C) 2009 Elsevier Ltd. All rights reserved.

Keywords: Continuous-time systems; Markov decision processes; Multichain model; nth-bias optimality criteria; Policy iteration algorithms; Performance analysis; Sensitivity analysis