SIAM Journal on Control and Optimization, Vol. 52, No. 6, pp. 3935-3966, 2014
RISK-AVERSE CONTROL OF UNDISCOUNTED TRANSIENT MARKOV MODELS
We use Markov risk measures to formulate a risk-averse version of the undiscounted total cost problem for a transient controlled Markov process. Using the new concept of a multikernel, we derive conditions for a system to be risk transient, that is, to have finite risk over an infinite time horizon. We derive risk-averse dynamic programming equations satisfied by the optimal policy and we describe methods for solving these equations. We illustrate the results on an optimal stopping problem and an organ transplantation problem.
Keywords: dynamic risk measures; Markov risk measures; multikernels; stochastic shortest path; optimal stopping; randomized policy
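
The risk-averse dynamic programming equations mentioned in the abstract can be illustrated with a small value-iteration sketch. This is not the paper's algorithm or example: it assumes a first-order mean-upper-semideviation as the one-step risk mapping, and the two-state stopping problem, the function names, and all numbers are hypothetical and chosen only for illustration. The fixed point computed is v(x) = min over u of { c(x,u) + sigma(Q(x,u), v) }, with a cost-free absorbing terminal state.

```python
from typing import List


def mean_semideviation(probs: List[float], values: List[float], kappa: float) -> float:
    """Assumed one-step risk mapping: E[v] + kappa * E[(v - E[v])_+], 0 <= kappa <= 1."""
    mean = sum(p * v for p, v in zip(probs, values))
    upper = sum(p * max(v - mean, 0.0) for p, v in zip(probs, values))
    return mean + kappa * upper


def risk_averse_value_iteration(
    costs: List[List[float]],
    kernels: List[List[List[float]]],
    kappa: float,
    tol: float = 1e-10,
    max_iter: int = 10_000,
) -> List[float]:
    """Iterate v(x) <- min_u { c(x,u) + sigma(Q(x,u), v) } on the transient states.

    costs[x][u]   : immediate cost of control u in transient state x
    kernels[x][u] : probabilities over states 0..n, where index n is the
                    absorbing, cost-free terminal state (value fixed at 0)
    """
    n = len(costs)
    v = [0.0] * (n + 1)
    for _ in range(max_iter):
        new_v = v[:]
        for x in range(n):
            new_v[x] = min(
                costs[x][u] + mean_semideviation(kernels[x][u], v, kappa)
                for u in range(len(costs[x]))
            )
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


# Hypothetical stopping problem: control 0 = stop, control 1 = continue.
# Stopping costs 5 in state 0 and 1 in state 1; continuing costs 1 and
# moves to state 0 or state 1 with probability 1/2 each.
costs = [[5.0, 1.0], [1.0, 1.0]]
kernels = [
    [[0.0, 0.0, 1.0], [0.5, 0.5, 0.0]],
    [[0.0, 0.0, 1.0], [0.5, 0.5, 0.0]],
]

v_neutral = risk_averse_value_iteration(costs, kernels, kappa=0.0)
v_averse = risk_averse_value_iteration(costs, kernels, kappa=0.5)
print(v_neutral, v_averse)
```

In this toy instance the value of state 0 rises from about 3 in the risk-neutral case (kappa = 0) to about 11/3 with kappa = 0.5, since the risk mapping penalizes the chance of returning to the expensive state. Convergence of the iteration here relies on the toy model being transient under every policy; the general conditions are the subject of the paper itself.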