SIAM Journal on Control and Optimization, Vol. 48, No. 3, 1405-1421, 2009
ASYMPTOTICALLY OPTIMAL STRATEGIES FOR ADAPTIVE ZERO-SUM DISCOUNTED MARKOV GAMES
We consider a class of discrete-time two-person zero-sum Markov games with Borel state and action spaces and possibly unbounded payoffs. The game evolves according to the recursive equation x(n+1) = F(x(n), a(n), b(n), xi(n)), n = 0, 1, ..., where the disturbance process {xi(n)} consists of independent and identically distributed R^k-valued random vectors that are observable but whose common density rho is unknown to both players. Under certain continuity and compactness conditions, we combine a nonstationary iteration procedure with suitable density estimation methods to construct asymptotically discounted optimal strategies for both players.
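As a rough illustration of the idea of combining a nonstationary iteration with an estimate of the unknown disturbance law, the following Python sketch works on a toy finite model. It is not the construction of the paper. Everything in it is an assumption made for illustration: the five-point state grid, the two actions per player, the dynamics F, the payoff r, the discount factor ALPHA, and the use of empirical frequencies on a three-point support in place of a density estimator on R^k. At each stage one new disturbance is observed, the estimate is updated, and one Shapley-type value iteration step is applied under the current estimate.

```python
import random

N_STATES = 5          # hypothetical finite state grid {0, ..., 4}
ACTIONS = (0, 1)      # two actions per player (assumption)
ALPHA = 0.5           # discount factor (assumption)
SUPPORT = (-1, 0, 1)  # disturbance values, a discrete stand-in for R^k


def F(x, a, b, xi):
    """Hypothetical dynamics x(n+1) = F(x(n), a(n), b(n), xi(n)), clipped to the grid."""
    return min(max(x + a - b + xi, 0), N_STATES - 1)


def r(x, a, b):
    """Hypothetical nonnegative stage payoff paid by player 2 to player 1."""
    return float(x + abs(a - b))


def matrix_game_value(m):
    """Value of a 2x2 zero-sum matrix game in which the row player maximizes."""
    (p, q), (s, t) = m
    lower = max(min(p, q), min(s, t))  # best pure guarantee for the row player
    upper = min(max(p, s), max(q, t))  # best pure guarantee for the column player
    if lower == upper:                 # saddle point in pure strategies
        return lower
    # No pure saddle point: closed-form value in mixed strategies.
    return (p * t - q * s) / ((p - q) + (t - s))


def shapley_step(v, pmf):
    """One value iteration step, with expectations taken under the estimate pmf."""
    new_v = []
    for x in range(N_STATES):
        m = [[r(x, a, b) + ALPHA * sum(pmf[s] * v[F(x, a, b, s)] for s in SUPPORT)
              for b in ACTIONS] for a in ACTIONS]
        new_v.append(matrix_game_value(m))
    return new_v


def adaptive_iteration(n_stages, true_weights, seed=0):
    """Nonstationary iteration: the estimate of the disturbance law changes each stage."""
    rng = random.Random(seed)
    counts = {s: 0 for s in SUPPORT}
    v = [0.0] * N_STATES
    for n in range(1, n_stages + 1):
        xi = rng.choices(SUPPORT, weights=true_weights)[0]  # observe xi(n)
        counts[xi] += 1
        pmf = {s: counts[s] / n for s in SUPPORT}           # empirical estimate
        v = shapley_step(v, pmf)
    return v


if __name__ == "__main__":
    print(adaptive_iteration(200, (0.2, 0.5, 0.3)))
```

The sketch only tracks value functions. Strategies would be read off from the optimal mixed actions of the matrix game solved at each state and stage, which the closed-form value above does not return.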