Automatica, Vol.39, No.11, 1947-1955, 2003
Optimal hysteresis for a class of deterministic deteriorating two-armed Bandit problem with switching costs
We derive the optimal policy for the dynamic scheduling of a class of deterministic, deteriorating, continuous-time, continuous-state two-armed Bandit problems with switching costs. Because of the switching costs, the scheduling policy exhibits a hysteretic character. Using this exactly solvable class of models, we are able to explicitly observe the performance of a sub-optimal policy derived from a set of generalized priority indices (generalized Gittins indices) similar to those first introduced by Asawa and Teneketzis (IEEE Trans. Automat. Control 41 (1996) 328). (C) 2003 Elsevier Ltd. All rights reserved.
Keywords: multi-armed Bandit process; switching costs; optimal switching curves; hysteretic policy; priority index policy