Multi-Armed Restless Bandits, Index Policies and Dynamic Priority Allocation

publication date

  • January 2010

start page

  • 124

end page

  • 133

issue

  • 2

volume

  • 26

International Standard Serial Number (ISSN)

  • 1699-8871

abstract

  • This paper presents a brief introduction to the emerging research field of multi-armed restless bandits (MARBs), which substantially extend the modeling power of classic multi-armed bandits. MARBs are Markov decision process models for optimal dynamic priority allocation to a collection of stochastic binary-action (active/passive) projects evolving over time. Interest in MARBs has grown steadily, spurred by the breadth of their possible applications. Although MARBs are generally intractable, a Lagrangian relaxation and decomposition approach yields a unifying design principle for heuristic priority-index policies, which are often tractable and nearly optimal, along with an upper bound on the optimal reward.
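
To make the priority-index design concrete, here is a minimal Python sketch of how such a policy operates, under illustrative assumptions not taken from the paper: each project is a small finite-state Markov chain with a passive and an active mode, each project carries a state-dependent priority index, and at every decision epoch the scheduler activates the projects whose current indices are highest. The transition kernels, the rewards, and the index table `nu` are hypothetical stand-ins; a genuine Whittle-style index would come from solving each project's Lagrangian subproblem, which this sketch does not attempt.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical problem sizes: N projects, budget of M active per step,
    # S states per project, horizon T. None of this is from the paper.
    N, M, S, T = 5, 2, 3, 50

    def random_kernel():
        """Random row-stochastic S x S transition matrix (illustration only)."""
        P = rng.random((S, S))
        return P / P.sum(axis=1, keepdims=True)

    # P[n][a] is project n's kernel under action a (0 = passive, 1 = active).
    P = [[random_kernel(), random_kernel()] for _ in range(N)]

    # r[n][a][s]: one-step reward; active rewards drawn larger than passive.
    r = [np.stack([rng.random(S), 1.0 + rng.random(S)]) for _ in range(N)]

    # nu[n][s]: the priority index of project n in state s. Here a crude
    # stand-in (the active/passive reward gap), used only to show how the
    # policy consumes an index table once one has been computed.
    nu = [r_n[1] - r_n[0] for r_n in r]

    state = rng.integers(S, size=N)   # each project's current state
    total = 0.0
    for t in range(T):
        # Decomposition at work: each index depends only on its own
        # project's state, so the joint decision is a simple ranking.
        idx = np.array([nu[n][state[n]] for n in range(N)])
        active = set(np.argsort(idx)[-M:])    # activate the top-M projects
        for n in range(N):
            a = 1 if n in active else 0
            total += r[n][a][state[n]]
            state[n] = rng.choice(S, p=P[n][a][state[n]])

    print(f"total reward over {T} steps: {total:.2f}")

The point the sketch makes is the decomposition the abstract refers to: the coupled N-project decision collapses to a per-step ranking of scalar, per-project indices, which is why priority-index policies remain tractable even when the joint MARB problem is not.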