Multi-Armed Restless Bandits, Index Policies and Dynamic Priority Allocation
Overview
published in
- Boletín de Estadística e Investigación Operativa (BEIO)
publication date
- January 2010
start page
- 124
end page
- 133
issue
- 2
volume
- 26
International Standard Serial Number (ISSN)
- 1699-8871
abstract
- This paper presents a brief introduction to the emerging research field of multi-armed restless bandits (MARBs), which substantially extend the modeling power of classic multi-armed bandits. MARBs are Markov decision process models for optimal dynamic priority allocation to a collection of stochastic binary-action (active/passive) projects evolving over time. Interest in MARBs has grown steadily, spurred by the breadth of their possible applications. Although MARBs are generally intractable, a Lagrangian relaxation and decomposition approach yields a unifying design principle for heuristic priority-index policies, which are often tractable and nearly optimal, along with an upper bound on the optimal reward.
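To make the index-policy idea concrete, the following Python sketch (illustrative only, not taken from the paper) simulates a toy MARB of two-state projects and, at each step, activates the M projects whose current state carries the highest priority index. The transition matrices, rewards, and the myopic stand-in index are all assumptions for the sake of the example; a genuine Whittle-style index would be obtained by solving each project's single-project subproblem arising from the Lagrangian relaxation, which replaces the hard per-step constraint "activate exactly M of N projects" with its time-averaged counterpart.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy MARB: N two-state projects (state 0 = "bad", 1 = "good").
# Each project has separate transition matrices for the active and
# passive actions, and earns a reward only while active.
N, M, T = 5, 2, 1000                # projects, active slots per step, horizon
P_active  = np.array([[0.7, 0.3],   # P[s, s'] under the active action
                      [0.4, 0.6]])
P_passive = np.array([[0.9, 0.1],   # P[s, s'] under the passive action
                      [0.2, 0.8]])
reward = np.array([0.0, 1.0])       # active reward in states 0 and 1

# Hypothetical index table: index[s] is the priority of a project in
# state s. Here we use the myopic active reward as a stand-in; a true
# Whittle index would come from the Lagrangian subproblems.
index = reward.copy()

def priority_policy(states, m):
    """Activate the m projects whose current states have the highest index."""
    order = np.argsort(-index[states])
    return order[:m]

states = rng.integers(0, 2, size=N)
total = 0.0
for _ in range(T):
    active = priority_policy(states, M)
    total += reward[states[active]].sum()
    # Each project evolves under the transition law of its chosen action.
    for i in range(N):
        P = P_active if i in active else P_passive
        states[i] = rng.choice(2, p=P[states[i]])

print(f"average reward per step under the index policy: {total / T:.3f}")
```

In a real application the table `index` would hold the priority indices produced by the Lagrangian decomposition, and the relaxation itself also supplies an upper bound on the optimal reward against which such a heuristic policy can be benchmarked.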