Abstract
We study four proofs that the Gittins index priority rule is optimal for alternative bandit processes. These include Gittins’ original exchange argument, Weber’s prevailing charge argument, Whittle’s Lagrangian dual approach, and Bertsimas and Niño-Mora’s proof based on the achievable region approach and generalized conservation laws. We extend the achievable region proof to countably infinite state spaces by using infinite-dimensional linear programming theory.
Original language | English |
---|---|
Pages (from-to) | 127-165 |
Number of pages | 39 |
Journal | Annals of Operations Research |
Volume | 241 |
Issue number | 1-2 |
State | Published - 1 Jun 2016 |
Bibliographical note
Funding Information:
G. Weiss’s research supported in part by Israel Science Foundation Grants 249/02, 454/05, 711/09 and 286/13.
E. Frostig’s research supported in part by Network of Excellence Euro-NGI.
Publisher Copyright:
© 2014, Springer Science+Business Media New York.
Keywords
- Bandit problems
- Dynamic programming
- Gittins index
- Linear programming
ASJC Scopus subject areas
- Decision Sciences (all)
- Management Science and Operations Research