Four proofs of Gittins’ multiarmed bandit theorem

Research output: Contribution to journal › Article › peer-review

Abstract

We study four proofs that the Gittins index priority rule is optimal for alternative bandit processes. These include Gittins’ original exchange argument, Weber’s prevailing charge argument, Whittle’s Lagrangian dual approach, and Bertsimas and Niño-Mora’s proof based on the achievable region approach and generalized conservation laws. We extend the achievable region proof to countably infinite state spaces by using infinite-dimensional linear programming theory.

Original language: English
Pages (from-to): 127-165
Number of pages: 39
Journal: Annals of Operations Research
Volume: 241
Issue number: 1-2
State: Published - 1 Jun 2016

Bibliographical note

Publisher Copyright:
© 2014, Springer Science+Business Media New York.

Keywords

  • Bandit problems
  • Dynamic programming
  • Gittins index
  • Linear programming

ASJC Scopus subject areas

  • General Decision Sciences
  • Management Science and Operations Research
