In July 2020, I initiated a reading group with my advisor to study the area of multi-armed bandits. The discussions are largely adapted from the book Bandit Algorithms, which was recently made available online. Written notes for the meetings are provided below.

  • Lecture 1: Introduction to stochastic finite-armed bandits, the explore-then-commit strategy, and UCB (a minimal sketch of UCB appears after this list).

  • Lecture 2: Asymptotic optimality of UCB, MOSS, adversarial bandits, and the Exp3 algorithm.

  • Lecture 3: The Exp3-IX algorithm.

  • Lecture 4: Contextual bandits, bandits with expert advice, and the Exp4 algorithm.

  • Lecture 5: Stochastic linear bandits and LinUCB.

  • Lecture 6: Notebook implementing the algorithms (in preparation).

  • Lecture 7: Bandit PCA.
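
To give a concrete feel for the material in Lecture 1, here is a minimal Python sketch of the UCB1 index policy on a Bernoulli bandit. The arm means, horizon, and the helper name `ucb1` are illustrative choices of mine, not taken from the notes or the book.

```python
import math
import random

def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on a Bernoulli bandit with the given arm means
    (unknown to the learner); return the cumulative regret."""
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k      # number of pulls per arm
    sums = [0.0] * k      # cumulative reward per arm
    best_mean = max(arm_means)
    regret = 0.0

    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1   # pull each arm once to initialize
        else:
            # Choose the arm maximizing empirical mean + exploration bonus.
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best_mean - arm_means[arm]
    return regret

if __name__ == "__main__":
    # Illustrative two-armed instance: regret should grow only
    # logarithmically in the horizon.
    print(ucb1([0.5, 0.6], horizon=10_000))
```

The exploration bonus sqrt(2 log t / n_i) shrinks as an arm is pulled more often, so the policy balances trying under-explored arms against exploiting the empirically best one.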