Prerequisite: IE 609 (or equivalent) and IE621 (or equilvalent)
Many systems involve actions being applied sequentially, with each action resulting in a certain reward which is unknown apriori. The goal is to identify an action, from a set of available actions, that results in maximum reward (or minimum cost) in each round. Such settings are referred to as online learning problems. In this course, we will study algorithms for various online learning settings and analyze their performance.
- T. Hastie, R. Tibshirani and J. Friedman, ``Elements of statistical machine learning,'' Springer series in statistics, 2009
- S. Shalev-Shwartz and S. Ben-David, ``Understanding Machine Learning: From Theory to Algorithms,'' Cambridge University Press, 2014
- S. Shalev-Shwartz, Foundation and Trends in machine learning, ``Online Learning and Online Convex Optimization,'' NOW publisher, 2012
- S. Bubeck and N. Cesa-Bianchi, ``Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems,'' Foundation and Trends in machine learning, NOW publisher, 2012
- N. Cesa-Bianchi and G. Lugosi, ``Prediction, Learning, and Games, Cambridge University Press, 2006