Prerequisite: IE 708 (Markov Decision Processes) or instructor's consent.
Course Content:
Reinforcement Learning (RL) as a data-driven framework for sequential decision-making problems (Markov decision models).
Model-free and model-based algorithms; temporal differences; TD(λ) algorithms; Q-learning and its variants; discounted and average cost models.
Value-based and policy-based approaches for discounted and average cost models; actor-critic algorithms with linear function approximation; their convergence via the two-timescale stochastic approximation approach; natural-gradient-based algorithms.
Multi-agent RL for discounted and average cost models; non-cooperative models; minimax criteria; decentralised models; consensus matrix; actor-critic algorithms with linear function approximation.
Learning in stochastic shortest path problems and their multi-agent versions; regret analysis of algorithms.
Deep Reinforcement Learning algorithms for single and multi-agent models.
References:
1) Stefano V. Albrecht, Filippos Christianos and Lukas Schäfer, Multi-Agent Reinforcement Learning: Foundations and Modern Approaches, 2023, MIT Press, Cambridge, USA
2) Dimitri Bertsekas, A Course in Reinforcement Learning, 2023, Athena Scientific, USA
3) Sean Meyn, Control Systems and Reinforcement Learning, 2021, Cambridge University Press, UK
4) Vivek Borkar, Stochastic Approximation: A Dynamical Systems Viewpoint, Second Edition, 2023, TRIM Series, Springer
5) Richard Sutton and Andrew Barto, Reinforcement Learning: An Introduction, Second Edition, 2018, MIT Press, Cambridge, USA
6) Open Literature