IE 620: Reinforcement Learning Algorithms

Prerequisite: IE 708 (Markov Decision Processes) or instructor's consent.

Course Content:

Reinforcement Learning (RL) as a data-driven framework for sequential decision-making problems (Markov decision models).
Model-free and model-based algorithms; temporal differences; TD(λ) algorithms; Q-learning and its variants; discounted and average cost models.
Value-based and policy-based approaches for discounted and average cost models; actor-critic algorithms with linear function approximations; their convergence via the two-timescale stochastic approximation approach; natural-gradient-based algorithms.
Multi-Agent RL for discounted and average cost models; non-cooperative models; minimax criteria; decentralised models; consensus matrix; actor-critic algorithms with linear function approximations.
Learning stochastic shortest path problems and their multi-agent versions; regret analysis of algorithms.
Deep Reinforcement Learning algorithms for single and multi-agent models.

References:

1) Stefano V. Albrecht, Filippos Christianos and Lukas Schäfer, Multi-Agent Reinforcement Learning: Foundations and Modern Approaches, 2023, MIT Press, Cambridge, USA
2) Dimitri Bertsekas, A Course in Reinforcement Learning, 2023, Athena Scientific, USA
3) Sean Meyn, Control Systems and Reinforcement Learning, 2021, Cambridge University Press, UK
4) Vivek Borkar, Stochastic Approximation: A Dynamical Systems Viewpoint, Second Edition, 2023, TRIM Series, Springer
5) Richard Sutton and Andrew Barto, Reinforcement Learning: An Introduction, Second Edition, 2018, MIT Press, Cambridge, USA
6) Open Literature

औद्योगिक अभियांत्रिकी एवं प्रचालन अनुसंधान विभाग

DEPARTMENT OF
INDUSTRIAL ENGINEERING & OPERATIONS RESEARCH
IIT BOMBAY
