IE 708: Markov Decision Processes

Prerequisite: IE 611 or equivalent, or instructor's consent


Overview of decision making in stochastic systems evolving over time, with examples. Framework and types of cost criteria: expected total cost, discounted cost and average cost. Finite horizon models: classes of policies; optimality of Markov policies; dynamic programming principle and algorithm.
Infinite horizon models: stationary models; adequacy of Markov policies. Discounted cost models: optimality of Markov (pure) policies; policy iteration, value iteration and modified policy iteration algorithms; linear and convex programming formulations. Average cost models: unichain and multichain models; iterative algorithms; linear and convex programming formulations. Expected total cost models: positive and negative models; mathematical programming formulations for optimal policies.
Learning algorithms: Q-learning, reinforcement learning and actor-critic algorithms.


