A Seminar by Dr. Chandrashekar Lakshmi Narayanan
Speaker: Dr. Chandrashekar Lakshmi Narayanan.
Title: A Generalized Reduced Linear Program for Markov Decision Processes.
Date: 6th July, 2017 (Thursday)
Time: 11.00 am
Venue: Room No. 011, Ground Floor, IEOR Building.
Abstract: The efficient computation of near-optimal policies in Markov Decision Processes (MDPs) is of major interest in various scientific and engineering applications. One prominent approach combines linear function approximation with linear programming, leading to what is known as "Approximate Linear Programming" (ALP). While ALP allows for a compact representation of value functions, the number of constraints in the standard ALP formulation remains intractably large. One way to overcome this is to reduce the number of constraints. We provide a new analysis of a generalized version of the resulting reduced ALP that complements previous results.
In particular, as opposed to previous results, the new analysis allows us to derive a policy error bound that applies regardless of how the constraints are selected. The bound shows a graceful degradation that connects the features, the optimal value function, and the selected constraints in an easy-to-interpret fashion. This suggests specific ways of reducing the constraints while avoiding the need to know the stationary distribution of the optimal policy.
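For readers unfamiliar with ALP, the following is a minimal sketch of the textbook formulation and of a constraint-reduced variant. The notation is an assumption made for illustration and is not taken from the talk: a discounted, reward-maximizing MDP with state space S, action space A, reward g, transition kernel p, discount factor \alpha in (0,1), a feature matrix \Phi with k columns, positive state-relevance weights c, and a selected index set I of state-action pairs. The generalized version analyzed in the talk is not spelled out in this abstract and may differ.

```latex
% Standard ALP: one constraint per state-action pair (assumed notation).
\begin{align*}
\text{(ALP)}\quad
  & \min_{r \in \mathbb{R}^k} \; c^\top \Phi r \\
  & \text{s.t.}\quad (\Phi r)(s) \;\ge\; g(s,a)
      + \alpha \sum_{s' \in S} p(s' \mid s,a)\,(\Phi r)(s')
      \qquad \forall\, (s,a) \in S \times A .
\end{align*}

% Reduced ALP: keep only the constraints indexed by a subset I.
\begin{align*}
\text{(Reduced ALP)}\quad
  & \min_{r \in \mathbb{R}^k} \; c^\top \Phi r \\
  & \text{s.t.}\quad (\Phi r)(s) \;\ge\; g(s,a)
      + \alpha \sum_{s' \in S} p(s' \mid s,a)\,(\Phi r)(s')
      \qquad \forall\, (s,a) \in I \subset S \times A .
\end{align*}
```

In this sketch the number of variables is k, the number of features, while the standard ALP has |S| x |A| constraints. Dropping constraints enlarges the feasible set, which is why an error bound that holds for any choice of I is of interest.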
This is joint work with Shalabh Bhatnagar and Csaba Szepesvari.