Course-ID: CS6700

Reinforcement Learning

Instructor: Prof. B. Ravindran

Description

The course introduces students to the fundamental concepts of reinforcement learning. Students will learn to develop RL models and to understand the subtleties of the methods covered.

Course Content

The Reinforcement Learning problem : evaluative feedback, non-associative learning, rewards and returns, Markov decision processes, value functions, optimality and approximation.
Dynamic programming : value iteration, policy iteration, asynchronous DP, generalized policy iteration.
Monte Carlo methods : policy evaluation, rollouts, on-policy and off-policy learning, importance sampling.
Temporal Difference learning : TD prediction, optimality of TD(0), SARSA, Q-learning, R-learning, games and afterstates.
Eligibility traces : n-step TD prediction, TD(λ), forward and backward views, Q(λ), SARSA(λ), replacing traces and accumulating traces.
Function Approximation : value prediction, gradient descent methods, linear function approximation, ANN-based function approximation, lazy learning, instability issues.
Policy Gradient methods : non-associative learning, REINFORCE algorithm, exact gradient methods, estimating gradients, approximate policy gradient algorithms, actor-critic methods.
Planning and Learning : model-based learning and planning, prioritized sweeping, Dyna, heuristic search, trajectory sampling, E³ algorithm.
Hierarchical RL : MAXQ framework, options framework, HAM framework, airport algorithm, hierarchical policy gradient.
Case studies : elevator dispatching, Samuel's checkers player, TD-Gammon, Acrobot, helicopter piloting.