Module 3 Learning Objectives
By the end of this lesson, you should have achieved the following learning objectives:
Lesson 1: Policies and Value Functions
- Recognize that a policy is a distribution over actions for each possible state
- Describe the similarities and differences between stochastic and deterministic policies
- Identify the characteristics of a well-defined policy
- Generate examples of valid policies for a given MDP
- Describe the roles of state-value and action-value functions in reinforcement learning
- Describe the relationship between value functions and policies
- Create examples of valid value functions for a given MDP
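The objectives above can be made concrete with a small sketch. The MDP below (states "s0"/"s1", actions "stay"/"go") and all its numbers are illustrative assumptions, not examples from the lesson; the sketch shows a stochastic policy, a deterministic policy as a special case, a well-definedness check, and a value function as a mapping from states to expected returns.

```python
# Hypothetical two-state MDP: states "s0", "s1"; actions "stay", "go".
# All names and numbers are illustrative, not from the lesson.

# A stochastic policy: for each state, a distribution over actions.
stochastic_policy = {
    "s0": {"stay": 0.7, "go": 0.3},
    "s1": {"stay": 0.5, "go": 0.5},
}

# A deterministic policy is the special case where, in every state,
# a single action has probability 1.
deterministic_policy = {
    "s0": {"stay": 1.0, "go": 0.0},
    "s1": {"stay": 0.0, "go": 1.0},
}

def is_well_defined(policy):
    """A well-defined policy assigns a valid probability distribution
    over actions to every state: non-negative entries summing to 1."""
    return all(
        all(prob >= 0.0 for prob in dist.values())
        and abs(sum(dist.values()) - 1.0) < 1e-9
        for dist in policy.values()
    )

# A state-value function maps each state to the expected return when
# following a given policy from that state (illustrative values).
v = {"s0": 1.2, "s1": -0.4}
```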
Lesson 2: Bellman Equations
- Derive the Bellman equation for state-value functions
- Derive the Bellman equation for action-value functions
- Understand how Bellman equations relate current and future values
- Use the Bellman equations to compute value functions
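As a sketch of the last objective, the Bellman equation for state values,
v_pi(s) = sum_a pi(a|s) sum_{s'} p(s', r | s, a) [r + gamma * v_pi(s')],
can be applied repeatedly to compute v_pi (iterative policy evaluation). The two-state MDP, its dynamics, and the equiprobable policy below are illustrative assumptions.

```python
# Iterative policy evaluation on a hypothetical two-state MDP.
# All names and numbers are illustrative.

gamma = 0.9

# Dynamics: p[(s, a)] is a list of (next_state, probability, reward).
p = {
    ("s0", "stay"): [("s0", 1.0, 0.0)],
    ("s0", "go"):   [("s1", 1.0, 1.0)],
    ("s1", "stay"): [("s1", 1.0, 2.0)],
    ("s1", "go"):   [("s0", 1.0, 0.0)],
}

# Equiprobable policy: pi[s][a] = probability of taking a in s.
pi = {
    "s0": {"stay": 0.5, "go": 0.5},
    "s1": {"stay": 0.5, "go": 0.5},
}

# Repeatedly apply the Bellman equation as an update; with gamma < 1
# this converges to the unique fixed point v_pi.
v = {"s0": 0.0, "s1": 0.0}
for _ in range(1000):
    v = {
        s: sum(
            pi[s][a] * sum(prob * (r + gamma * v[s2])
                           for s2, prob, r in p[(s, a)])
            for a in pi[s]
        )
        for s in v
    }
```

At the fixed point, each state's value equals one Bellman backup of itself, which is exactly what the Bellman equation asserts.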
Lesson 3: Optimality (Optimal Policies & Value Functions)
- Define an optimal policy
- Explain what it means for one policy to be at least as good as another, and why an optimal policy is at least as good as every other policy in every state
- Identify an optimal policy for given MDPs
- Derive the Bellman optimality equation for state-value functions
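The Bellman optimality equation for state values,
v*(s) = max_a sum_{s'} p(s', r | s, a) [r + gamma * v*(s')],
replaces the policy-weighted average with a max over actions. A sketch of using it as an iterative update (value iteration) on a hypothetical two-state MDP, with a greedy policy read off at the end; all names and numbers are illustrative.

```python
# Value iteration via the Bellman optimality equation on a
# hypothetical two-state MDP. All names and numbers are illustrative.

gamma = 0.9

# Dynamics: p[(s, a)] is a list of (next_state, probability, reward).
p = {
    ("s0", "stay"): [("s0", 1.0, 0.0)],
    ("s0", "go"):   [("s1", 1.0, 1.0)],
    ("s1", "stay"): [("s1", 1.0, 2.0)],
    ("s1", "go"):   [("s0", 1.0, 0.0)],
}
actions = ["stay", "go"]

def backup(v, s, a):
    """Expected one-step return of taking a in s, then following v."""
    return sum(prob * (r + gamma * v[s2]) for s2, prob, r in p[(s, a)])

# Repeated max-backups converge to v* when gamma < 1.
v = {"s0": 0.0, "s1": 0.0}
for _ in range(1000):
    v = {s: max(backup(v, s, a) for a in actions) for s in v}

# An optimal (deterministic) policy is greedy with respect to v*.
pi_star = {s: max(actions, key=lambda a: backup(v, s, a)) for s in v}
```

The greedy policy is optimal here because its action achieves the max in the Bellman optimality equation in every state.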