Prediction and Control with Function Approximation
- Rating: 4.8
Course Summary
This course teaches you how to solve prediction and control problems in reinforcement learning when state spaces are too large for tabular methods, using function approximation. You will learn how to extend Monte Carlo and TD prediction methods to parameterized value functions, construct features with coarse and tile coding, train neural network approximators, and learn policies directly with policy gradient methods.
Key Learning Points
- Learn how to use function approximation for prediction and control in reinforcement learning
- Master feature construction techniques such as coarse coding and tile coding, neural network approximation, and policy gradient methods
- Apply your knowledge to real-world problems
Learning Outcomes
- Understand the fundamentals of function approximation
- Be able to apply linear function approximation, tile coding, and neural networks to prediction and control problems
- Master policy gradient methods, including actor-critic, for learning policies directly
Prerequisites and Recommended Background
- Basic knowledge of programming and statistics
- Familiarity with the Python programming language
Course Difficulty Level
Intermediate
Course Format
- Online Self-paced Course
- Video Lectures
- Assignments and Quizzes
Similar Courses
- Applied Machine Learning
- Data Science Essentials
- Machine Learning for Business Professionals
Related Education Paths
- Applied Data Science with Python Specialization
- Machine Learning for Everyone (Google Cloud) Specialization
- Data Science Specialization
Notable People in This Field
- Andrew Ng
- Deepti Sharma
- Hugo Bowne-Anderson
Description
In this course, you will learn how to solve problems with large, high-dimensional, and potentially infinite state spaces. You will see that estimating value functions can be cast as a supervised learning problem---function approximation---allowing you to build agents that carefully balance generalization and discrimination in order to maximize reward. We will begin this journey by investigating how our policy evaluation or prediction methods like Monte Carlo and TD can be extended to the function approximation setting. You will learn about feature construction techniques for RL, and representation learning via neural networks and backprop. We conclude this course with a deep-dive into policy gradient methods; a way to learn policies directly without learning a value function. In this course you will solve two continuous-state control tasks and investigate the benefits of policy gradient methods in a continuous-action environment.
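The description frames value estimation as a supervised learning problem over a parameterized function, which Module 1 develops with gradient Monte Carlo and semi-gradient TD. As a rough illustration of that idea, here is a minimal sketch of semi-gradient TD(0) with state aggregation (a simple linear function approximator) on a random-walk chain; the environment, group size, and step size are assumptions chosen for this example, not the course's assignment code.

```python
# Minimal sketch: semi-gradient TD(0) with state aggregation on a random-walk
# chain. Chain length, group size, step size, and episode count are
# illustrative assumptions, not values from the course.
import numpy as np

N_STATES = 100        # states 1..100; stepping to 0 or 101 ends the episode
GROUP_SIZE = 10       # state aggregation: 10 groups of 10 states each
ALPHA = 0.01          # step size
GAMMA = 1.0           # undiscounted episodic task

def features(state):
    """One-hot feature vector selecting the state's aggregation group."""
    x = np.zeros(N_STATES // GROUP_SIZE)
    x[(state - 1) // GROUP_SIZE] = 1.0
    return x

def step(state, rng):
    """Random-walk dynamics: move left or right; reward -1/+1 at the ends."""
    next_state = state + rng.choice([-1, 1])
    if next_state == 0:
        return None, -1.0
    if next_state == N_STATES + 1:
        return None, 1.0
    return next_state, 0.0

def semi_gradient_td0(num_episodes=1000, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(N_STATES // GROUP_SIZE)   # weights of the linear value estimate
    for _ in range(num_episodes):
        state = N_STATES // 2              # start in the middle of the chain
        while state is not None:
            next_state, reward = step(state, rng)
            v = w @ features(state)
            v_next = 0.0 if next_state is None else w @ features(next_state)
            # Semi-gradient update: the bootstrap target is treated as fixed,
            # so the gradient is taken only through the current estimate.
            w += ALPHA * (reward + GAMMA * v_next - v) * features(state)
            state = next_state
    return w

if __name__ == "__main__":
    print(semi_gradient_td0())   # roughly increasing group values from -1 to +1
```

State aggregation makes the generalization/discrimination trade-off mentioned above concrete: states in the same group share one weight, so every update generalizes across the group but cannot discriminate between its members.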
Outline
- Welcome to the Course!
- Course 3 Introduction
- Meet your instructors!
- Read Me: Pre-requisites and Learning Objectives
- Reinforcement Learning Textbook
- On-policy Prediction with Approximation
- Moving to Parameterized Functions
- Generalization and Discrimination
- Framing Value Estimation as Supervised Learning
- The Value Error Objective
- Introducing Gradient Descent
- Gradient Monte Carlo for Policy Evaluation
- State Aggregation with Monte Carlo
- Semi-Gradient TD for Policy Evaluation
- Comparing TD and Monte Carlo with State Aggregation
- Doina Precup: Building Knowledge for AI Agents with Reinforcement Learning
- The Linear TD Update
- The True Objective for TD
- Week 1 Summary
- Module 1 Learning Objectives
- Weekly Reading: On-policy Prediction with Approximation
- On-policy Prediction with Approximation
- Constructing Features for Prediction
- Coarse Coding
- Generalization Properties of Coarse Coding
- Tile Coding
- Using Tile Coding in TD
- What is a Neural Network?
- Non-linear Approximation with Neural Networks
- Deep Neural Networks
- Gradient Descent for Training Neural Networks
- Optimization Strategies for NNs
- David Silver on Deep Learning + RL = AI?
- Week 2 Review
- Module 2 Learning Objectives
- Weekly Reading: On-policy Prediction with Approximation II
- Constructing Features for Prediction
- Control with Approximation
- Episodic Sarsa with Function Approximation
- Episodic Sarsa in Mountain Car
- Expected Sarsa with Function Approximation
- Exploration under Function Approximation
- Average Reward: A New Way of Formulating Control Problems
- Satinder Singh on Intrinsic Rewards
- Week 3 Review
- Module 3 Learning Objectives
- Weekly Reading: On-policy Control with Approximation
- Control with Approximation
- Policy Gradient
- Learning Policies Directly
- Advantages of Policy Parameterization
- The Objective for Learning Policies
- The Policy Gradient Theorem
- Estimating the Policy Gradient
- Actor-Critic Algorithm
- Actor-Critic with Softmax Policies
- Demonstration with Actor-Critic
- Gaussian Policies for Continuous Actions
- Week 4 Summary
- Congratulations! Course 4 Preview
- Module 4 Learning Objectives
- Weekly Reading: Policy Gradient Methods
- Policy Gradient Methods