In this intermediate deep learning tutorial, you will learn how to go from reading a paper on deep deterministic policy gradients to implementing the concepts in Tensorflow.
In this intermediate deep learning tutorial, you will learn how to go from reading a paper on deep deterministic policy gradients to implementing the concepts in Tensorflow. This process can be applied to any deep learning paper, not just deep reinforcement learning. In the second part, you will learn how to code a deep deterministic policy gradient (DDPG) agent using Python and PyTorch, to beat the continuous lunar lander environment (a classic machine learning problem). DDPG combines the best of Deep Q Learning and Actor Critic Methods into an algorithm that can solve environments with continuous action spaces. We will have an actor network that learns the (deterministic) policy, coupled with a critic network to learn the action-value functions. We will make use of a replay buffer to maximize sample efficiency, as well as target networks to assist in algorithm convergence and stability.
@ 2020 Copyright - Global Human Peace University | All Rights Reserved. Developed by hyrrokkin