Hierarchical Actor Critic
Action space: Continuous
References: Hierarchical Reinforcement Learning with Hindsight
Network Structure
[Figure: network structure diagram (ddpg.png)]
Algorithm Description
Choosing an action
Pass the current state through the actor network to obtain an action mean vector \(\mu\). During training, use a continuous exploration policy, such as an Ornstein-Uhlenbeck process, to add exploration noise to \(\mu\) before acting. During testing, use the mean vector \(\mu\) as-is, with no added noise.
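The action-selection rule above can be sketched in a few lines of NumPy. This is a minimal illustration, not the library's implementation: the names `OrnsteinUhlenbeckNoise` and `choose_action`, the default values of `theta`, `sigma`, and `dt`, and the clipping of noisy actions to `[low, high]` are all assumptions made for the example, and `actor` stands in for any callable that maps a state to \(\mu\).

```python
import numpy as np


class OrnsteinUhlenbeckNoise:
    """Temporally correlated noise (hypothetical helper for this sketch).

    Discretized update: x <- x + theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
    """

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, seed=None):
        # theta, sigma and dt defaults are illustrative assumptions.
        self.mu = np.full(size, mu, dtype=np.float64)
        self.theta = theta
        self.sigma = sigma
        self.dt = dt
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        # Restart the process at its long-run mean (e.g. at episode start).
        self.state = self.mu.copy()

    def sample(self):
        drift = self.theta * (self.mu - self.state) * self.dt
        diffusion = (self.sigma * np.sqrt(self.dt)
                     * self.rng.standard_normal(self.state.shape))
        self.state = self.state + drift + diffusion
        return self.state


def choose_action(actor, state, noise, training, low=-1.0, high=1.0):
    """Return the actor's mean action, with exploration noise when training."""
    mu = np.asarray(actor(state), dtype=np.float64)
    if not training:
        # Testing: use the mean vector as-is.
        return mu
    # Training: add exploration noise; clipping to the action bounds
    # is an assumption of this sketch, not something the text specifies.
    return np.clip(mu + noise.sample(), low, high)
```

In use, `noise.reset()` would typically be called at the start of each episode so that the correlated noise does not carry over between episodes, and `choose_action(actor, state, noise, training=True)` would be called once per environment step.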