Deep Intrinsically Motivated Continuous Actor-Critic for Efficient Robotic Visuomotor Skill Learning
Paladyn. Journal of Behavioral Robotics, Volume 10, Number 1, pages 14--29, doi: 10.1515/pjbr-2019-0005 - Jan 2019
In this paper, we present a new intrinsically motivated actor-critic algorithm for learning continuous motor skills directly from raw visual input. Our neural architecture is composed of a critic and an actor network. Both networks receive the hidden representation of a deep convolutional autoencoder which is trained to reconstruct the visual input, while the centre-most hidden representation is also optimized to estimate the state value. Separately, an ensemble of predictive world models generates, based on its learning progress, an intrinsic reward signal which is combined with the extrinsic reward to guide the exploration of the actor-critic learner. Our approach is more data-efficient and inherently more stable than existing actor-critic methods for continuous control from pixel data. We evaluate our algorithm on the task of learning robotic reaching and grasping skills in a realistic physics simulator and on a humanoid robot. The results show that the control policies learned with our approach achieve better performance than the compared state-of-the-art and baseline algorithms in both dense-reward and challenging sparse-reward settings.
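
The combination of intrinsic and extrinsic reward described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how learning progress is measured or how the ensemble is aggregated, so the windowed error difference, the averaging over models, the weighting factor `beta`, and the function names `learning_progress` and `combined_reward` are all assumptions made for illustration.

```python
import numpy as np


def learning_progress(errors, window=5):
    """Hypothetical learning-progress measure for one world model.

    Returns the drop in mean prediction error between the previous
    window and the most recent window, clipped at zero.
    """
    errors = np.asarray(errors, dtype=float)
    if errors.size < 2 * window:
        return 0.0  # not enough history to compare two windows
    older = errors[-2 * window:-window].mean()
    recent = errors[-window:].mean()
    return float(max(older - recent, 0.0))


def combined_reward(extrinsic, error_histories, beta=0.5, window=5):
    """Add a weighted intrinsic reward to the extrinsic reward.

    error_histories holds one prediction-error history per ensemble member.
    Averaging over the ensemble and the weight beta are assumptions.
    """
    progress = [learning_progress(h, window) for h in error_histories]
    intrinsic = float(np.mean(progress))
    return extrinsic + beta * intrinsic


if __name__ == "__main__":
    improving = [1.0] * 5 + [0.5] * 5   # error dropped: positive progress
    stagnant = [1.0] * 10               # no change: zero progress
    print(combined_reward(1.0, [improving, stagnant]))
```

In this sketch a model whose prediction error is still falling contributes a positive bonus, while a model that has stopped improving contributes nothing, so the bonus fades once the world models have been learned.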
@Article{HWKW19,
  author    = {Hafez, Burhan and Weber, Cornelius and Kerzel, Matthias and Wermter, Stefan},
  title     = {Deep Intrinsically Motivated Continuous Actor-Critic for Efficient Robotic Visuomotor Skill Learning},
  journal   = {Paladyn. Journal of Behavioral Robotics},
  number    = {1},
  volume    = {10},
  pages     = {14--29},
  year      = {2019},
  month     = {Jan},
  publisher = {De Gruyter},
  doi       = {10.1515/pjbr-2019-0005},
}