Intelligent Robot Learning Laboratory (IRL Lab): Transfer in Deep Reinforcement Learning

By: Yunshu Du, Gabriel V. de la Cruz Jr., James Irwin, and Matthew E. Taylor

As one of the first successful models to combine reinforcement learning with deep neural networks, the Deep Q-network (DQN) algorithm has gained attention for bridging the gap between high-dimensional sensor inputs and autonomous agent learning. However, one main drawback of DQN is the long training time required to learn a single task. This work leverages transfer learning (TL) techniques to speed up learning in DQN. We apply TL in two domains, Atari games and cart-pole, and show that it improves DQN’s performance on both tasks without altering the network structure. [1]
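As a concrete illustration of this kind of transfer (this is a minimal sketch, not the paper's code), one common approach is to pretrain a DQN on a source task and then copy its learned weights to initialize the network for a target task that uses the same architecture, before fine-tuning with standard DQN updates. The PyTorch model below, its layer sizes, and the checkpoint file name are illustrative assumptions.

# Minimal sketch of weight-transfer initialization for a DQN (illustrative only).
import torch
import torch.nn as nn

class DQN(nn.Module):
    def __init__(self, n_actions):
        super().__init__()
        # DQN-style convolutional trunk over a stack of 4 grayscale 84x84 frames.
        self.features = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            nn.Linear(512, n_actions),
        )

    def forward(self, x):
        return self.head(self.features(x))

# Train (or load) a source-task network; the checkpoint name is hypothetical.
source_net = DQN(n_actions=6)
# source_net.load_state_dict(torch.load("source_task_dqn.pt"))

# Initialize the target-task network from the source weights; the architecture
# is unchanged, so the state dict can be copied directly, then fine-tuned.
target_net = DQN(n_actions=6)
target_net.load_state_dict(source_net.state_dict())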

[1] [pdf] Yunshu Du, Gabriel V. de la Cruz Jr., James Irwin, and Matthew E. Taylor. Initial Progress in Transfer for Deep Reinforcement Learning Algorithms. In Proceedings of Deep Reinforcement Learning: Frontiers and Challenges workshop (at IJCAI), New York City, NY, USA, July 2016.
[Bibtex]
@inproceedings{2016DeepRL-Du,
  author={Du, Yunshu and de la Cruz, Jr., Gabriel V. and Irwin, James and Taylor, Matthew E.},
  title={{Initial Progress in Transfer for Deep Reinforcement Learning Algorithms}},
  booktitle={{Proceedings of Deep Reinforcement Learning: Frontiers and Challenges workshop (at {IJCAI})}},
  year={2016},
  address={New York City, NY, USA},
  month={July},
  bib2html_pubtype={Refereed Workshop or Symposium},
  abstract={As one of the first successful models to combine reinforcement learning with deep neural networks, the Deep Q-network (DQN) algorithm has gained attention for bridging the gap between high-dimensional sensor inputs and autonomous agent learning. However, one main drawback of DQN is the long training time required to learn a single task. This work leverages transfer learning (TL) techniques to speed up learning in DQN. We apply TL in two domains, Atari games and cart-pole, and show that it improves DQN’s performance on both tasks without altering the network structure.}
}