Intelligent Robot Learning Laboratory (IRL Lab) Gabriel V. de la Cruz Jr.

CONTACT INFORMATION:
Gabriel V. de la Cruz Jr.

PhD Student, Computer Science
Email: gabriel.delacruz@wsu.edu
Office: Dana Hall 3
Links: Personal Website, Google Scholar Page


My Story

My passion for education and robotics is what drew me to pursue a graduate degree. I was raised in a lower-middle-class family in the Philippines, where, with so many Filipinos living below the poverty line, parents commonly remind their children that beyond working hard, they must pursue a better education in order to succeed in life. That is where my passion for teaching comes from: I want to be able to help others succeed. Right now, my goal is to work in industry first and then use that experience to return to academia and help students discover their talents and fulfill their goals.

My interest in robotics, on the other hand, stemmed from several experiences during my active-duty years in the military. I served as a Hospital Corpsman in the US Navy for over six years, though I spent most of that time working with the US Marine Corps. I completed three combat deployments to Iraq and Afghanistan. I would watch my Marines walk in front of our military vehicles, sweeping for explosives with a metal detector, and I would always tell myself, “I wish a robot could do that!” This is just one of the many applications of robots that will benefit our service members on the front lines.

Beyond the military, I see robots benefiting many other areas. Although robots are now heavily used in industry, they remain underutilized elsewhere. I don’t just want to witness the next big shift in technology, in which robots become part of everyone’s day-to-day activities; I want to contribute to making it happen.

Research Interests

My research interests involve Reinforcement Learning, Transfer Learning, and Intelligent Multi-agent Systems, and their application to robotics. I am advised by Dr. Matthew Taylor.

Current Projects

By: Yunshu Du, Gabriel V. de la Cruz Jr., James Irwin, and Matthew E. Taylor

As one of the first successful models that combines reinforcement learning technique with deep neural networks, the Deep Q-network (DQN) algorithm has gained attention as it bridges the gap between high-dimensional sensor inputs and autonomous agent learning. However, one main drawback of DQN is the long training time required to train a single task. This work aims to leverage transfer learning (TL) techniques to speed up learning in DQN. We applied this technique in two domains, Atari games and cart-pole, and show that TL can improve DQN’s performance on both tasks without altering the network structure. [1]
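
To make the transfer step concrete, below is a minimal sketch, assuming a PyTorch implementation, of how weights trained on a source task can be used to initialize a target-task DQN without altering the network structure. The small fully connected network and the hyperparameters here are illustrative assumptions, not the setup from the paper.

# Minimal weight-transfer sketch between two DQNs with identical architecture
# (illustrative only; the paper's Atari experiments use a convolutional network).
import torch
import torch.nn as nn

class DQN(nn.Module):
    def __init__(self, n_inputs, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_inputs, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x):
        return self.net(x)

# 1. Train (or load) a network on the source task, e.g. one Atari game or a cart-pole variant.
source_dqn = DQN(n_inputs=4, n_actions=2)
# ... ordinary DQN training on the source task would happen here ...

# 2. Initialize the target-task network from the source weights instead of from
#    a random initialization; the architecture itself is left unchanged.
target_dqn = DQN(n_inputs=4, n_actions=2)
target_dqn.load_state_dict(source_dqn.state_dict())

# 3. Continue standard DQN training on the target task from this warm start.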

[1] [pdf] Yunshu Du, Gabriel V. de la Cruz Jr., James Irwin, and Matthew E. Taylor. Initial Progress in Transfer for Deep Reinforcement Learning Algorithms. In Proceedings of Deep Reinforcement Learning: Frontiers and Challenges workshop (at IJCAI), New York City, NY, USA, July 2016.
[Bibtex]
@inproceedings{2016DeepRL-Du,
author={Du, Yunshu and de la Cruz, Jr., Gabriel V. and Irwin, James and Taylor, Matthew E.},
title={{Initial Progress in Transfer for Deep Reinforcement Learning Algorithms}},
booktitle={{Proceedings of Deep Reinforcement Learning: Frontiers and Challenges workshop (at {IJCAI})}},
year={2016},
address={New York City, NY, USA},
month={July},
bib2html_pubtype={Refereed Workshop or Symposium},
abstract={As one of the first successful models that combines reinforcement learning technique with deep neural networks, the Deep Q-network (DQN) algorithm has gained attention as it bridges the gap between high-dimensional sensor inputs and autonomous agent learning. However, one main drawback of DQN is the long training time required to train a single task. This work aims to leverage transfer learning (TL) techniques to speed up learning in DQN. We applied this technique in two domains, Atari games and cart-pole, and show that TL can improve DQN’s performance on both tasks without altering the network structure.
}
}

By: Gabriel V. de la Cruz Jr., James M. Irwin, and Matthew E. Taylor

Undergraduates: Brandon Kallaher (WSU)

This is a joint project of WSU, the University of Pennsylvania, and Olin College. It develops transfer learning methods that enable teams of heterogeneous agents to rapidly adapt control and coordination policies to new scenarios. Our approach uses a combination of lifelong transfer learning and autonomous instruction to support continual transfer among heterogeneous agents and across diverse tasks. The resulting multi-agent system will accumulate transferable knowledge over consecutive tasks, enabling the transfer learning process to improve over time and the system to become increasingly versatile. We will apply these methods to sequential decision making (SDM) tasks in dynamic environments with aerial and ground robots. [1, 2]
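
As a rough illustration of why a shared representation makes this kind of transfer cheap, the NumPy sketch below expresses each robot’s policy parameters in a shared latent basis, so adapting to a new robot reduces to fitting a small coefficient vector. This is a toy under simplifying assumptions (the dimensions and the least-squares fit are made up for illustration), not the lifelong learning implementation used in the papers.

# Toy factored policy representation: theta_t = L @ s_t, with a shared basis L
# accumulated over previously seen robots and per-robot coefficients s_t.
import numpy as np

d, k = 8, 3                           # policy-parameter dimension, latent basis size (assumed)
rng = np.random.default_rng(0)
L = rng.normal(size=(d, k))           # stand-in for a basis learned over earlier tasks

def adapt_to_new_robot(theta_hat, L, reg=1e-3):
    """Project a rough single-task estimate theta_hat (e.g. from a few
    policy-gradient steps on the new robot) onto the shared basis."""
    A = L.T @ L + reg * np.eye(L.shape[1])
    s = np.linalg.solve(A, L.T @ theta_hat)    # ridge-regularized least squares
    return L @ s                               # reconstructed policy parameters

theta_hat_new = rng.normal(size=d)    # stand-in for a noisy estimate on a new robot
theta_new = adapt_to_new_robot(theta_hat_new, L)
print(theta_new.shape)                # (8,)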

[1] [pdf] David Isele, José Marcio Luna, Eric Eaton, Gabriel V. de la Cruz Jr., James Irwin, Brandon Kallaher, and Matthew E. Taylor. Lifelong Learning for Disturbance Rejection on Mobile Robots. In Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 2016. 48% acceptance rate
[Bibtex]
@inproceedings{2016IROS-Isele,
author={Isele, David and Luna, Jos\'e Marcio and Eaton, Eric and de la Cruz, Jr., Gabriel V. and Irwin, James and Kallaher, Brandon and Taylor, Matthew E.},
title={{Lifelong Learning for Disturbance Rejection on Mobile Robots}},
booktitle={{Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems ({IROS})}},
month={October},
year={2016},
note={48% acceptance rate},
video={https://youtu.be/u7pkhLx0FQ0},
bib2html_pubtype={Refereed Conference},
abstract={No two robots are exactly the same—even for a given model of robot, different units will require slightly different controllers. Furthermore, because robots change and degrade over time, a controller will need to change over time to remain optimal. This paper leverages lifelong learning in order to learn controllers for different robots. In particular, we show that by learning a set of control policies over robots with different (unknown) motion models, we can quickly adapt to changes in the robot, or learn a controller for a new robot with a unique set of disturbances. Furthermore, the approach is completely model-free, allowing us to apply this method to robots that have not, or cannot, be fully modeled.}
}
[2] [pdf] David Isele, José Marcio Luna, Eric Eaton, Gabriel V. de la Cruz Jr., James Irwin, Brandon Kallaher, and Matthew E. Taylor. Work in Progress: Lifelong Learning for Disturbance Rejection on Mobile Robots. In Proceedings of the Adaptive Learning Agents (ALA) workshop (at AAMAS), Singapore, May 2016.
[Bibtex]
@inproceedings{2016ALA-Isele,
author={Isele, David and Luna, Jos\'e Marcio and Eaton, Eric and de la Cruz, Jr., Gabriel V. and Irwin, James and Kallaher, Brandon and Taylor, Matthew E.},
title={{Work in Progress: Lifelong Learning for Disturbance Rejection on Mobile Robots}},
booktitle={{Proceedings of the Adaptive Learning Agents ({ALA}) workshop (at {AAMAS})}},
year={2016},
address={Singapore},
month={May},
abstract = {No two robots are exactly the same — even for a given model of robot, different units will require slightly different controllers. Furthermore, because robots change and degrade over time, a controller will need to change over time to remain optimal. This paper leverages lifelong learning in order to learn controllers for different robots. In particular, we show that by learning a set of control policies over robots with different (unknown) motion models, we can quickly adapt to changes in the robot, or learn a controller for a new robot with a unique set of disturbances. Further, the approach is completely model-free, allowing us to apply this method to robots that have not, or cannot, be fully modeled. These preliminary results are an initial step towards learning robust fault-tolerant control for arbitrary robots.}
}

By: Gabriel V. de la Cruz Jr., Bei Peng, and Matthew E. Taylor

Reinforcement learning suffers from poor initial performance. Our approach uses crowdsourcing to provide non-expert suggestions that speed up an RL agent’s learning. Currently, we are using Ms. Pac-Man as our application domain because of its popularity as a game. From our studies, we have concluded that crowd workers, although non-experts, are good at identifying mistakes. We are now working on how to integrate the crowd’s advice to speed up the RL agent’s learning. In the future, we intend to apply this approach to a physical robot. [1, 2]
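
As one illustration of how such advice could enter the agent’s decision loop, the sketch below biases a Q-learning agent’s greedy action toward whatever action crowd workers suggested for the current state. The mechanism and all names here are assumptions made for illustration, not the integration method from the papers.

# Toy advice-biased action selection for a tabular Q-learning agent.
import random
from collections import defaultdict

Q = defaultdict(float)            # Q[(state, action)] values, default 0.0
crowd_advice = {}                 # state -> action suggested by crowd workers
ACTIONS = ["up", "down", "left", "right"]

def select_action(state, epsilon=0.1, advice_bonus=1.0):
    """Epsilon-greedy selection with a temporary bonus for the advised action."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    def score(action):
        bonus = advice_bonus if crowd_advice.get(state) == action else 0.0
        return Q[(state, action)] + bonus
    return max(ACTIONS, key=score)

# Early in learning, Q-values are near zero, so the advised action dominates.
crowd_advice["s0"] = "left"
print(select_action("s0", epsilon=0.0))   # -> "left"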

[1] [pdf] [doi] Gabriel V. de la Cruz Jr., Bei Peng, Walter S. Lasecki, and Matthew E. Taylor. Towards Integrating Real-Time Crowd Advice with Reinforcement Learning. In The 20th ACM Conference on Intelligent User Interfaces (IUI), March 2015. Poster: 41% acceptance rate for poster submissions
[Bibtex]
@inproceedings{2015IUI-Delacruz,
author={de la Cruz, Jr., Gabriel V. and Peng, Bei and Lasecki, Walter S. and Taylor, Matthew E.},
title={{Towards Integrating Real-Time Crowd Advice with Reinforcement Learning}},
booktitle={{The 20th {ACM} Conference on Intelligent User Interfaces ({IUI})}},
month={March},
year={2015},
doi={10.1145/2732158.2732180},
note={Poster: 41% acceptance rate for poster submissions},
wwwnote={<a href="http://iui.acm.org/2015/">ACM iUI-15</a>},
bib2html_rescat={Reinforcement Learning, Crowdsourcing},
bib2html_pubtype={Short Refereed Conference},
bib2html_funding={NSF},
abstract={Reinforcement learning is a powerful machine learning paradigm that allows agents to autonomously learn to maximize a scalar reward. However, it often suffers from poor initial performance and long learning times. This paper discusses how collecting on-line human feedback, both in real time and post hoc, can potentially improve the performance of such learning systems. We use the game Pac-Man to simulate a navigation setting and show that workers are able to accurately identify both when a sub-optimal action is executed, and what action should have been performed instead. Demonstrating that the crowd is capable of generating this input, and discussing the types of errors that occur, serves as a critical first step in designing systems that use this real-time feedback to improve systems' learning performance on-the-fly.},
}
[2] [pdf] Gabriel V. de la Cruz Jr., Bei Peng, Walter S. Lasecki, and Matthew E. Taylor. Generating Real-Time Crowd Advice to Improve Reinforcement Learning Agents. In Proceedings of the Learning for General Competency in Video Games workshop (AAAI), January 2015.
[Bibtex]
@inproceedings{2015AAAI-Delacruz,
title={{Generating Real-Time Crowd Advice to Improve Reinforcement Learning Agents}},
author={de la Cruz, Jr., Gabriel V. and Peng, Bei and Lasecki, Walter S. and Taylor, Matthew E.},
booktitle={{Proceedings of the Learning for General Competency in Video Games workshop ({AAAI})}},
month={January},
year={2015},
wwwnote={<a href="http://www.arcadelearningenvironment.org/aaai15-workshop/">The Arcade Learning Environment</a>},
bib2html_pubtype={Refereed Workshop or Symposium},
bib2html_rescat={Reinforcement Learning, Crowdsourcing},
bib2html_funding={NSF},
abstract={Reinforcement learning is a powerful machine learning paradigm that allows agents to autonomously learn to maximize a scalar reward. However, it often suffers from poor initial performance and long learning times. This paper discusses how collecting on-line human feedback, both in real time and post hoc, can potentially improve the performance of such learning systems. We use the game Pac-Man to simulate a navigation setting and show that workers are able to accurately identify both when a sub-optimal action is executed, and what action should have been performed instead. Our results demonstrate that the crowd is capable of generating helpful input. We conclude with a discussion the types of errors that occur most commonly when engaging human workers for this task, and a discussion of how such data could be used to improve learning. Our work serves as a critical first step in designing systems that use real-time human feedback to improve the learning performance of automated systems on-the-fly.},
}

Education and Experience

Washington State University, PhD Student, Computer Science, 2013-Present
Washington State University, Post-bachelor’s Student (3 semesters), 2012-2013
Thomas Edison State College, A.A.S. Applied Health Studies, 2008-2009
United States Navy, Hospital Corpsman, 2005-2012
Cebu Institute of Technology, B.S. in Information Technology, 2001-2005

Publications

2016

  • David Isele, José Marcio Luna, Eric Eaton, Gabriel V. de la Cruz Jr., James Irwin, Brandon Kallaher, and Matthew E. Taylor. Lifelong Learning for Disturbance Rejection on Mobile Robots. In Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 2016. 48% acceptance rate
    [BibTeX] [Abstract] [Download PDF] [Video]

    No two robots are exactly the same—even for a given model of robot, different units will require slightly different controllers. Furthermore, because robots change and degrade over time, a controller will need to change over time to remain optimal. This paper leverages lifelong learning in order to learn controllers for different robots. In particular, we show that by learning a set of control policies over robots with different (unknown) motion models, we can quickly adapt to changes in the robot, or learn a controller for a new robot with a unique set of disturbances. Furthermore, the approach is completely model-free, allowing us to apply this method to robots that have not, or cannot, be fully modeled.

    @inproceedings{2016IROS-Isele,
    author={Isele, David and Luna, Jos\'e Marcio and Eaton, Eric and de la Cruz, Jr., Gabriel V. and Irwin, James and Kallaher, Brandon and Taylor, Matthew E.},
    title={{Lifelong Learning for Disturbance Rejection on Mobile Robots}},
    booktitle={{Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems ({IROS})}},
    month={October},
    year={2016},
    note={48% acceptance rate},
    video={https://youtu.be/u7pkhLx0FQ0},
    bib2html_pubtype={Refereed Conference},
    abstract={No two robots are exactly the same—even for a given model of robot, different units will require slightly different controllers. Furthermore, because robots change and degrade over time, a controller will need to change over time to remain optimal. This paper leverages lifelong learning in order to learn controllers for different robots. In particular, we show that by learning a set of control policies over robots with different (unknown) motion models, we can quickly adapt to changes in the robot, or learn a controller for a new robot with a unique set of disturbances. Furthermore, the approach is completely model-free, allowing us to apply this method to robots that have not, or cannot, be fully modeled.}
    }

  • Yunshu Du, Gabriel V. de la Cruz Jr., James Irwin, and Matthew E. Taylor. Initial Progress in Transfer for Deep Reinforcement Learning Algorithms. In Proceedings of Deep Reinforcement Learning: Frontiers and Challenges workshop (at IJCAI), New York City, NY, USA, July 2016.
    [BibTeX] [Abstract] [Download PDF]

    As one of the first successful models that combines reinforcement learning technique with deep neural networks, the Deep Q-network (DQN) algorithm has gained attention as it bridges the gap between high-dimensional sensor inputs and autonomous agent learning. However, one main drawback of DQN is the long training time required to train a single task. This work aims to leverage transfer learning (TL) techniques to speed up learning in DQN. We applied this technique in two domains, Atari games and cart-pole, and show that TL can improve DQN’s performance on both tasks without altering the network structure.

    @inproceedings{2016DeepRL-Du,
    author={Du, Yunshu and de la Cruz, Jr., Gabriel V. and Irwin, James and Taylor, Matthew E.},
    title={{Initial Progress in Transfer for Deep Reinforcement Learning Algorithms}},
    booktitle={{Proceedings of Deep Reinforcement Learning: Frontiers and Challenges workshop (at {IJCAI})}},
    year={2016},
    address={New York City, NY, USA},
    month={July},
    bib2html_pubtype={Refereed Workshop or Symposium},
    abstract={As one of the first successful models that combines reinforcement learning technique with deep neural networks, the Deep Q-network (DQN) algorithm has gained attention as it bridges the gap between high-dimensional sensor inputs and autonomous agent learning. However, one main drawback of DQN is the long training time required to train a single task. This work aims to leverage transfer learning (TL) techniques to speed up learning in DQN. We applied this technique in two domains, Atari games and cart-pole, and show that TL can improve DQN’s performance on both tasks without altering the network structure.
    }
    }

  • David Isele, José Marcio Luna, Eric Eaton, Gabriel V. de la Cruz Jr., James Irwin, Brandon Kallaher, and Matthew E. Taylor. Work in Progress: Lifelong Learning for Disturbance Rejection on Mobile Robots. In Proceedings of the Adaptive Learning Agents (ALA) workshop (at AAMAS), Singapore, May 2016.
    [BibTeX] [Abstract] [Download PDF]

    No two robots are exactly the same — even for a given model of robot, different units will require slightly different controllers. Furthermore, because robots change and degrade over time, a controller will need to change over time to remain optimal. This paper leverages lifelong learning in order to learn controllers for different robots. In particular, we show that by learning a set of control policies over robots with different (unknown) motion models, we can quickly adapt to changes in the robot, or learn a controller for a new robot with a unique set of disturbances. Further, the approach is completely model-free, allowing us to apply this method to robots that have not, or cannot, be fully modeled. These preliminary results are an initial step towards learning robust fault-tolerant control for arbitrary robots.

    @inproceedings{2016ALA-Isele,
    author={Isele, David and Luna, Jos\'e Marcio and Eaton, Eric and de la Cruz, Jr., Gabriel V. and Irwin, James and Kallaher, Brandon and Taylor, Matthew E.},
    title={{Work in Progress: Lifelong Learning for Disturbance Rejection on Mobile Robots}},
    booktitle={{Proceedings of the Adaptive Learning Agents ({ALA}) workshop (at {AAMAS})}},
    year={2016},
    address={Singapore},
    month={May},
    abstract = {No two robots are exactly the same — even for a given model of robot, different units will require slightly different controllers. Furthermore, because robots change and degrade over time, a controller will need to change over time to remain optimal. This paper leverages lifelong learning in order to learn controllers for different robots. In particular, we show that by learning a set of control policies over robots with different (unknown) motion models, we can quickly adapt to changes in the robot, or learn a controller for a new robot with a unique set of disturbances. Further, the approach is completely model-free, allowing us to apply this method to robots that have not, or cannot, be fully modeled. These preliminary results are an initial step towards learning robust fault-tolerant control for arbitrary robots.}
    }

2015

  • Gabriel V. de la Cruz Jr., Bei Peng, Walter S. Lasecki, and Matthew E. Taylor. Towards Integrating Real-Time Crowd Advice with Reinforcement Learning. In The 20th ACM Conference on Intelligent User Interfaces (IUI), March 2015. Poster: 41% acceptance rate for poster submissions
    [BibTeX] [Abstract] [Download PDF] [DOI]

    Reinforcement learning is a powerful machine learning paradigm that allows agents to autonomously learn to maximize a scalar reward. However, it often suffers from poor initial performance and long learning times. This paper discusses how collecting on-line human feedback, both in real time and post hoc, can potentially improve the performance of such learning systems. We use the game Pac-Man to simulate a navigation setting and show that workers are able to accurately identify both when a sub-optimal action is executed, and what action should have been performed instead. Demonstrating that the crowd is capable of generating this input, and discussing the types of errors that occur, serves as a critical first step in designing systems that use this real-time feedback to improve systems’ learning performance on-the-fly.

    @inproceedings{2015IUI-Delacruz,
    author={de la Cruz, Jr., Gabriel V. and Peng, Bei and Lasecki, Walter S. and Taylor, Matthew E.},
    title={{Towards Integrating Real-Time Crowd Advice with Reinforcement Learning}},
    booktitle={{The 20th {ACM} Conference on Intelligent User Interfaces ({IUI})}},
    month={March},
    year={2015},
    doi={10.1145/2732158.2732180},
    note={Poster: 41% acceptance rate for poster submissions},
    wwwnote={<a href="http://iui.acm.org/2015/">ACM iUI-15</a>},
    bib2html_rescat={Reinforcement Learning, Crowdsourcing},
    bib2html_pubtype={Short Refereed Conference},
    bib2html_funding={NSF},
    abstract={Reinforcement learning is a powerful machine learning paradigm that allows agents to autonomously learn to maximize a scalar reward. However, it often suffers from poor initial performance and long learning times. This paper discusses how collecting on-line human feedback, both in real time and post hoc, can potentially improve the performance of such learning systems. We use the game Pac-Man to simulate a navigation setting and show that workers are able to accurately identify both when a sub-optimal action is executed, and what action should have been performed instead. Demonstrating that the crowd is capable of generating this input, and discussing the types of errors that occur, serves as a critical first step in designing systems that use this real-time feedback to improve systems' learning performance on-the-fly.},
    }

  • Gabriel V. de la Cruz Jr., Bei Peng, Walter S. Lasecki, and Matthew E. Taylor. Generating Real-Time Crowd Advice to Improve Reinforcement Learning Agents. In Proceedings of the Learning for General Competency in Video Games workshop (AAAI), January 2015.
    [BibTeX] [Abstract] [Download PDF]

    Reinforcement learning is a powerful machine learning paradigm that allows agents to autonomously learn to maximize a scalar reward. However, it often suffers from poor initial performance and long learning times. This paper discusses how collecting on-line human feedback, both in real time and post hoc, can potentially improve the performance of such learning systems. We use the game Pac-Man to simulate a navigation setting and show that workers are able to accurately identify both when a sub-optimal action is executed, and what action should have been performed instead. Our results demonstrate that the crowd is capable of generating helpful input. We conclude with a discussion the types of errors that occur most commonly when engaging human workers for this task, and a discussion of how such data could be used to improve learning. Our work serves as a critical first step in designing systems that use real-time human feedback to improve the learning performance of automated systems on-the-fly.

    @inproceedings{2015AAAI-Delacruz,
    title={{Generating Real-Time Crowd Advice to Improve Reinforcement Learning Agents}},
    author={de la Cruz, Jr., Gabriel V. and Peng, Bei and Lasecki, Walter S. and Taylor, Matthew E.},
    booktitle={{Proceedings of the Learning for General Competency in Video Games workshop ({AAAI})}},
    month={January},
    year={2015},
    wwwnote={<a href="http://www.arcadelearningenvironment.org/aaai15-workshop/">The Arcade Learning Environment</a>},
    bib2html_pubtype={Refereed Workshop or Symposium},
    bib2html_rescat={Reinforcement Learning, Crowdsourcing},
    bib2html_funding={NSF},
    abstract={Reinforcement learning is a powerful machine learning paradigm that allows agents to autonomously learn to maximize a scalar reward. However, it often suffers from poor initial performance and long learning times. This paper discusses how collecting on-line human feedback, both in real time and post hoc, can potentially improve the performance of such learning systems. We use the game Pac-Man to simulate a navigation setting and show that workers are able to accurately identify both when a sub-optimal action is executed, and what action should have been performed instead. Our results demonstrate that the crowd is capable of generating helpful input. We conclude with a discussion the types of errors that occur most commonly when engaging human workers for this task, and a discussion of how such data could be used to improve learning. Our work serves as a critical first step in designing systems that use real-time human feedback to improve the learning performance of automated systems on-the-fly.},
    }