Intelligent Robot Learning Laboratory (IRL Lab) Zhaodong Wang


CONTACT INFORMATION:

Zhaodong Wang
PhD Student, Computer Science
Email: zhaodong.wang@wsu.edu
Office: Dana Hall 2


My Story

My name is Zhaodong Wang. I am a Ph.D. student in Computer Science and have been working with Dr. Matthew E. Taylor since 2014. I obtained my bachelor's degree in Electrical Engineering from the University of Science and Technology of China in 2014.

My Research

My research interests include Reinforcement Learning, Transfer Learning, and Real-World Robotics. I am motivated by the potential of AI and robotics techniques to change people's lives.

Current Projects

By: Zhaodong Wang and Matthew E. Taylor

Many learning methods, including reinforcement learning, suffer from slow initial learning, especially in complicated domains. The motivation of transfer learning is to use limited prior knowledge to help learning agents bootstrap at the start and thus improve overall learning performance. Because the quantity and quality of prior knowledge are limited, how to make the transfer more efficient and effective remains an open question. [1, 2]

[1] [pdf] Zhaodong Wang and Matthew E. Taylor. Improving Reinforcement Learning with Confidence-Based Demonstrations. In Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI), August 2017. 26% acceptance rate
[Bibtex]
@inproceedings{2017IJCAI-Wang,
author={Wang, Zhaodong and Taylor, Matthew E.},
title={{Improving Reinforcement Learning with Confidence-Based Demonstrations}},
booktitle={{Proceedings of the 26th International Joint Conference on Artificial Intelligence ({IJCAI})}},
month={August},
year={2017},
note={26\% acceptance rate},
bib2html_pubtype={Refereed Conference},
bib2html_rescat={Reinforcement Learning},
abstract={Reinforcement learning has had many successes, but in practice it often requires significant amounts of data to learn high-performing policies. One common way to improve learning is to allow a trained (source) agent to assist a new (target) agent. The goals in this setting are to 1) improve the target agent's performance, relative to learning unaided, and 2) allow the target agent to outperform the source agent. Our approach leverages source agent demonstrations, removing any requirements on the source agent's learning algorithm or representation. The target agent then estimates the source agent's policy and improves upon it. The key contribution of this work is to show that leveraging the target agent's uncertainty in the source agent's policy can significantly improve learning in two complex simulated domains, Keepaway and Mario.}
}
[2] [pdf] Zhaodong Wang and Matthew E. Taylor. Effective Transfer via Demonstrations in Reinforcement Learning: A Preliminary Study. In AAAI 2016 Spring Symposium, March 2016.
[Bibtex]
@inproceedings{2016AAAI-SSS-Wang,
author={Zhaodong Wang and Matthew E. Taylor},
title={{Effective Transfer via Demonstrations in Reinforcement Learning: A Preliminary Study}},
booktitle={{{AAAI} 2016 Spring Symposium}},
month={March},
year={2016},
bib2html_pubtype={Refereed Workshop or Symposium},
abstract={There are many successful methods for transferring information from one agent to another. One approach, taken in this work, is to have one (source) agent demonstrate a policy to a second (target) agent, and then have that second agent improve upon the policy. By allowing the target agent to observe the source agent's demonstrations, rather than relying on other types of direct knowledge transfer like Q-values, rules, or shared representations, we remove the need for the agents to know anything about each other's internal representation or have a shared language. In this work, we introduce a refinement to HAT, an existing transfer learning method, by integrating the target agent's confidence in its representation of the source agent's policy. Results show that a target agent can effectively 1) improve its initial performance relative to learning without transfer (jumpstart) and 2) improve its performance relative to the source agent (total reward). Furthermore, both the jumpstart and total reward are improved with this new refinement, relative to learning without transfer and relative to learning with HAT.}
}
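The core idea in [1, 2] — biasing a learner's action selection toward demonstrated actions in proportion to a confidence estimate — can be sketched very loosely as follows. This toy example is not the papers' algorithm (which summarizes demonstrations with a learned classifier and evaluates on Keepaway and Mario); the corridor domain, the count-based confidence, and all names here are invented purely for illustration.

```python
import random

# Toy corridor MDP: states 0..9, goal at state 9.
# Actions: 0 = left, 1 = right. Reward: +1 at the goal, -0.01 per step.
N_STATES, GOAL = 10, 9
BETA = 0.8  # how strongly confidence biases action selection toward the demo

def step(s, a):
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == GOAL else -0.01), s2 == GOAL

# Source "demonstrations": (state, action) pairs from a hand-coded policy
# that always moves right. Real demonstrations would come from a source agent.
demos = [(s, 1) for s in range(GOAL)] * 5

# Summarize the demonstrated policy with per-state action counts; the
# empirical action frequency serves as a crude confidence estimate.
counts = {}
for s, a in demos:
    counts.setdefault(s, [0, 0])[a] += 1

def demo_policy(s):
    """Return (suggested action, confidence) for state s, or (None, 0.0)."""
    if s not in counts:
        return None, 0.0
    c = counts[s]
    a = 0 if c[0] >= c[1] else 1
    return a, c[a] / sum(c)

def q_learn(episodes=200, use_transfer=True, alpha=0.5, gamma=0.95, eps=0.1):
    """Tabular Q-learning, optionally biased by the demonstrated policy."""
    random.seed(0)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    returns = []
    for _ in range(episodes):
        s, total = 0, 0.0
        for _ in range(50):
            a_demo, conf = demo_policy(s) if use_transfer else (None, 0.0)
            if a_demo is not None and random.random() < BETA * conf:
                a = a_demo                            # follow the demonstration
            elif random.random() < eps:
                a = random.choice([0, 1])             # explore
            else:
                a = 0 if Q[s][0] >= Q[s][1] else 1    # exploit
            s2, r, done = step(s, a)
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s, total = s2, total + r
            if done:
                break
        returns.append(total)
    return returns
```

Comparing the early-episode returns of `q_learn(use_transfer=True)` against `q_learn(use_transfer=False)` illustrates the jumpstart effect the papers measure; in states the demonstrations never visited, confidence is zero and the learner falls back entirely on its own estimates.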

Videos & Other Media:

By: Zhaodong Wang and Matthew E. Taylor

The purpose of this project is to build an intelligent multi-robot system to manage the bins used for harvest work in orchards. It involves autonomous robot navigation in the orchard environment and cooperation with human pickers. The value of this multi-robot bin-managing system lies in enabling robots to work autonomously in tough outdoor environments and in improving harvest efficiency for agricultural work. [1]

[1] [pdf] Yawei Zhang, Yunxiang Ye, Zhaodong Wang, Matthew E. Taylor, Geoffrey A. Hollinger, and Qin Zhang. Intelligent In-Orchard Bin-Managing System for Tree Fruit Production. In Proceedings of the Robotics in Agriculture workshop (at ICRA), May 2015.
[Bibtex]
@inproceedings{2015ICRA-Zhang,
author={Yawei Zhang and Yunxiang Ye and Zhaodong Wang and Matthew E. Taylor and Geoffrey A. Hollinger and Qin Zhang},
title={{Intelligent In-Orchard Bin-Managing System for Tree Fruit Production}},
booktitle={{Proceedings of the Robotics in Agriculture workshop (at {ICRA})}},
month={May},
year={2015},
bib2html_pubtype={Refereed Workshop or Symposium},
abstract={The labor-intensive nature of harvest in the tree fruit industry makes it particularly sensitive to labor shortages. Technological innovation is thus critical in order to meet current demands without significantly increasing prices. This paper introduces a robotic system to help human workers during fruit harvest. A second-generation prototype is currently being built and simulation results demonstrate potential improvement in productivity.}
}

Publications

2017

  • Zhaodong Wang and Matthew E. Taylor. Improving Reinforcement Learning with Confidence-Based Demonstrations. In Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI), August 2017. 26% acceptance rate
    [BibTeX] [Abstract] [Download PDF]

    Reinforcement learning has had many successes, but in practice it often requires significant amounts of data to learn high-performing policies. One common way to improve learning is to allow a trained (source) agent to assist a new (target) agent. The goals in this setting are to 1) improve the target agent’s performance, relative to learning unaided, and 2) allow the target agent to outperform the source agent. Our approach leverages source agent demonstrations, removing any requirements on the source agent’s learning algorithm or representation. The target agent then estimates the source agent’s policy and improves upon it. The key contribution of this work is to show that leveraging the target agent’s uncertainty in the source agent’s policy can significantly improve learning in two complex simulated domains, Keepaway and Mario.

    @inproceedings{2017IJCAI-Wang,
    author={Wang, Zhaodong and Taylor, Matthew E.},
    title={{Improving Reinforcement Learning with Confidence-Based Demonstrations}},
    booktitle={{Proceedings of the 26th International Joint Conference on Artificial Intelligence ({IJCAI})}},
    month={August},
    year={2017},
    note={26\% acceptance rate},
    bib2html_pubtype={Refereed Conference},
    bib2html_rescat={Reinforcement Learning},
    abstract={Reinforcement learning has had many successes, but in practice it often requires significant amounts of data to learn high-performing policies. One common way to improve learning is to allow a trained (source) agent to assist a new (target) agent. The goals in this setting are to 1) improve the target agent's performance, relative to learning unaided, and 2) allow the target agent to outperform the source agent. Our approach leverages source agent demonstrations, removing any requirements on the source agent's learning algorithm or representation. The target agent then estimates the source agent's policy and improves upon it. The key contribution of this work is to show that leveraging the target agent's uncertainty in the source agent's policy can significantly improve learning in two complex simulated domains, Keepaway and Mario.}
    }

  • Yunxiang Ye, Zhaodong Wang, Dylan Jones, Long He, Matthew E. Taylor, Geoffrey A. Hollinger, and Qin Zhang. Bin-Dog: A Robotic Platform for Bin Management in Orchards. Robotics, 6(2), 2017.
    [BibTeX] [Abstract] [Download PDF] [DOI]

    Bin management during apple harvest season is an important activity for orchards. Typically, empty and full bins are handled by tractor-mounted forklifts or bin trailers in two separate trips. In order to simplify this work process and improve work efficiency of bin management, the concept of a robotic bin-dog system is proposed in this study. This system is designed with a “go-over-the-bin” feature, which allows it to drive over bins between tree rows and complete the above process in one trip. To validate this system concept, a prototype and its control and navigation system were designed and built. Field tests were conducted in a commercial orchard to validate its key functionalities in three tasks including headland turning, straight-line tracking between tree rows, and “go-over-the-bin.” Tests of the headland turning showed that bin-dog followed a predefined path to align with an alleyway with lateral and orientation errors of 0.02 m and 1.5°. Tests of straight-line tracking showed that bin-dog could successfully track the alleyway centerline at speeds up to 1.00 m·s−1 with a RMSE offset of 0.07 m. The navigation system also successfully guided the bin-dog to complete the task of go-over-the-bin at a speed of 0.60 m·s−1. The successful validation tests proved that the prototype can achieve all desired functionality.

    @article{2017Robotics-Ye,
    author={Ye, Yunxiang and Wang, Zhaodong and Jones, Dylan and He, Long and Taylor, Matthew E. and Hollinger, Geoffrey A. and Zhang, Qin},
    title={{Bin-Dog: A Robotic Platform for Bin Management in Orchards}},
    journal={{Robotics}},
    volume={6},
    year={2017},
    number={2},
    url={http://www.mdpi.com/2218-6581/6/2/12},
    issn={2218-6581},
    doi={10.3390/robotics6020012},
    abstract={Bin management during apple harvest season is an important activity for orchards. Typically, empty and full bins are handled by tractor-mounted forklifts or bin trailers in two separate trips. In order to simplify this work process and improve work efficiency of bin management, the concept of a robotic bin-dog system is proposed in this study. This system is designed with a “go-over-the-bin” feature, which allows it to drive over bins between tree rows and complete the above process in one trip. To validate this system concept, a prototype and its control and navigation system were designed and built. Field tests were conducted in a commercial orchard to validate its key functionalities in three tasks including headland turning, straight-line tracking between tree rows, and “go-over-the-bin.” Tests of the headland turning showed that bin-dog followed a predefined path to align with an alleyway with lateral and orientation errors of 0.02 m and 1.5°. Tests of straight-line tracking showed that bin-dog could successfully track the alleyway centerline at speeds up to 1.00 m·s−1 with a RMSE offset of 0.07 m. The navigation system also successfully guided the bin-dog to complete the task of go-over-the-bin at a speed of 0.60 m·s−1. The successful validation tests proved that the prototype can achieve all desired functionality.}
    }

2016

  • Zhaodong Wang and Matthew E. Taylor. Effective Transfer via Demonstrations in Reinforcement Learning: A Preliminary Study. In AAAI 2016 Spring Symposium, March 2016.
    [BibTeX] [Abstract] [Download PDF]

    There are many successful methods for transferring information from one agent to another. One approach, taken in this work, is to have one (source) agent demonstrate a policy to a second (target) agent, and then have that second agent improve upon the policy. By allowing the target agent to observe the source agent’s demonstrations, rather than relying on other types of direct knowledge transfer like Q-values, rules, or shared representations, we remove the need for the agents to know anything about each other’s internal representation or have a shared language. In this work, we introduce a refinement to HAT, an existing transfer learning method, by integrating the target agent’s confidence in its representation of the source agent’s policy. Results show that a target agent can effectively 1) improve its initial performance relative to learning without transfer (jumpstart) and 2) improve its performance relative to the source agent (total reward). Furthermore, both the jumpstart and total reward are improved with this new refinement, relative to learning without transfer and relative to learning with HAT.

    @inproceedings{2016AAAI-SSS-Wang,
    author={Zhaodong Wang and Matthew E. Taylor},
    title={{Effective Transfer via Demonstrations in Reinforcement Learning: A Preliminary Study}},
    booktitle={{{AAAI} 2016 Spring Symposium}},
    month={March},
    year={2016},
    bib2html_pubtype={Refereed Workshop or Symposium},
    abstract={There are many successful methods for transferring information from one agent to another. One approach, taken in this work, is to have one (source) agent demonstrate a policy to a second (target) agent, and then have that second agent improve upon the policy. By allowing the target agent to observe the source agent's demonstrations, rather than relying on other types of direct knowledge transfer like Q-values, rules, or shared representations, we remove the need for the agents to know anything about each other's internal representation or have a shared language. In this work, we introduce a refinement to HAT, an existing transfer learning method, by integrating the target agent's confidence in its representation of the source agent's policy. Results show that a target agent can effectively 1) improve its initial performance relative to learning without transfer (jumpstart) and 2) improve its performance relative to the source agent (total reward). Furthermore, both the jumpstart and total reward are improved with this new refinement, relative to learning without transfer and relative to learning with HAT.}
    }

2015

  • Yawei Zhang, Yunxiang Ye, Zhaodong Wang, Matthew E. Taylor, Geoffrey A. Hollinger, and Qin Zhang. Intelligent In-Orchard Bin-Managing System for Tree Fruit Production. In Proceedings of the Robotics in Agriculture workshop (at ICRA), May 2015.
    [BibTeX] [Abstract] [Download PDF]

    The labor-intensive nature of harvest in the tree fruit industry makes it particularly sensitive to labor shortages. Technological innovation is thus critical in order to meet current demands without significantly increasing prices. This paper introduces a robotic system to help human workers during fruit harvest. A second-generation prototype is currently being built and simulation results demonstrate potential improvement in productivity.

    @inproceedings{2015ICRA-Zhang,
    author={Yawei Zhang and Yunxiang Ye and Zhaodong Wang and Matthew E. Taylor and Geoffrey A. Hollinger and Qin Zhang},
    title={{Intelligent In-Orchard Bin-Managing System for Tree Fruit Production}},
    booktitle={{Proceedings of the Robotics in Agriculture workshop (at {ICRA})}},
    month={May},
    year={2015},
    bib2html_pubtype={Refereed Workshop or Symposium},
    abstract={The labor-intensive nature of harvest in the tree fruit industry makes it particularly sensitive to labor shortages. Technological innovation is thus critical in order to meet current demands without significantly increasing prices. This paper introduces a robotic system to help human workers during fruit harvest. A second-generation prototype is currently being built and simulation results demonstrate potential improvement in productivity.}
    }