Learning long chain of actions through imitation

Lead Research Organisation: University of Bristol
Department Name: Aerospace Engineering

Abstract

I hope that in the future we will have teams of robots working together to solve complex tasks such as building, surgery or serving dinner. Currently, such tasks are carried out by humans organised in a form of hierarchy, where someone in charge sets tasks and makes plans. I envision this process as being split into two parts:
- asynchronous job coordination
- task completion
During my PhD, I have chosen to focus on the latter, as coordinating a job already requires a system capable of undertaking a task. Based on the work from my Master's degree and the literature explored in this area [1, 2], I believe this can be achieved using Hierarchical Reinforcement Learning (HRL). The ultimate goal is to create a system capable of learning multiple skills that can be used to solve several problems sharing common features.
This can be further split into two parts:
- skill selector (high level)
- skill performer (low level)
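As a minimal sketch of this two-level split, the toy example below pairs a high-level selector with low-level skill performers. It is an illustration only, not the method developed in the project: the symbolic state, the skill names, the fixed action sequences and the hand-written selection rule are all assumptions standing in for learned policies.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Toy symbolic state (an assumption for illustration, not a real robot state).
State = Dict[str, bool]


@dataclass
class Skill:
    """Low-level skill performer: a fixed action sequence standing in for a learned policy."""
    name: str
    actions: List[str]
    effect: Callable[[State], None]  # toy world update applied once the skill completes


# Hypothetical skill library.
SKILLS: Dict[str, Skill] = {
    "open_drawer": Skill("open_drawer", ["reach_handle", "pull"],
                         lambda s: s.update(drawer_open=True)),
    "pick_and_place": Skill("pick_and_place", ["grasp_object", "move_to_drawer", "release"],
                            lambda s: s.update(object_in_drawer=True)),
    "close_drawer": Skill("close_drawer", ["reach_handle", "push"],
                          lambda s: s.update(drawer_open=False)),
}


def select_skill(state: State, skills: Dict[str, Skill]) -> Optional[Skill]:
    """High-level skill selector: a hand-written rule standing in for a learned policy."""
    if not state["object_in_drawer"]:
        if state["drawer_open"]:
            return skills["pick_and_place"]
        return skills["open_drawer"]
    if state["drawer_open"]:
        return skills["close_drawer"]
    return None  # object stored and drawer closed: nothing left to do


def run_episode(state: State, skills: Dict[str, Skill],
                max_steps: int = 10) -> List[Tuple[str, str]]:
    """Alternate between selecting a skill and letting it run to completion."""
    trace: List[Tuple[str, str]] = []
    for _ in range(max_steps):
        skill = select_skill(state, skills)
        if skill is None:
            break
        for action in skill.actions:
            trace.append((skill.name, action))
        skill.effect(state)
    return trace


state = {"drawer_open": False, "object_in_drawer": False}
for skill_name, action in run_episode(state, SKILLS):
    print(f"{skill_name}: {action}")
```

In a learned system, both the selection rule and the per-skill action sequences would be replaced by trained policies; the sketch only shows how control passes between the two levels.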
Current methods succeed at learning complex skills by making use of demonstrations [3]. A question I would like to answer is: can multiple skills be learned and applied concurrently? This is inspired by how neuroscience describes the functioning of the human brain. More specifically, people benefit from learning multiple, similar tasks in parallel, rather than in isolation. Therefore, the new framework would replace the need for demonstrations or exhaustive exploration with knowledge shared between tasks.
Once a model has been trained on a collection of skills, learning to apply them to multiple new tasks can be investigated. Recent articles in this area showcase models capable of learning one new task [4] or merely navigating the multitude of tasks demonstrated [1]. However, I would like to investigate learning multiple previously unseen tasks, using the same inspiration as before. The major difference would be in the way HRL is used, as I believe a different architecture would be required.
Finally, I would like to apply the resulting methods to a real-world application. This will be achieved by using a robotic arm to solve a toy example that requires multiple skills, such as opening and closing a drawer and picking and placing an object. Example problems include tidying up or fetching a hidden object.

Planned Impact

FARSCOPE-TU will deliver a step change in UK capabilities in robotics and autonomous systems (RAS) by elevating technologies from niche to ubiquity. It meets the critical need for advanced RAS, placing the UK in prime position to capture a significant proportion of the estimated $18bn global market in advanced service robotics. FARSCOPE-TU will provide an advanced training network in RAS, pump priming a generation of professional and adaptable engineers and leaders who can integrate fundamental and applied innovation, thereby making impact across all the "four nations" in EPSRC's Delivery Plan. Specifically, it will have significant immediate and ongoing impact in the following six areas:
1. Training: The FARSCOPE-TU coherent strategy will deliver five cohorts trained in state-of-the-art RAS research, enterprise, responsible innovation and communication. Our students will be trained with wide knowledge of all robotics, and deep specialist skills in core domains, all within the context of the 'innovation pipeline', meeting the need for 'can-do' research engineers, unafraid to tackle new and emergent technical challenges. Students will graduate as future thought leaders, ready for deployment across UK research and industrial innovation.
2. Partner and industrial impact: The FARSCOPE-TU programme has been designed in collaboration with our industrial and end-user partners, including: DSTL; Thales; Atkins; Toshiba; Roke Manor Research; Network Rail; BT; National Nuclear Lab; AECOM; RNTNE Hospital; Designability; Bristol Heart Inst.; FiveAI; Ordnance Survey; TVS; Shadow Robot Co.; React AI; RACE (part of UKAEA) and Aimsun. Partners will deliver context and application-oriented training direct to the students throughout the course, ensuring graduates are perfectly placed to transition into their businesses and deliver rapid impact.
3. RAS community: FARSCOPE-TU will act as a multidisciplinary centre in robotics and autonomous systems for the whole RAS community, provide an inclusive model for future research and training centres and bring new opportunities for networking between other centres. These include a joint annual conference with other RAS CDTs and training exchanges. FARSCOPE-TU will generate significant international exposure within and beyond the RAS community, including major robotics events such as ICRA and IROS, and will interface directly with the UK-RAS network.
4. Societal Impact: FARSCOPE-TU will promote an informed debate on the adoption of autonomous robotics in society, cutting through hype and fear while promoting the highest levels of ethics and safety. All students will design and deliver public engagement events to schools and the public, generating knock-on impact in two ways: greater STEM uptake enhances future economic potential, and greater awareness makes people better users of robots, amplifying societal benefits.
5. Economic impact: FARSCOPE-TU will not only train cohorts in fundamental and applied research but will also demonstrate how to bridge the "technology valley of death" between lower and higher TRL. This will enable students to exploit their ideas in technology incubators (incl. BRL incubator, SetSquared and EngineShed) and through IP protection. FARSCOPE-TU's vision of ubiquitous robotics will extend its impact across all UK industrial and social sectors, from energy suppliers, transport and agriculture to healthcare, aging and human-machine interaction. It will pump-prime ubiquitous UK robotics, inspiring and enabling myriad new businesses and economic and social impact opportunities.
6. Long-term Impact: FARSCOPE-TU will have long-term impact beyond the funded lifetime of the Centre through an alumni network, enabling knowledge exchange and networking between current and past students, and with partners and research groups. FARSCOPE-TU will have a significant positive impact on the 80-strong non-CDT postgraduate student body in BRL, extending best practice in supervision and training.

Studentship Projects

Project Reference | Relationship | Related To   | Start      | End        | Student Name
EP/S021795/1      |              |              | 01/10/2019 | 31/03/2028 |
2260382           | Studentship  | EP/S021795/1 | 01/10/2019 | 15/09/2023 | Mihai Anca