Research in Zero-Shot Coordination and Delay Graph Neural Networks

Lead Research Organisation: University of Oxford

Abstract

This research follows on from the two mini-projects undertaken in the first year of the AIMS CDT, in the areas of multi-agent reinforcement learning (MARL) and graph machine learning (ML).

The first project involves continued research into zero-shot coordination (ZSC) in continuous state-action spaces. The standard problem formulation in cooperative MARL is self-play, where agents are optimised to perform well with their training-time partners. However, this often leads to poor cross-play performance when agents are paired with novel partners, as agents learn 'arbitrary' conventions with no inherent meaning, or exploit symmetry breaking within games. To address this, the problem of ZSC is to train agents that cooperate optimally with partners not seen at training time.
Previous work in ZSC has focused on domains with discrete state-action spaces such as Hanabi, but no work has been done to adapt ZSC methods to continuous state-action spaces or to explore new methods in this domain. The objective of the research undertaken within my DPhil is to develop benchmark ZSC settings in continuous state-action spaces, adapt existing ZSC algorithms such as off-belief learning to continuous spaces, and develop novel algorithms.
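
The gap between self-play and cross-play can be illustrated with a toy example. The sketch below is not taken from the project: the two-lever game, the `train_self_play` stand-in and the `score` helper are all hypothetical, and a real training run is replaced by a seeded choice between two equally good conventions.

```python
import random

# Hypothetical two-action cooperative game: both players are rewarded only
# when they pick the same lever, so there are two equally good conventions.
PAYOFF = [[1.0, 0.0],
          [0.0, 1.0]]


def train_self_play(seed):
    """Stand-in for a self-play training run (an assumption, not a real learner).

    Both (0, 0) and (1, 1) are optimal joint actions, so the seed alone
    decides which convention the pair settles on.
    """
    rng = random.Random(seed)
    best = max(max(row) for row in PAYOFF)
    optimal = [(a, b) for a in range(2) for b in range(2) if PAYOFF[a][b] == best]
    return rng.choice(optimal)


def score(pair_a, pair_b):
    """Payoff when player 1 from pair_a is matched with player 2 from pair_b."""
    return PAYOFF[pair_a[0]][pair_b[1]]


pairs = [train_self_play(seed) for seed in range(10)]
n = len(pairs)

# Self-play: each agent is evaluated with its own training partner.
self_play = sum(score(p, p) for p in pairs) / n

# Cross-play: each agent is evaluated with partners from other training runs.
cross_play = sum(
    score(pairs[i], pairs[j]) for i in range(n) for j in range(n) if i != j
) / (n * (n - 1))

print(f"self-play: {self_play:.2f}, cross-play: {cross_play:.2f}")
```

Every pair scores perfectly with its own partner, while any two pairs that settled on different conventions score zero together, so the cross-play average can never exceed the self-play average in this toy game.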

The second project involves research into delay graph neural networks (GNNs). Many well-known GNN architectures can be thought of as discretisations of the diffusion partial differential equation (PDE) on graphs. Delay differential equations (DDEs) are a type of differential equation in which the derivative of the unknown function is given in terms of the value of the function at both the current time and earlier times; they are useful in many control applications involving communication networks, because delays and aftereffects are common in real-world settings. However, DDEs have not been applied in graph machine learning/geometric deep learning.
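
As a rough illustration of the discretisation view, the sketch below applies explicit Euler steps to graph diffusion, dx/dt = -L x, and to a delayed variant in which the right-hand side uses the state from a few steps earlier. It is a generic numerical sketch under stated assumptions, not the architecture this project proposes: the function names, the three-node path graph, the step size and the constant pre-history are all chosen here for illustration.

```python
def laplacian_apply(adj, x):
    """Apply the graph Laplacian L = D - A to scalar node features x."""
    n = len(x)
    return [
        sum(adj[i]) * x[i] - sum(adj[i][j] * x[j] for j in range(n))
        for i in range(n)
    ]


def diffuse(adj, x0, step, n_steps, delay=0):
    """Explicit Euler for dx/dt = -L x(t - delay * step).

    delay=0 recovers plain graph diffusion. delay=k uses the state from k
    steps earlier; the history before t=0 is assumed constant at x0.
    """
    history = [list(x0)]
    for t in range(n_steps):
        lagged = history[max(t - delay, 0)]
        lx = laplacian_apply(adj, lagged)
        history.append([xi - step * li for xi, li in zip(history[-1], lx)])
    return history[-1]


# Path graph on three nodes (0 - 1 - 2) with all the initial mass on node 0.
ADJ = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
X0 = [1.0, 0.0, 0.0]

plain = diffuse(ADJ, X0, step=0.1, n_steps=50)
delayed = diffuse(ADJ, X0, step=0.1, n_steps=50, delay=3)
print("plain:  ", [round(v, 3) for v in plain])
print("delayed:", [round(v, 3) for v in delayed])
```

Each Euler step mixes a node's value with those of its neighbours, which is the sense in which a diffusion step resembles a message-passing layer; the delayed variant lets an update depend on older states as well as the current one.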

This project intends to expand on the approach of GNNs as discretisations of PDEs by considering DDEs as GNNs. The longer-term aim is to investigate how to go beyond the message-passing framework, common in graph ML, in order to handle long-range dependencies, which message-passing neural networks often struggle with. A particular application that will be explored is modelling hardware delays resulting from parallel computation on large/web-scale graphs.
This proposal consists of machine learning research, which falls under the EPSRC research areas of engineering and information technologies. There is no explicit industry collaboration in either research area at this stage, though the ZSC work involves some collaborators at DeepMind, and the graph ML work may involve future collaboration with drug discovery or computing hardware companies.

Planned Impact

AIMS's impact will be felt across domains of acute need within the UK. We expect AIMS to benefit: UK economic performance, through start-up creation; existing UK firms, both through research and addressing skills needs; UK health, by contributing to cancer research, and quality of life, through the delivery of autonomous vehicles; UK public understanding of and policy related to the transformational societal change engendered by autonomous systems.

Autonomous systems are acknowledged by essentially all stakeholders as important to the future UK economy. PwC claim that there is a £232 billion opportunity offered by AI to the UK economy by 2030 (10% of GDP). AIMS has an excellent track record of leadership in spinout creation, and will continue to foster the commercial projects of its students, through the provision of training in IP, licensing and entrepreneurship. With the help of Oxford Sciences Innovation (investment fund) and Oxford University Innovation (technology transfer office), student projects will be evaluated for commercial potential.

AIMS will also concretely contribute to UK economic competitiveness by meeting the UK's needs for experts in autonomous systems. To meet this need, AIMS will train cohorts with advanced skills that span the breadth of AI, machine learning, robotics, verification and sensor systems. The relevance of the training to the needs of industry will be ensured by the industrial partnerships at the heart of AIMS. These partnerships will also ensure that AIMS will produce research that directly targets UK industrial needs. Our partners span a wide range of UK sectors, including energy, transport, infrastructure, factory automation, finance, health, space and other extreme environments.

The autonomous systems that AIMS will enable also offer the prospect of epochal change in the UK's quality of life and health. As put by former Digital Secretary Matt Hancock, "whether it's improving travel, making banking easier or helping people live longer, AI is already revolutionising our economy and our society." AIMS will help to realise this potential through its delivery of trained experts and targeted research. In particular, two of the four Grand Challenge missions in the UK Industrial Strategy highlight the positive societal impact underpinned by autonomous systems. The "Artificial Intelligence and data" challenge has as its mission to "Use data, Artificial Intelligence and innovation to transform the prevention, early diagnosis and treatment of chronic diseases by 2030". To this mission, AIMS will contribute the outputs of its research pillar on cancer research. The "Future of mobility" challenge highlights the importance that autonomous vehicles will have in making transport "safer, cleaner and better connected." To this challenge, AIMS offers the world-leading research of its robotic systems research pillar.

AIMS will further promote the positive realisation of autonomous technologies through direct influence on policy. The world-leading academics amongst AIMS's supervisory pool are well connected to policy formation; for example, Prof Osborne serves as a Commissioner on the Independent Commission on the Future of Work. Further, Dr Dan Mawson, Head of the Economy Unit, Economy and Strategic Analysis Team at BEIS, will serve as an advisor to AIMS, ensuring bidirectional influence between policy objectives and AIMS research and training.

Broad understanding of autonomous systems is crucial in making a society robust to the transformations they will engender. AIMS will foster such understanding through its provision of opportunities for AIMS students to directly engage with the public. Given the broad societal importance of getting autonomous systems right, AIMS will deliver core training on the ethical, governance, economic and societal implications of autonomous systems.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S024050/1       -             -             01/10/2019  31/03/2028  -
2579030            Studentship   EP/S024050/1  01/10/2021  31/12/2025  Benjamin Gutteridge