Geometric deep learning

Lead Research Organisation: University of Oxford

Abstract

Many structures encountered in everyday life, such as social networks, proteins and molecules, and products sold online, are best interpreted as graphs because of the relationships between datapoints, such as the number of retweets between two users on Twitter or the type of bond between two atoms. Unlike classical deep learning, which was designed for Euclidean, 'grid-like' data, graph machine learning (a subset of geometric deep learning) takes advantage of this additional non-Euclidean structure to improve models, with applications in drug discovery, sociology and more.

This project aims to enhance graph machine learning methods with ideas from differential geometry, the study of smooth shapes and manifolds. While early works in graph machine learning largely take inspiration from generalising existing deep learning models such as convolutional neural networks, differential geometry offers an independent source of ideas, with a vast, and in this context largely unexplored, wealth of concepts and theoretical results established over many years. The first work in this project, "Understanding over-squashing and bottlenecks on graphs via curvature", uses new theoretical results on graphs based on Ricci curvature, a key concept from differential geometry, to design and evaluate a new pre-processing method on graphs that improved the performance of existing graph neural networks on all datasets tested. Future work will include designing new model architectures that use curvature to evolve how information travels across a graph while the model processes the features (a curvature-based neural network resembling a Beltrami flow), and extending existing differential equation-based graph models to use stochastic differential equations, allowing for the modelling of graph datasets that evolve randomly through time. All these directions of research are first-of-their-kind and form part of an exciting renaissance of using different geometries to expand the applications in which machine learning can be used.
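To make the link between curvature and bottlenecks concrete, the following is a minimal illustrative sketch, not the project's method. The abstract does not specify which discrete Ricci curvature or pre-processing procedure is used, so this example assumes a simpler stand-in, the augmented Forman curvature of an edge in an unweighted graph, 4 - deg(u) - deg(v) + 3 * (number of triangles containing the edge). The function name `forman_curvature` and the toy graph are hypothetical.

```python
from collections import defaultdict


def forman_curvature(edges):
    """Augmented Forman curvature per edge of a simple, unweighted graph.

    Assumption: this is a stand-in curvature chosen for illustration only.
    Formula: 4 - deg(u) - deg(v) + 3 * (#triangles containing edge (u, v)).
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    curvature = {}
    for u, v in edges:
        # Common neighbours of u and v each close one triangle on (u, v).
        triangles = len(neighbours[u] & neighbours[v])
        curvature[(u, v)] = (
            4 - len(neighbours[u]) - len(neighbours[v]) + 3 * triangles
        )
    return curvature


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
curv = forman_curvature(edges)

# The most negatively curved edge is a candidate bottleneck.
bottleneck = min(curv, key=curv.get)
print(bottleneck, curv[bottleneck])
```

In this toy graph the bridge edge receives the lowest curvature, while edges inside the triangles are positively curved. A curvature-based pre-processing step could, under these assumptions, target such edges when deciding where to modify the graph.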

This project falls within the EPSRC Geometry and Topology research area under the Mathematical Sciences theme, as the novelty of the research has a solid footing in differential geometry and geometric deep learning. It is also likely to intersect with other themes, as we aim to directly apply powerful concepts from geometry to the world of machine learning to propose new, efficient methods and models in fields such as biology and sociology.

Planned Impact

The primary CDT impact will be training 75 PhD graduates as the next generation of leaders in statistics and statistical machine learning. These graduates will lead in industry, government, health care, and academic research. They will bridge the gap between academia and industry, resulting in significant knowledge transfer to both established and start-up companies. Because this cohort will also learn to mentor other researchers, the CDT will ultimately address a UK-wide skills gap. The students will also be crucial in keeping the UK at the forefront of methodological research in statistics and machine learning.
After graduating, students will act as multipliers, educating others in advanced methodology throughout their careers. There is also a range of further impacts:
- The CDT has a large number of high calibre external partners in government, health care, industry and science. These partnerships will catalyse immediate knowledge transfer, bringing cutting edge methodology to a large number of areas. Knowledge transfer will also be achieved through internships/placements of our students with users of statistics and machine learning.
- Our Women in Mathematics and Statistics summer programme is aimed at students who could go on to apply for a PhD. This programme will inspire the next generation of statisticians and also provide excellent leadership training for the CDT students.
- The students will develop new methodology and theory in the domains of statistics and statistical machine learning. It will be relevant research, addressing the key questions behind real world problems. The research will be published in the best possible statistics journals and machine learning conferences and will be made available online. To maximize reproducibility and replicability, source code and replication files will be made available as open source software or, when relevant to an industrial collaboration, held as a patent or software copyright.

Publications

Studentship Projects

Project Reference | Relationship | Related To   | Start      | End        | Student Name
EP/S023151/1      |              |              | 01/04/2019 | 30/09/2027 |
2248365           | Studentship  | EP/S023151/1 | 01/10/2020 | 30/09/2024 | James Topping