Representation Learning and Optimisation in Function Space for Deep Learning

Lead Research Organisation: University of Bristol
Department Name: Mathematics

Abstract

Deep neural networks (DNNs) are powerful machine learning models that excel at learning high-level representations for complex tasks. However, the process by which DNNs learn these representations across layers remains poorly understood. A key theoretical approach to studying DNNs is through infinite-width limits, which highlight connections to kernel methods. Traditional infinite-width limits, however, eliminate representation learning.
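The connection between infinite-width limits and kernel methods can be made concrete with a small Monte Carlo sketch. The function name and setup below are illustrative (not from the project itself): as the hidden layer grows, a randomly initialised ReLU network's inner products converge to a fixed kernel determined purely by the inputs, which is exactly the sense in which traditional limits eliminate representation learning.

```python
import numpy as np

def mc_nngp_kernel(x1, x2, width=10000, seed=0):
    """Monte Carlo estimate of the one-hidden-layer ReLU NNGP kernel.

    As width -> infinity, a randomly initialised network converges to a
    Gaussian process whose kernel depends only on the inputs themselves,
    so the hidden representation is fixed: no representation learning.
    """
    rng = np.random.default_rng(seed)
    d = x1.shape[0]
    # Random first-layer weights, scaled 1/sqrt(d) so the limit exists.
    W = rng.normal(0.0, 1.0 / np.sqrt(d), size=(width, d))
    h1 = np.maximum(W @ x1, 0.0)  # random ReLU features of x1
    h2 = np.maximum(W @ x2, 0.0)  # random ReLU features of x2
    # Empirical kernel: average product over hidden units.
    return float(h1 @ h2) / width

x1 = np.array([1.0, 0.0])
x2 = np.array([0.5, 0.5])
k = mc_nngp_kernel(x1, x2)  # approaches the closed-form arc-cosine kernel
```

Increasing `width` drives the estimate towards a deterministic value: the network's behaviour is then summarised entirely by a kernel, with no dependence on any particular draw of features.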
This project explores novel approaches to understanding and improving deep learning through the lens of function space and representation learning. Building on the Deep Kernel Machine (DKM) framework, which retains representation learning in the infinite-width limit, we develop extensions to more complex architectures like convolutional networks. Our work on convolutional DKMs demonstrates state-of-the-art performance for kernel methods on image classification tasks, narrowing the gap with neural networks.
We also investigate optimization dynamics in function space rather than parameter space. By developing efficient methods to measure and control function-space learning rates, we gain new insights into optimizer behavior and enable improved hyperparameter transfer across model scales. Our Function-space Learning Rate Matching (FLeRM) approach allows hyperparameters optimized on small models to be effectively transferred to larger architectures.
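The idea of a function-space learning rate can be sketched in a few lines. This is a deliberate simplification for illustration, not the published FLeRM method: it measures how far one optimiser step moves the model's outputs on a batch of inputs, then rescales the parameter-space learning rate so that movement hits a target. All names below are hypothetical.

```python
import numpy as np

def function_space_step_size(f, theta, grad, lr, X):
    """RMS change in the model's outputs on inputs X caused by one
    gradient step of size lr (a function-space, rather than
    parameter-space, notion of step size)."""
    before = f(theta, X)
    after = f(theta - lr * grad, X)
    return np.linalg.norm(after - before) / np.sqrt(len(X))

def match_function_space_lr(f, theta, grad, lr0, X, target):
    """Rescale the parameter-space learning rate so the induced
    function-space change matches `target`, using the fact that for
    small steps the output change is roughly linear in lr."""
    moved = function_space_step_size(f, theta, grad, lr0, X)
    return lr0 * target / moved

# Toy linear model: f(theta, xs) = xs @ theta.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))
theta = rng.normal(size=4)
grad = rng.normal(size=4)
f = lambda th, xs: xs @ th

lr = match_function_space_lr(f, theta, grad, lr0=1e-2, X=X, target=0.1)
```

Because the rescaled `lr` is pinned to a target output movement rather than a raw parameter step, the same target can in principle be reused across models of different sizes, which is the intuition behind hyperparameter transfer described above.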
This research contributes to both the theoretical understanding of representation learning in deep models and practical algorithms for improving neural network training and scaling. By bridging kernel methods and modern deep learning techniques, we aim to enhance the flexibility and performance of AI models across diverse applications.

Yang et al. (2021) introduced a new type of infinite-width limit that retains representation learning, called the Deep Kernel Machine. It is the first entirely kernel-based deep learning method to offer flexibility comparable to DNNs. Kernel-based methods operate on the similarities between data points, which is fundamentally different from the feature-based regime used by neural networks, where learning acts directly on feature values computed for each data point. This new development shows promise, but the original paper only considers an analogue of simple fully-connected neural networks. Many interesting application domains rely on more complicated neural network architectures, such as convolutional neural networks for image tasks, or transformer and recurrent networks for sequential tasks like natural language processing.
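The similarity-based regime can be made concrete with kernel ridge regression, the simplest kernel method: predictions are weighted combinations of training targets, with weights determined entirely by kernel (similarity) values between data points, and no explicit feature vectors anywhere. A minimal sketch (illustrative only, not the deep kernel machine itself; all names hypothetical):

```python
import numpy as np

def rbf(A, B, ls=0.3):
    """RBF similarity between every pair of rows in A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * ls ** 2))

def kernel_ridge_predict(K_train, y_train, K_test, reg=1e-3):
    """Predict from similarities alone: solve for combination weights
    alpha, then mix training targets according to test/train similarity."""
    n = K_train.shape[0]
    alpha = np.linalg.solve(K_train + reg * np.eye(n), y_train)
    return K_test @ alpha

rng = np.random.default_rng(0)
Xtr = rng.uniform(-1.0, 1.0, size=(50, 1))
ytr = np.sin(3.0 * Xtr[:, 0])
Xte = np.array([[0.0]])
pred = kernel_ridge_predict(rbf(Xtr, Xtr), ytr, rbf(Xte, Xtr))
```

Note that the model never sees `Xtr` directly, only the Gram matrices `rbf(Xtr, Xtr)` and `rbf(Xte, Xtr)`; a deep kernel machine can be thought of as stacking and adapting such similarity representations across layers.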

The goal of this project is to develop new extensions to the deep kernel machine literature (and perhaps deep kernel methods more generally) to create deep kernel machine analogues of these more complicated architectures, e.g. a convolutional deep kernel machine. Doing so will provide both a better theoretical understanding of how neural networks learn, which is important for explainable AI, and new practical algorithms for supervised learning tasks.

References:

Yang, A., Robeyns, M., Milsom, E., Schoots, N., & Aitchison, L. (2021). A theory of representation learning in deep neural networks gives a deep generalisation of kernel methods.

https://www.ukri.org/what-we-offer/browse-our-areas-of-investment-and-support/artificial-intelligence-technologies/

Planned Impact

The COMPASS Centre for Doctoral Training will have the following impact.

Doctoral Students Impact.

I1. Recruit and train over 55 students and provide them with a broad and comprehensive education in contemporary Computational Statistics & Data Science, leading to the award of a PhD. The training environment will be built around multilevel cohorts: a variety of group sizes and activities, within and across year groups and disciplinary boundaries, with internal and external partners, where statistics and computation are the common focus while remaining sensitive to disciplinary needs. Our novel doctoral training environment will have a powerful impact on students, opening their eyes not only to a range of modern technical opportunities, but also to the power of team-working with people from a range of backgrounds to solve the most important problems of the day. They will learn to apply their skills to achieve impact through collaborative working with internal and external partners, for example via our Rapid Response Teams, Policy Workshops & Statistical Clinics.

I2. As well as advanced training in computational statistics and data science, our students will benefit from exposure to, and training in, important cognate topics such as ethics, responsible innovation, equality, diversity and inclusion, policy, effective communication and dissemination, enterprise, impact and consultancy skills. It is vital for our students to understand that their training will enable them to have a powerful impact on the wider world: for example, AI algorithms they develop should not be discriminatory, statistical methodologies should be reproducible, and statistical results should be accurately and comprehensibly communicated to the general public and policymakers.

I3. The students will gain experience via collaborations with academic partners within the University in cognate disciplines, and a wide range of external industrial & government partners. The students will be impacted by the structured training programmes of the UK Academy of Postgraduate Training in Statistics, the Bristol Doctoral College, the Jean Golding Institute, the Alan Turing Institute and the Heilbronn Institute for Mathematical Sciences, which will be integrated into our programme.

I4. Having received excellent training, the students will then have a powerful impact on the world in their future careers, spreading excellence.

Impact on our Partners & ourselves.

I5. Direct impacts will be achieved by students engaging with, and working on projects with, our academic partners, on discipline-specific problems arising in engineering, education, medicine, economics, earth sciences, life sciences and geographical sciences, and with our external partners Adarga, the Atomic Weapons Establishment, CheckRisk, EDF, GCHQ, GSK, the Office for National Statistics, Sciex, Shell UK, Trainline and the UK Space Agency. Working with these partners, the students will deliver a wide range of innovations, attract engagement from new partners, and often create attractive future employment matches for students and partners alike.

Wider Societal Impact

I6. COMPASS will greatly benefit the UK by providing over 55 highly trained PhD graduates in an area suffering from extreme, well-documented shortages in the national people pipeline. COMPASS CDT graduates will be equipped for jobs in sectors of high economic value and national priority, including data science, analytics, pharmaceuticals, security, energy, communications, government, and indeed any research lab that deals with data. Through their training, they will enable these organisations to make well-informed, statistically principled decisions that maximise their international competitiveness and contribution to societal well-being. COMPASS will also have a positive impact on the wider student community, both now and sustainably into the future.

Publications


Studentship Projects

Project Reference   Relationship   Related To     Start        End          Student Name
EP/S023569/1                                      31/03/2019   29/09/2027
2593163             Studentship    EP/S023569/1   30/09/2021   29/09/2025   Edward Milsom