Regularisation theory in the data driven setting
Lead Research Organisation: University of Bath
Department Name: Mathematical Sciences
Abstract
Inverse problems deal with the reconstruction of some quantity of interest from indirectly measured data. A typical example is medical imaging, where there is no direct access to the quantity of interest (the inside of the patient's body) and imaging techniques, such as X-ray imaging and magnetic resonance imaging (MRI), are used instead. The classical approach to inverse problems uses models that describe the physics of the measurement. For example, in X-ray imaging this model would describe how X-rays pass through the body. In the era of big data, however, it is becoming increasingly popular not to model the physics but instead to use vast amounts of data that relate known images to the corresponding measurements.
The theory of such data driven methods, however, is not yet well developed. It is not well understood under which conditions on the training data such methods are stable with respect to small changes in the measurement, or how well they adapt to images that differ from the training images. It is important to understand this, since otherwise the reconstruction algorithm can miss important features of the image if they were not present in the training set, such as tumours at previously unseen locations.
In this project I will extend the state-of-the-art model based theory to this data driven setting. I will study under which conditions data driven methods can achieve regularisation, i.e. when they can stably solve an otherwise unstable problem. This will make it easier to analyse the stability of data driven reconstruction methods and help develop novel, stable data driven inversion methods with mathematical guarantees. I will also collaborate with the National Physical Laboratory and the Department of Chemical Engineering and Biotechnology in Cambridge on applications of my methods in imaging, to reduce the time needed to acquire an image and make the reconstructions more reliable.
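As a purely illustrative sketch of what "stably solving an otherwise unstable problem" means, the following example uses classical, model based Tikhonov regularisation on a toy deblurring problem. It is not one of the data driven methods studied in the project; the Gaussian blur operator, the noise level, the value of alpha and all function names are assumptions chosen for illustration only.

```python
import numpy as np


def blur_matrix(n, width=0.03):
    """Hypothetical forward operator: a discretised Gaussian blur on [0, 1]."""
    t = np.linspace(0.0, 1.0, n)
    kernel = np.exp(-((t[:, None] - t[None, :]) ** 2) / (2.0 * width**2))
    # Normalise each row so that the blur preserves constant signals.
    return kernel / kernel.sum(axis=1, keepdims=True)


def tikhonov(A, y, alpha):
    """Minimise ||A x - y||^2 + alpha * ||x||^2 via the normal equations."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ y)


rng = np.random.default_rng(0)
n = 100
A = blur_matrix(n)

# Assumed ground truth: a piecewise constant signal (a 1D "image").
t = np.linspace(0.0, 1.0, n)
x_true = ((t > 0.3) & (t < 0.6)).astype(float)

# Indirect measurement with a small amount of additive noise.
y_noisy = A @ x_true + 1e-3 * rng.standard_normal(n)

# Naive inversion: least squares without any regularisation.
x_naive = np.linalg.lstsq(A, y_noisy, rcond=None)[0]

# Regularised inversion: alpha trades data fit against stability.
x_reg = tikhonov(A, y_noisy, alpha=1e-3)

err_naive = np.linalg.norm(x_naive - x_true)
err_reg = np.linalg.norm(x_reg - x_true)
print(f"error without regularisation: {err_naive:.3e}")
print(f"error with regularisation:    {err_reg:.3e}")
```

In this toy setting the blur operator is severely ill-conditioned, so the unregularised solution amplifies the measurement noise, while the regularised one stays close to the true signal. The questions raised in the abstract concern when reconstruction methods learned from training data can offer comparable stability.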
Planned Impact
Inverse problems arise whenever directly accessing the quantities of interest is impossible and indirect measurements have to be used. Such problems are ubiquitous in science and technology, from microscopy to astronomy, from medical imaging to Earth exploration to non-destructive testing to airport security screening and so on.
Due to the availability of large amounts of domain-specific training data, the paradigm in many applications is shifting towards methods that rely on learning rather than careful mathematical modelling of the measurement process. However, mathematical understanding of such methods is far from being complete. The aim of this project is to reduce this gap by extending the state-of-the-art model-based inverse problems theory to the data driven setting. As a result, we will have a better understanding of data driven approaches to inverse problems and will have novel methods with improved stability properties and solid mathematical guarantees.
This will have impact on the following areas.
- Policy
Theoretical understanding of data driven methods will help shape policy on the use of such methods in applications that are sensitive from the societal point of view, such as medical imaging used for diagnosis or airport security screening. I will work with Cambridge based charities that help design policy on emerging technologies in healthcare and other areas.
- Developing National Standards
In this project I will collaborate with the National Physical Laboratory, the UK's National Metrology Institute, which is responsible for developing and maintaining measurement standards. Our collaboration will help, in the long run, to design standards for the interpretation of indirect measurements using data driven approaches.
- Industry
Part of this project is applying the developed methods in image reconstruction. Problems of this type occur in many areas of technology, such as material manufacturing and the energy sector. I will collaborate with the Department of Chemical Engineering and Biotechnology to speed up acquisition times in magnetic resonance imaging and so enable imaging of faster processes.
- Public Engagement and Outreach
An important aspect of this project is promoting mathematics to a wider audience through public engagement projects, such as the Cambridge Science Festival, and outreach, e.g., publishing articles in popular science journals.
- Academic Impact
The project will have impact on several fields of research, including imaging science and other areas that use imaging, such as chemical engineering and biomedical sciences. It will also impact data science more broadly, alongside the specialist field of inverse problems. Research will be published in high quality specialist and interdisciplinary journals and disseminated through international conferences and workshops. Prototype software and preprints of academic papers will be made available on public repositories.
- Teaching
The results obtained in this project will be integrated into a course taught in Cambridge. Summer research projects and other research projects related to this fellowship will be offered to students, who will benefit from participating in cutting edge research.
Publications
Bungert L
(2022)
Eigenvalue problems in L^∞: optimality conditions, duality, and relations with optimal transport
in Communications of the American Mathematical Society
Korolev Y
(2022)
Two-Layer Neural Networks with Values in a Banach Space
in SIAM Journal on Mathematical Analysis
Description | Working on this project I realised that the main difficulty in using machine learning methods (more specifically, neural networks) in inverse problems is that neural networks need to be understood as functions acting between abstract infinite-dimensional spaces. However, very limited theory was available on the subject. I wrote one paper studying neural networks from this viewpoint, but it was clear that the problem is much larger. I organised a minisymposium on this topic at the biggest international conference in applied mathematics in 2023 and am organising a specialist workshop in 2024. I started new collaborations and have ongoing work in this direction. Data-driven model correction in inverse problems also emerged as an important theme. This is the task of combining mathematical equations with training data to model the process of data acquisition in an inverse problem. It turned out that there is a delicate interplay between the way the neural networks are trained and the kind of algorithms for solving the inverse problem in which these networks will be used. I started a collaboration with researchers working on model correction in imaging inverse problems and we were able to develop methods that reduced the required training time from several days to a couple of hours. |
Exploitation Route | This project has a significant theoretical component, so a lot of what has been produced will influence further research on machine learning and inverse problems. The scientific events I organised on the subject will also stimulate research in the area, facilitate collaboration, and strengthen the UK's position as an important player in machine learning research. The data-driven model correction methods described above have the potential to enter clinical practice in biomedical imaging, but this is a very long process. However, time constraints are a major bottleneck in applying novel image reconstruction methods in clinical practice, and the significant reduction in training time that our work achieved is an important step in facilitating the adoption of these methods. |
Sectors | Education; Healthcare; Manufacturing, including Industrial Biotechnology; Pharmaceuticals and Medical Biotechnology |
Description | Machine Learning in Infinite Dimensions (Scheme 1 Conference Grant) |
Amount | £2,750 (GBP) |
Funding ID | 12332 |
Organisation | London Mathematical Society |
Sector | Academic/University |
Country | United Kingdom |
Start |
Title | Improved data-driven operator correction method for imaging inverse problems |
Description | Together with my collaborators from Oulu and Warwick, I proposed a data-driven operator correction method for imaging inverse problems that significantly reduces the training time (from several days to about an hour). |
Type Of Material | Computer model/algorithm |
Year Produced | 2024 |
Provided To Others? | No |
Impact | We are currently finalising the paper and it will be submitted soon. Once the paper has been submitted, a copy will be put on arXiv and will be freely available to others. A book chapter on this problem will be published soon. |
Description | Data driven operator correction |
Organisation | University of Oulu |
Country | Finland |
Sector | Academic/University |
PI Contribution | I contribute my expertise in inverse problems and machine learning. |
Collaborator Contribution | My collaborators contribute their expertise in machine learning and medical imaging, experimental data, and computing resources. |
Impact | We designed methods that significantly reduced the training time for data-driven operator correction methods in imaging inverse problems (from several days to about an hour). |
Start Year | 2021 |
Description | Data driven operator correction |
Organisation | University of Warwick |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | I contribute my expertise in inverse problems and machine learning. |
Collaborator Contribution | My collaborators contribute their expertise in machine learning and medical imaging, experimental data, and computing resources. |
Impact | We designed methods that significantly reduced the training time for data-driven operator correction methods in imaging inverse problems (from several days to about an hour). |
Start Year | 2021 |
Description | Image reconstruction in light microscopy |
Organisation | Medical Research Council (MRC) |
Department | MRC Laboratory of Molecular Biology (LMB) |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | I contribute my expertise in inverse problems, imaging and machine learning. I co-supervised students working with us on this project. |
Collaborator Contribution | My collaborators contribute their expertise in biomedical imaging. They also provide experimental data and computing facilities. |
Impact | This is a multidisciplinary collaboration with biologists (MRC Laboratory of Molecular Biology) and microscopists (Cambridge Advanced Imaging Centre). The collaboration started in 2018 but has been growing in depth, especially since the start of this project. We proposed a new method for improving light-sheet microscopy images, described in the Research Datasets, Databases & Models section. We also obtained funding for a research software engineer to adapt our methods to exascale computing (Exascale Computing Algorithms & Infrastructures Benefiting UK Research scheme of the EPSRC). As a holder of a postdoctoral fellowship, I was not eligible to be a CoI on the grant. |
Start Year | 2018 |
Description | Image reconstruction in light microscopy |
Organisation | University of Cambridge |
Department | Cambridge Advanced Imaging Centre |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | I contribute my expertise in inverse problems, imaging and machine learning. I co-supervised students working with us on this project. |
Collaborator Contribution | My collaborators contribute their expertise in biomedical imaging. They also provide experimental data and computing facilities. |
Impact | This is a multidisciplinary collaboration with biologists (MRC Laboratory of Molecular Biology) and microscopists (Cambridge Advanced Imaging Centre). The collaboration started in 2018 but has been growing in depth, especially since the start of this project. We proposed a new method for improving light-sheet microscopy images, described in the Research Datasets, Databases & Models section. We also obtained funding for a research software engineer to adapt our methods to exascale computing (Exascale Computing Algorithms & Infrastructures Benefiting UK Research scheme of the EPSRC). As a holder of a postdoctoral fellowship, I was not eligible to be a CoI on the grant. |
Start Year | 2018 |
Description | Articles in the Plus magazine |
Form Of Engagement Activity | A magazine, newsletter or online publication |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Media (as a channel to the public) |
Results and Impact | I co-authored articles in the Plus magazine, including "What is deep learning?" and "What is machine learning?". We were later approached by someone from industry who was looking for introductory material on machine learning to present to their colleagues. |
Year(s) Of Engagement Activity | 2024 |
URL | https://plus.maths.org/content/what-deep-learning |
Description | CIMPA summer school |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | I was asked to give a series of lectures on applications of functional analysis in machine learning at a CIMPA (Centre International de Mathématiques Pures et Appliquées) summer school in Tunisia. The main audience is students from low-income countries. This summer school will help these students from disadvantaged backgrounds learn about modern machine learning and improve their chances of highly skilled employment or further study. |
Year(s) Of Engagement Activity | 2024 |
Description | Somerscience Festival |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Public/other audiences |
Results and Impact | I took part in the Somerscience Festival where my colleagues and I organised an interactive stand with activities related to mathematics and machine learning. |
Year(s) Of Engagement Activity | 2023 |
URL | https://somerscience.co.uk/somerscience-festival-2023-highlights/ |