Automatic Design of Experiments in Chemical Micro-reactors

Lead Research Organisation: Imperial College London
Department Name: Mathematics

Abstract

Experimental design concerns maximising (or minimising) an objective function by choosing experimental inputs. Bayesian Optimisation has proven effective for such problems. Data from previous queries is used to compute the posterior of a surrogate model, which is then used to select the next experiment by optimising a so-called acquisition function. However, in the classical setting, Bayesian Optimisation assumes that we select a single experiment and immediately obtain an observation.
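The classical loop described above can be sketched in a few lines. This is a minimal illustration, not the project's method: it assumes a zero-mean Gaussian process surrogate with a squared-exponential kernel, an upper-confidence-bound acquisition function maximised over a fixed grid, and a hypothetical one-dimensional `objective` standing in for a real experiment. All function names and parameter values are chosen for illustration only.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP surrogate."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_oq = rbf_kernel(x_obs, x_query)
    solved = np.linalg.solve(k_oo, k_oq)        # K_oo^{-1} K_oq
    mean = solved.T @ y_obs
    var = 1.0 - np.sum(k_oq * solved, axis=0)   # prior variance is 1
    return mean, np.sqrt(np.clip(var, 0.0, None))

def ucb(mean, std, beta=2.0):
    """Upper-confidence-bound acquisition (assumed choice, for maximisation)."""
    return mean + beta * std

def objective(x):
    """Hypothetical stand-in for an experiment; its maximum is at x = 0.3."""
    return -(x - 0.3) ** 2

def bayes_opt(n_iter=15, seed=0):
    """Select one experiment at a time and observe its result immediately."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 201)           # candidate inputs
    x_obs = rng.uniform(0.0, 1.0, size=2)       # two random initial experiments
    y_obs = objective(x_obs)
    for _ in range(n_iter):
        mean, std = gp_posterior(x_obs, y_obs, grid)
        x_next = grid[np.argmax(ucb(mean, std))]
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, objective(x_next))
    best = np.argmax(y_obs)
    return x_obs[best], y_obs[best]

if __name__ == "__main__":
    best_x, best_y = bayes_opt()
    print(f"best input: {best_x:.3f}, best value: {best_y:.5f}")
```

Each pass through the loop fits the surrogate to all data so far, picks the single input that maximises the acquisition function, and observes it before the next choice is made. That last step is exactly the assumption that breaks down when results arrive with a delay.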

Micro-reactors are changing laboratory chemistry: they allow us to carry out many experiments on the micro-scale and, as such, require automatic experimental design. Micro-droplets travel through the reactor, and each can be considered a single experiment. However, the problem brings many complications. A very important one is time delay: we will have to choose many new experiments before we receive the results of previous ones. We will also receive observations from multiple sources; some will be quick but inaccurate, while others will be accurate but slow and expensive. We will be restricted in how much we can vary our inputs, as we try to keep the chemical reaction in steady state. Other important challenges include multi-objective optimisation, input delay, and safety constraints.
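To make the time-delay complication concrete, the sketch below modifies the classical loop so that each result arrives a fixed number of steps after the experiment is chosen. It is an illustration under stated assumptions, not the method proposed in this project: pending experiments are temporarily filled in with the surrogate's own posterior mean (one common heuristic for delayed feedback), the surrogate is a zero-mean Gaussian process, and `experiment`, `run_with_delay`, and the `delay` parameter are hypothetical names.

```python
from collections import deque
import numpy as np

def kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def posterior(x_obs, y_obs, x_query, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP surrogate."""
    k_oo = kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_oq = kernel(x_obs, x_query)
    solved = np.linalg.solve(k_oo, k_oq)
    mean = solved.T @ y_obs
    var = 1.0 - np.sum(k_oq * solved, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def experiment(x):
    """Hypothetical stand-in for a droplet experiment; maximum at x = 0.3."""
    return -(x - 0.3) ** 2

def run_with_delay(n_steps=20, delay=3, beta=2.0):
    """Choose one experiment per step; each result arrives `delay` steps later."""
    grid = np.linspace(0.0, 1.0, 201)
    x_obs = np.array([0.0, 1.0])                # two initial experiments
    y_obs = experiment(x_obs)
    pending = deque()                           # (arrival_step, input) pairs
    for step in range(n_steps):
        # Collect every result that has arrived by this step.
        while pending and pending[0][0] <= step:
            _, x_done = pending.popleft()
            x_obs = np.append(x_obs, x_done)
            y_obs = np.append(y_obs, experiment(x_done))
        # Fill in still-pending experiments with the current posterior mean,
        # so the next choice is steered away from inputs already in flight.
        x_all, y_all = x_obs, y_obs
        if pending:
            x_pend = np.array([x for _, x in pending])
            y_fill, _ = posterior(x_obs, y_obs, x_pend)
            x_all = np.concatenate([x_obs, x_pend])
            y_all = np.concatenate([y_obs, y_fill])
        mean, std = posterior(x_all, y_all, grid)
        x_next = grid[np.argmax(mean + beta * std)]
        pending.append((step + delay, x_next))
    return x_obs, y_obs, [x for _, x in pending]

if __name__ == "__main__":
    xs, ys, still_pending = run_with_delay()
    print(f"{len(xs)} results observed, {len(still_pending)} still in flight")
```

Filling in pending inputs shrinks the surrogate's uncertainty around them without claiming to know their true outcomes. The sketch covers only the delay; it does not address multiple observation sources, limits on input variation, or safety constraints.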

Many of these complications have been studied in isolation. In particular, there is extensive literature on multi-fidelity and asynchronous Bayesian Optimisation. The objective of the project is to propose methods that take into account as many of these complications as possible at the same time. To provide such a method, we will have to combine and fundamentally change previously proposed methods in novel ways. An effective way of designing such experiments would allow more efficient production of chemicals and reduce waste. The impact of the project would not be restricted to chemical production. Experimental design is important in many areas, from food manufacturing to the optimisation of machine learning hyper-parameters. The research project falls at the intersection of EPSRC's themes of Artificial Intelligence and Robotics, Engineering, and Mathematical Sciences.

The research is being done in collaboration with the chemical manufacturer BASF, which is providing funding for the project. The collaboration will hopefully allow us to test any developed methods in real-life settings involving micro-reactors and other chemical experiments. It should also help us bridge the gap between the mathematical nature of the project and the chemical applications it is trying to tackle.

Planned Impact

The primary CDT impact will be training 75 PhD graduates as the next generation of leaders in statistics and statistical machine learning. These graduates will lead in industry, government, health care, and academic research. They will bridge the gap between academia and industry, resulting in significant knowledge transfer to both established and start-up companies. Because this cohort will also learn to mentor other researchers, the CDT will ultimately address a UK-wide skills gap. The students will also be crucial in keeping the UK at the forefront of methodological research in statistics and machine learning.
After graduating, students will act as multipliers, educating others in advanced methodology throughout their careers. There is a range of further impacts:
- The CDT has a large number of high calibre external partners in government, health care, industry and science. These partnerships will catalyse immediate knowledge transfer, bringing cutting edge methodology to a large number of areas. Knowledge transfer will also be achieved through internships/placements of our students with users of statistics and machine learning.
- Our Women in Mathematics and Statistics summer programme is aimed at students who could go on to apply for a PhD. This programme will inspire the next generation of statisticians and also provide excellent leadership training for the CDT students.
- The students will develop new methodology and theory in the domains of statistics and statistical machine learning. It will be relevant research, addressing the key questions behind real-world problems. The research will be published in the best possible statistics journals and machine learning conferences and will be made available online. To maximise reproducibility and replicability, source code and replication files will be made available as open source software or, when relevant to an industrial collaboration, held as a patent or software copyright.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S023151/1                                   01/04/2019  30/09/2027
2605895            Studentship   EP/S023151/1  03/10/2020  30/09/2024  Jose Folch Urroz