Robust machine learning: algorithms, numerical analysis and efficient software

Lead Research Organisation: University of Edinburgh
Department Name: Sch of Mathematics

Abstract

In today's data-driven world there is an ever-increasing need for statistical methods to analyze information, interpret and classify data, or make long-term predictions based on the present and the past. An important category of those methods is Machine Learning (ML), in which given data is fed into an algorithm so that the algorithm can then perform statistical inference on unseen data. The heart of any ML method is an algorithm that is able to extract the important features of the given data (the so-called training data), automatically choose suitable parameters of a prescribed mathematical model, and output correct and useful statements about unseen data.

A very powerful and thus popular class of ML methods is Deep Neural Networks (DNNs), which consist of several layers of "neurons", each of which takes data as input, transforms it in a certain way, and sends the result to neurons in the next layer. For example, DNNs can be used for image classification problems: an image of an animal, e.g. a panda, is fed as input into the network, and the output is supposed to classify the animal correctly, i.e. output the class "Panda".
Even though DNNs, due to their complexity, are able to reach high accuracy on even very difficult classification or regression problems, they are not without flaws, one of them being their lack of robustness to perturbations of the input data. Take for example the image of the panda and assume that it is correctly classified by the network. If this image is now changed in a way that is barely noticeable to the human eye (i.e. a human being would still recognize the exact same panda), there is no guarantee that the network still outputs the class "Panda" instead of "Camel" or "Giraffe". This lack of robustness can lead to catastrophic scenarios in applications such as autonomous driving or medical imaging, which ultimately limits the applicability of these methods.

This project serves as an introduction to novel training algorithms for DNNs using constrained Langevin dynamics (stochastic differential equations known from the realm of physics). Incorporating stochastic noise into the training procedure, these methods aim to find network configurations that are more robust in the mentioned sense, and therefore more suited for applications in the real world. The project introduces the theoretical foundations and provides examples in the form of computational experiments.
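
As a rough illustration of the idea of noisy, constrained training, the sketch below applies one discretized overdamped Langevin update (gradient step plus Gaussian noise) to a toy two-parameter problem and then projects the parameters back onto a sphere. Everything here is an assumption made for illustration: the function names, the quadratic toy loss, the sphere constraint, the step size and the temperature are hypothetical and are not taken from the project, whose actual constraints and integrators are not specified in this abstract.

```python
import math
import random


def constrained_langevin_step(w, grad, step, temperature, radius, rng):
    """One Euler-Maruyama step of overdamped Langevin dynamics,
    followed by projection onto the sphere of the given radius.

    The sphere is a stand-in constraint chosen for simplicity (assumption).
    """
    g = grad(w)
    noise_scale = math.sqrt(2.0 * step * temperature)
    # Gradient descent move plus injected Gaussian noise.
    proposal = [
        wi - step * gi + noise_scale * rng.gauss(0.0, 1.0)
        for wi, gi in zip(w, g)
    ]
    # Enforce the constraint ||w|| = radius by rescaling.
    norm = math.sqrt(sum(p * p for p in proposal))
    return [radius * p / norm for p in proposal]


def toy_loss_grad(w):
    """Gradient of the toy loss 0.5 * ||w - target||^2 (hypothetical)."""
    target = [1.0, 0.0]
    return [wi - ti for wi, ti in zip(w, target)]


rng = random.Random(0)  # fixed seed for reproducibility
w = [0.0, 1.0]          # initial parameters on the unit circle
for _ in range(1000):
    w = constrained_langevin_step(
        w, toy_loss_grad, step=0.01, temperature=0.001, radius=1.0, rng=rng
    )
```

In this toy setting the parameters stay on the unit circle throughout and drift towards the constrained minimizer while the noise keeps them fluctuating around it; a real training run would replace the toy gradient with a mini-batch gradient of the network loss.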

Planned Impact

MAC-MIGS develops computational modelling and its application to a range of economic sectors, including high-value manufacturing, energy, finance and healthcare. These fields contribute over £500 billion to the UK economy. The CDT involves collaborations with more than a dozen companies and organisations. These include large corporations (AkzoNobel, IBM, Dassault, P&G, Aberdeen Standard Investments, Intel); mid-size firms, particularly in the engineering and power sectors (NM Group, which provides monitoring services to power grid operators in 30 countries; Artemis Intelligent Power, the world leader in digital displacement hydraulics; Leonardo, a provider of defense, security and aerospace services; and Oliver Wyman, a management consultancy firm); and startups such as Brainnwave, which develops data-modelling solutions, and Opengosim, which designs state-of-the-art and massively parallel software for subsurface reservoir simulation. Government and other agencies involved will include the British Geological Survey, Forestry Commission, James Hutton Institute, and Scottish Natural Heritage. Engagement will be via internships, short projects and PhD projects. BIS has stated that "Organisations using computer generated modelling and simulations and Big Data analytics create better products, get greater insights, and gain competitive advantage over traditional development processes". Our partners share this vision and are keen to develop deeper collaborations with us over the duration of the CDT.

Our CDT will achieve the following:

- Produce 76 highly skilled mathematical scientists and professionals, ready to take up positions in academia or in companies such as our partners. The students will have exposure to projects, modelling camps and high-level international collaborations.

- Deliver economic and societal benefits through student research projects developed in close collaboration with our partners in industry, business and government and other agencies.

- Create pathways for impact on computer science, chemistry, physics and engineering by involving interdisciplinary partners from Heriot-Watt and Edinburgh Universities in the supervision and training of our students.

- Organise a large number of lectures and seminars which will be open to staff and students of the two universities. Such lectures will inform the wider university communities about the state of the art in computational and mathematical modelling.

- Work with other CDTs both in Edinburgh and beyond to organise a series of workshops for undergraduates, intended to foster an increased uptake of PhD studentship places in technical areas by female students and those from ethnic minorities, with potential impact on the broader UK CDT landscape.

- Organise industrial sandpits and modelling camps which offer the possibility for our partners to present a challenge arising in their work, and to explore innovative ways to tackle that challenge, fully involving the CDT students. This will kick-start a change in the corporate mindset by exposing the relevant staff to new approaches.

- Develop a new course, "Entrepreneurship for Doctoral Students in the Mathematical Sciences" in conjunction with Converge Challenge (Scotland's largest entrepreneurial training programme) and UoE's School of Business. This and other support measures will develop an innovation culture and facilitate the translation of our students' ideas into commercial activities.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S023291/1                                   01/10/2019  31/03/2028
2278824            Studentship   EP/S023291/1  01/09/2019  29/02/2024  Rene Lohmann