Robust machine learning: algorithms, numerical analysis and efficient software

Lead Research Organisation: University of Edinburgh
Department Name: Sch of Mathematics

Abstract

In today's data-driven world there is an ever-increasing need for statistical methods that analyze information, interpret and classify data, or make long-term predictions based on the present and the past. An important category of such methods is Machine Learning (ML), in which given data are fed into an algorithm so that it can then perform statistical inference on unseen data. At the heart of any ML method is an algorithm that extracts the important features of the given data (the so-called training data), automatically chooses the correct hyperparameters of a prescribed mathematical model, and outputs correct and useful statements about unseen data.

A very powerful and therefore popular class of ML methods is Deep Neural Networks (DNNs), which consist of several layers of "neurons", each of which takes data as input, transforms it in a certain way, and sends the result to the neurons in the next layer. For example, DNNs can be used for image classification: an image of an animal, e.g. a panda, is fed into the network as input, and the network is supposed to classify the animal correctly, i.e. output the class "Panda".
Even though DNNs, owing to their complexity, can reach high accuracy on even very difficult classification or regression problems, they are not without flaws, one of which is their lack of robustness to perturbations of the input data. Take, for example, the image of the panda and assume that the network classifies it correctly. If this image is now changed in a way that is barely noticeable to the human eye (i.e. a human being would still recognize the exact same panda), there is no guarantee that the network still outputs the class "Panda" rather than "Camel" or "Giraffe". This lack of robustness can lead to catastrophic scenarios in applications such as autonomous driving or medical imaging, which ultimately limits the applicability of these methods.
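The sensitivity described above can be illustrated on a much simpler stand-in than a DNN. The following minimal sketch assumes a linear two-class classifier with random weights acting on a 10,000-dimensional input; the names `predict` and `smallest_flipping_perturbation`, the dimension, and the data are all hypothetical and chosen for illustration only. It shows that changing every input coordinate by a very small amount can be enough to flip the predicted class.

```python
import numpy as np

def predict(w, b, x):
    """Return class 1 if the linear score w.x + b is positive, else class 0."""
    return int(np.dot(w, x) + b > 0)

def smallest_flipping_perturbation(w, b, x, margin=1e-6):
    """Perturbation of minimal max-norm that moves x just across the boundary.

    Shifting every coordinate by eps against sign(w) changes the score by
    eps * ||w||_1, so eps = (|score| + margin) / ||w||_1 is sufficient.
    """
    score = np.dot(w, x) + b
    eps = (abs(score) + margin) / np.sum(np.abs(w))
    delta = -np.sign(score) * eps * np.sign(w)
    return delta, eps

# Hypothetical setup: random weights and a random "image" with values in [0, 1].
rng = np.random.default_rng(0)
d = 10_000
w = rng.normal(size=d)
x = rng.uniform(0.0, 1.0, size=d)
b = 1.0 - np.dot(w, x)  # bias chosen so that the clean score is about 1.0

delta, eps = smallest_flipping_perturbation(w, b, x)
x_adv = x + delta

print("clean prediction:    ", predict(w, b, x))
print("perturbed prediction:", predict(w, b, x_adv))
print("largest change in any coordinate:", eps)
```

Because the effect of the per-coordinate changes adds up over all input dimensions, the required change per coordinate shrinks as the dimension grows; this is one commonly given intuition for why high-dimensional inputs such as images are susceptible to such perturbations.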

This project serves as an introduction to novel training algorithms for DNNs using constrained Langevin dynamics (stochastic differential equations known from the realm of physics). Incorporating stochastic noise into the training procedure, these methods aim to find network configurations that are more robust in the mentioned sense, and therefore more suited for applications in the real world. The project introduces the theoretical foundations and provides examples in the form of computational experiments.
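To give a rough idea of what a noisy, constrained training step can look like, the sketch below discretizes overdamped Langevin dynamics (a gradient step plus Gaussian noise) and enforces a constraint by projecting the weights back onto the unit sphere after every step. This is an assumption-laden illustration, not the project's algorithm: the abstract does not specify the constraint, the discretization, or the model, so the unit-norm constraint, the linear least-squares model, and the names `loss_grad` and `constrained_langevin_step` are all hypothetical.

```python
import numpy as np

def loss_grad(w, X, y):
    """Gradient of the mean squared error of the linear model X @ w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

def constrained_langevin_step(w, grad, step, temperature, rng):
    """One Euler-Maruyama step of overdamped Langevin dynamics,
    followed by projection onto the unit sphere (assumed constraint)."""
    noise = rng.normal(size=w.shape)
    w_new = w - step * grad + np.sqrt(2.0 * step * temperature) * noise
    return w_new / np.linalg.norm(w_new)

# Hypothetical toy problem: recover a unit-norm weight vector from noiseless data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = np.ones(5) / np.sqrt(5.0)
y = X @ w_true

w = rng.normal(size=5)
w /= np.linalg.norm(w)  # start on the constraint set

for _ in range(2000):
    w = constrained_langevin_step(w, loss_grad(w, X, y),
                                  step=1e-2, temperature=1e-4, rng=rng)

print("norm of trained weights:", np.linalg.norm(w))
print("distance to true weights:", np.linalg.norm(w - w_true))
```

The `temperature` parameter controls the strength of the injected noise: at zero the update reduces to projected gradient descent, while larger values make the iterates explore a wider neighbourhood of the minimum.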

Publications

Studentship Projects

Project Reference | Relationship | Related To | Start | End | Student Name
EP/S023291/1 | | | 01/10/2019 | 31/03/2028 |
2278824 | Studentship | EP/S023291/1 | 01/09/2019 | 31/08/2023 | Rene Lohmann