Theory and Applications of Deep Neural Networks

Lead Research Organisation: University of Southampton
Department Name: Electronics and Computer Science

Abstract

Many real-world challenges can be framed as pattern classification problems. Pattern classification algorithms can bring us closer to answering the scientific questions we have about living organisms, intelligence and nature itself. Many of the challenges we face in the 21st century have great combinatorial complexity and require performant learning algorithms that make efficient use of computational resources. The rapid increase in the amount of data available today pushes traditional algorithms and hardware to their limits and requires intelligent methods to process, understand and extract meaningful information from data.

One of the most promising approaches is the use of Artificial Intelligence, mainly through Machine Learning, to solve practical real-world problems. Recent decades have seen a re-emergence of interest in Artificial Neural Networks (ANNs), with considerable research published on Convolutional Neural Networks and Deep Learning. Such techniques have pushed performance on image classification problems to a near-human level.

Artificial Neural Networks learn a mapping function between a set of input data and a corresponding label. Internal parameters, known as weights, are adjusted in an iterative learning process until the model reaches a statistically acceptable solution for predicting labels on previously unseen data. Examples for which the label is known are shown to the system in what is known as the training phase of the model. The error between the predicted label and the true label is calculated using a loss function, and through a process known as backpropagation the system updates its internal weights to minimize that loss. In general, training an Artificial Neural Network is time consuming and can take weeks for problems with large datasets. On the other hand, once a statistically acceptable mapping function has been found, computing a prediction for a previously unseen input is very fast. For this reason, improving the computational efficiency of the training phase remains a crucial and active area of research.
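
To make the training cycle described above concrete, here is a minimal illustrative sketch in PyTorch (the framework, toy data, network size and hyperparameters are all assumptions, not part of the project): a forward pass produces predictions, a loss function measures the error against the true labels, backpropagation computes gradients, and the optimizer updates the weights.

    # Minimal illustrative training loop (assumed toy setup, not the project's code).
    import torch
    import torch.nn as nn

    X = torch.randn(256, 20)                 # toy inputs: 256 examples, 20 features
    y = torch.randint(0, 3, (256,))          # toy labels: 3 classes

    model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 3))
    loss_fn = nn.CrossEntropyLoss()          # error between predicted and true labels
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    for epoch in range(20):                  # training phase over labelled examples
        optimizer.zero_grad()
        logits = model(X)                    # forward pass: predicted labels
        loss = loss_fn(logits, y)            # loss value for this prediction
        loss.backward()                      # backpropagation: gradients w.r.t. the weights
        optimizer.step()                     # weight update that reduces the loss

    # Once trained, predicting a label for an unseen input is a single fast forward pass.
    with torch.no_grad():
        prediction = model(torch.randn(1, 20)).argmax(dim=1)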

Advances aimed at improving the computational efficiency of the training phase have come from examining different Neural Network architectures that use different kinds of activation functions, adaptive learning rates and a variety of regularization techniques. Surprisingly, some of these techniques bring great improvements in training despite their simplicity. A recent example is the dropout technique, in which internal units and their connections are randomly dropped during training to reduce overfitting and make the system generalize better (a brief illustrative sketch follows this paragraph). Scaling up Deep Learning is concerned with improvements to current algorithms and architectures that achieve faster convergence of neural networks. The most promising techniques aim to take full advantage of decades of advances in computing hardware, using distributed computing with multiple CPUs and GPUs to perform high-speed calculations in a parallelizable fashion. As it turns out, this is no easy task: it is often difficult to perform a task decomposition that splits a neural network into modules that can be trained in parallel, independently of each other.
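
As a brief illustration of the dropout idea mentioned above, the sketch below (again assuming PyTorch, placeholder layer sizes and a drop probability of 0.5) shows how a random subset of hidden units is zeroed on every training pass, while the full network is used at prediction time.

    # Minimal illustrative dropout sketch (assumed setup, not the project's code).
    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Linear(20, 64),
        nn.ReLU(),
        nn.Dropout(p=0.5),   # each hidden unit is dropped with probability 0.5 during training
        nn.Linear(64, 3),
    )

    model.train()            # training mode: dropout zeroes a random subset of units per pass
    out_train = model(torch.randn(8, 20))

    model.eval()             # evaluation mode: dropout is disabled, all units contribute
    out_eval = model(torch.randn(8, 20))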

Publications


Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/N509747/1                                   01/10/2016  30/09/2021
1953199            Studentship   EP/N509747/1  01/10/2017  20/04/2018  Gabriel Stratan