Tata Steel Defect Detection Project

Lead Research Organisation: Swansea University
Department Name: College of Science


The project will explore the use of active learning frameworks to train a classification system that identifies surface defects captured by imaging systems. It will examine whether a human-in-the-loop system can combine domain expert knowledge with iterative machine learning to produce a robust and accurate system for recognising a wide range of complex defect classes. The project will also examine the underlying supervised and unsupervised machine learning methods, in order to identify which are both reliable and efficient for the labelling problem. One key question is whether the relationships between observations can be used to guide and inform users as they label the data, and whether this is then reflected in the system's performance.

The approaches used:

The project will first implement a labelling system, informed by input from a group of end users and by previous literature. The labelling system will allow users to load images and will provide an effective method for labelling a small selection of the observed data, before supervised machine learning models are trained to predict labels for the remaining images. The user will then enter an active learning loop in which they update the labels, retrain the model and review the predictions. This loop repeats until the user is satisfied or another stopping criterion is met. The project will then implement an underlying data structure that represents the relationships between the observed samples. This data structure will allow a machine learning model to be trained that takes into consideration not only the appearance of each image, but also the underlying relationships between the samples. Graph-based deep learning approaches will then be used to generalise across the underlying domain of the problem. This graph-based approach can be used for both predictive and generative models, and also to present the problem visually to the domain user for further inspection.
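
The label, retrain and review cycle described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the nearest-centroid model, the margin-based uncertainty measure, the synthetic two-dimensional features, the class names `defect_a` and `defect_b`, and the `oracle` function standing in for the domain expert are all hypothetical choices made for the example.

```python
import math
import random


def fit_centroids(features, labels):
    """Train a nearest-centroid model: one mean vector per labelled class."""
    sums, counts = {}, {}
    for idx, cls in labels.items():
        acc = sums.setdefault(cls, [0.0] * len(features[idx]))
        for d, value in enumerate(features[idx]):
            acc[d] += value
        counts[cls] = counts.get(cls, 0) + 1
    return {cls: [v / counts[cls] for v in acc] for cls, acc in sums.items()}


def predict_with_margin(centroids, x):
    """Return (predicted class, margin); a small margin means an uncertain prediction."""
    dists = sorted((math.dist(x, c), cls) for cls, c in centroids.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else float("inf")
    return dists[0][1], margin


def active_learning_loop(features, oracle, seed_ids, batch_size=5, max_rounds=10):
    """Label a seed set, then repeatedly retrain and query the least certain samples."""
    labels = {i: oracle(i) for i in seed_ids}
    for _ in range(max_rounds):  # assumed stopping criterion: a fixed round budget
        centroids = fit_centroids(features, labels)
        unlabelled = [i for i in range(len(features)) if i not in labels]
        if not unlabelled:
            break
        # Rank unlabelled samples so the most uncertain (smallest margin) come first.
        ranked = sorted(
            unlabelled,
            key=lambda i: predict_with_margin(centroids, features[i])[1],
        )
        for i in ranked[:batch_size]:
            labels[i] = oracle(i)  # the expert supplies or corrects the label
    centroids = fit_centroids(features, labels)
    predictions = {
        i: predict_with_margin(centroids, features[i])[0]
        for i in range(len(features))
    }
    return labels, predictions


# Synthetic stand-in for image features: two well-separated clusters.
random.seed(0)
true_labels = ["defect_a"] * 50 + ["defect_b"] * 50
features = (
    [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(50)]
    + [(random.gauss(6, 1), random.gauss(6, 1)) for _ in range(50)]
)
labels, predictions = active_learning_loop(
    features, lambda i: true_labels[i], seed_ids=[0, 50]
)
accuracy = sum(predictions[i] == true_labels[i] for i in range(100)) / 100
print(f"labelled {len(labels)} of {len(features)} samples, accuracy {accuracy:.2f}")
```

In a real system the oracle would be an interactive labelling interface, the features would come from an image model, and the stopping criterion could be the user's own judgement rather than a fixed number of rounds.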

Novel content:

The project's novelty lies in understanding the underlying relationships between the images and using them to guide and strengthen the active learning framework. It should identify key approaches from the active learning community, and should also provide methods for using the graph-based information to visualise and interpret model activity.
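
One way that relationships between samples could guide labelling is sketched below. It is an illustration only. It assumes that the relationships are nearest-neighbour links in a feature space, and it swaps in simple label propagation in place of the graph-based deep learning models the project describes. The function names, the toy coordinates and the class names `defect_a` and `defect_b` are hypothetical.

```python
import math


def knn_graph(features, k=2):
    """Connect each sample to its k nearest neighbours (symmetrised adjacency sets)."""
    n = len(features)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        nearest = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(features[i], features[j]),
        )[:k]
        for j in nearest:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def propagate_labels(adj, seed_labels, classes, iterations=50):
    """Spread class scores along edges; labelled nodes stay clamped to their label."""
    scores = {
        i: {c: (1.0 if seed_labels.get(i) == c else 0.0) for c in classes}
        for i in adj
    }
    for _ in range(iterations):
        updated = {}
        for i, neighbours in adj.items():
            if i in seed_labels:
                updated[i] = scores[i]
                continue
            # An unlabelled node takes the mean score of its neighbours.
            updated[i] = {
                c: sum(scores[j][c] for j in neighbours) / len(neighbours)
                for c in classes
            }
        scores = updated
    return scores


def most_ambiguous(scores, seed_labels):
    """Pick the unlabelled node whose top two class scores are closest."""
    def gap(i):
        top = sorted(scores[i].values(), reverse=True)
        return top[0] - top[1]
    return min((i for i in scores if i not in seed_labels), key=gap)


# Toy features: two clusters plus one sample (index 6) lying between them.
features = [
    (0.0, 0.0), (0.2, 0.0), (0.4, 0.0),
    (2.0, 0.0), (2.2, 0.0), (2.4, 0.0),
    (1.25, 0.0),
]
seed_labels = {0: "defect_a", 5: "defect_b"}
classes = ["defect_a", "defect_b"]

adj = knn_graph(features, k=2)
scores = propagate_labels(adj, seed_labels, classes)
predicted = {i: max(classes, key=lambda c: scores[i][c]) for i in adj}
suggestion = most_ambiguous(scores, seed_labels)
print("suggested sample to label next:", suggestion)
```

Here the sample that bridges the two clusters is the one suggested to the user, since its neighbours disagree most about its class. The same adjacency structure could also be drawn as a node-link diagram for visual inspection.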



Studentship Projects

Project Reference | Relationship | Related To | Start | End | Student Name
EP/S021892/1 | | | 01/04/2019 | 30/09/2027 |
2284469 | Studentship | EP/S021892/1 | 01/10/2019 | 30/09/2023 | Connor Clarkson