Anomaly Detection for Large Complex Data

Lead Research Organisation: Cardiff University
Department Name: School of Mathematics

Abstract

Large, complex data sources that combine many variables and multiple data types present a new challenge for anomaly detection as part of the statistical production process. The simple parametric models used for outlier detection in survey data are no longer suitable: they require model assumptions that would become prohibitively complex, do not process large data sets efficiently, and do not allow for mixed variable types.

Anomaly detection in statistical production is key to ensuring the quality of statistics, and the challenge has not yet been fully addressed in official statistics. Working with ONS, the UK's national statistics institute, would give the student access to sensitive, record-level data that are not usually easily available to researchers. Although some record-level survey data are available to academic researchers, non-survey data not collected by ONS are not generally accessible, and where they are, the environments are not usually suitable for big data processing. This project therefore offers the student the novel opportunity not only to work on datasets not usually available to academia, but also to do so in a state-of-the-art distributed processing environment.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S513611/1                                   01/10/2018  30/09/2023
2109881            Studentship   EP/S513611/1  01/10/2018  30/09/2022  Emily O'Riordan