Novel statistical methods for detecting anomalies in data streams.

Lead Research Organisation: Lancaster University
Department Name: Mathematics and Statistics

Abstract

The low cost of sensors means that the performance of many mechanical devices, from plane engines to routers, is now monitored continuously. The aim is to detect problems with the underlying device so that action can be taken. However, the amount of data gathered has become so large that manual inspection is no longer possible, which makes automated methods for monitoring performance data indispensable.
My PhD focusses on developing novel statistical methods to detect anomalies, or atypical behaviour, in such data streams. More effective methods would make it possible to detect a wider range of anomalies and, in turn, to detect problems earlier, thus reducing their impact. Anomaly detection methods are also used in a range of other applications, from fraud prevention to cyber security.
Specific research questions include (i) how to differentiate between different types of anomaly; (ii) how to develop statistical algorithms that scale to high-dimensional and high-frequency data streams; and (iii) what the theoretical properties of the new statistical methods are.

In partnership with BT.

This project lies within the area of Statistics and Applied Probability (Computational Statistics, Statistical Methodology, Time Series).

Planned Impact

The proposal will benefit (i) the UK economy and society, (ii) our industrial partners, (iii) the wider community of non-academic employers of doctoral graduates in STOR, (iv) the scientific disciplines of statistics and operational research (STOR) and associated academic communities, (v) UK doctoral students in STOR, and (vi) the CDT students themselves.

(i) The UK economy will gain a competitive edge through a significant increase in the supply of doctoral STOR professionals who have the skills to achieve impact for their work and who have been trained with the goal of becoming future leaders. Our goal is that those of our graduates who enter industry will assume leading roles in realising the major impact which STOR can make in achieving effective data-driven decision-making. A wider societal benefit will accrue from research contributions, inter alia, to the EPSRC themes of Energy, Living with Environmental Change and Global Uncertainties.

(ii) Many of our industrial partners will benefit from the skills supply identified in (i), as likely future employers of STOR-i graduates. They will further benefit from teaming with a community of leading-edge STOR researchers in the solution of substantive industrial challenges. Mechanisms for the latter include doctoral projects co-funded by and co-supervised with industry, industrial internships and industrial problem-solving days. Our training programme will give students the skills they need to make sure that research outcomes are successfully communicated to beneficiaries. The value that our industrial partners place on working with STOR-i can be seen in over £5M of pledged support.

(iii) A wider benefit will accrue from the employment of STOR-i graduates, equipped as described in (i), across non-partner industrial, government and public sector organisations. These organisations will also benefit from the networking opportunities afforded by access to STOR-i events and from the accessible dissemination of research outcomes within non-academic communities.

(iv) The STOR academic community will benefit from methodological advances and from the increase in supply of STOR researchers who value and have experience of collaborative research. Our recruitment strategy will further benefit this community in achieving a healthier supply of high-quality doctoral candidates beyond STOR-i: our research intern programme gives top undergraduates from across the UK an experience of STOR research, while STOR-i recruitment roadshows partner with the STOR community of the hosting institution. Experience with the current Centre has shown that both of these lead to an increase in applicants for STOR PhD programmes across the UK.

(v) Elements of the STOR-i programme will benefit the wider community of UK doctoral students in STOR. Using the financial support of our external partners, we will develop a STOR-i national associate scheme for UK STOR doctoral students working with industry. The scheme will provide funding and access to elements of the STOR-i training programme, while an annual event will give members of the scheme opportunities for learning, networking and sharing research progress.

(vi) The STOR-i students will benefit from a programme which will support their growth toward research leadership, whether in academia or industry. They will be challenged to achieve their maximum scientific potential and also given the tools and opportunities to develop the broader skills which will enable them to achieve maximum scientific impact. They will be highly employable.

Publications

Title: anomaly R package
Description: An implementation of CAPA (Collective And Point Anomaly) for the detection of anomalies in time series data. The package also contains Kepler lightcurve data and shows how CAPA can be applied to detect exoplanets.
Type Of Technology: Software
Year Produced: 2018
Impact: This software has been downloaded over 3000 times since its publication in June 2016.
URL: https://cran.rstudio.com/web/packages/anomaly/index.html