Change detection in enterprise-wide computer network traffic

Lead Research Organisation: Imperial College London
Department Name: Mathematics

Abstract

It is recognised by industry and the intelligence services that data science techniques have the potential to provide the next generation of cyber-security defences. Inside a typical enterprise computer network, a number of high-volume data sources are available which can enable the discovery and prevention of cyber-attacks and other nefarious network activity. Whilst traditional systems of cyber-defence, such as content-aware firewalls and antivirus software, focus on detecting strong signatures in packets, there is a largely under-exploited opportunity to use statistical, probabilistic model-based techniques for identifying more subtle intrusion attempts. The potential advantage of such approaches is the ability to learn normal patterns of computer and network behaviour from historical data, so that anomalies which would not otherwise stand out can be detected; one example is unusual network traversal using legitimate credentials.

Interest here will focus on detecting significant temporal changes in the probability distribution of large-scale summary statistics, such as the relative popularity of different service ports, the countries to which computers within an enterprise connect, or possibly some other local characteristic of the network. The preferred approach will be Bayesian model-based changepoint analysis, which requires computational simulation techniques such as Markov chain Monte Carlo. Any methods developed will need to scale to realistic deployment across an entire network. Some exploratory data analysis using big data platforms will be necessary to identify the main structures in the data and so guide the model-building process.
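
As a minimal sketch of the kind of analysis described above, the following Python computes the posterior distribution of a single changepoint in the relative popularity of service ports. It assumes exactly one changepoint, a uniform prior on its location, and a symmetric Dirichlet prior on the port probabilities in each segment; in that special case the posterior can be obtained by exhaustive enumeration, so no Markov chain Monte Carlo is needed. With an unknown number of changepoints the enumeration becomes infeasible, which is where simulation methods come in. The function names, the prior parameter alpha and the synthetic data are illustrative assumptions, not part of the project.

```python
from math import lgamma

import numpy as np


def log_marginal(counts, alpha=1.0):
    """Dirichlet-multinomial log marginal likelihood of one segment.

    Multinomial coefficients are omitted: they do not depend on where
    the changepoint is placed, so they cancel in the posterior.
    """
    k = len(counts)
    n = float(sum(counts))
    out = lgamma(k * alpha) - lgamma(k * alpha + n)
    for c in counts:
        out += lgamma(alpha + float(c)) - lgamma(alpha)
    return out


def changepoint_posterior(X, alpha=1.0):
    """Posterior over a single changepoint location.

    X has one row per time bin and one column per port (hypothetical
    layout). Entry t-1 of the result is the posterior probability that
    the first segment consists of the first t bins, for t = 1..T-1.
    """
    X = np.asarray(X, dtype=float)
    cum = np.cumsum(X, axis=0)
    total = cum[-1]
    T = X.shape[0]
    log_post = np.array([
        log_marginal(cum[t - 1], alpha) + log_marginal(total - cum[t - 1], alpha)
        for t in range(1, T)
    ])
    post = np.exp(log_post - log_post.max())  # stabilise before normalising
    return post / post.sum()


# Synthetic example: 60 time bins, 200 connections per bin, 3 ports,
# with the port distribution shifting after bin 30.
rng = np.random.default_rng(0)
before = rng.multinomial(200, [0.7, 0.2, 0.1], size=30)
after = rng.multinomial(200, [0.2, 0.2, 0.6], size=30)
X = np.vstack([before, after])

posterior = changepoint_posterior(X)
tau_hat = int(np.argmax(posterior)) + 1  # most probable length of the first segment
```

Working with cumulative counts keeps the cost of each candidate location independent of the segment length, which matters when the same computation is repeated across many hosts.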

The project is aligned to the EPSRC Global Uncertainty and Digital Economy strategic themes, and the Statistics and Applied Probability research theme.

Studentship Projects

Project Reference | Relationship | Related To   | Start      | End        | Student Name
EP/R513052/1      |              |              | 01/10/2018 | 30/09/2023 |
2129730           | Studentship  | EP/R513052/1 | 01/10/2018 | 30/06/2022 | Karl Hallgren
 
Description: The main objective of our research has been to develop Bayesian changepoint detection methods that are motivated by cyber security applications. We have been interested in identifying key features of cyber data that challenge traditional changepoint detection methods, and in building tractable changepoint models that are adapted to these features. At this stage, two novel Bayesian changepoint models have been proposed.

The first project stems from the commonly held belief of security experts that cyber attacks tend to correspond to coordinated sequences of events across multiple connected endpoints on the network. For example, some attacks are most likely to be identified through a chain of behavioural changes across related activity types on the same machine, or across machines linked by network connectivity. In other words, there is a requirement for changepoint detection methods that combine evidence from multiple sources, whilst taking into account which structures of changepoints are a priori likely to be of interest. To address this problem, we proposed a flexible Bayesian graphical model for dependent changepoints across time series: given a graph G that describes which pairs of time series are likely to be simultaneously affected by changepoints, changepoints are modelled by means of an undirected graphical model. The proposed changepoint model borrows strength across clusters of connected time series in G to detect weak signals for synchronous changepoints. The benefits of the proposed model were demonstrated via a changepoint analysis of real network authentication data from Los Alamos National Laboratory (LANL), with some success at detecting weak signals for network intrusions across users that are linked by network connectivity, whilst limiting the number of false alerts.
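
To illustrate how a graph G can shape prior beliefs about simultaneous changepoints, the sketch below places an Ising-type prior on binary changepoint indicators at a single time point, one indicator per time series. This specific functional form, the parameters a and b, and the four-node graph are assumptions chosen for illustration; the description above says only that an undirected graphical model is used, so this should not be read as the model that was proposed.

```python
import itertools
import math


def ising_changepoint_prior(n_series, edges, a=-3.0, b=2.0):
    """Prior over indicator vectors z, where z[i] = 1 means series i changes.

    Assumed form: p(z) proportional to
        exp(a * sum_i z[i] + b * sum_{(i, j) in edges} z[i] * z[j]).
    a < 0 makes changepoints rare; b > 0 rewards simultaneous
    changepoints on series that are adjacent in the graph.
    """
    weights = {}
    for z in itertools.product((0, 1), repeat=n_series):
        score = a * sum(z) + b * sum(z[i] * z[j] for i, j in edges)
        weights[z] = math.exp(score)
    norm = sum(weights.values())
    return {z: w / norm for z, w in weights.items()}


# Hypothetical graph G: a path 0 - 1 - 2, with series 3 isolated.
edges = [(0, 1), (1, 2)]
prior = ising_changepoint_prior(4, edges)

p_linked = prior[(1, 1, 0, 0)]    # two adjacent series change together
p_unlinked = prior[(1, 0, 0, 1)]  # two non-adjacent series change together
```

Under these assumptions the two configurations contain the same number of changepoints, yet the adjacent pair receives exp(b) times the prior mass of the non-adjacent pair. This is one simple way in which weak evidence on neighbouring series could reinforce itself in the posterior.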

The second project emanates from the lack of robustness of traditional changepoint detection methods to normal dynamic phenomena one may observe in temporal data. Cyber data are often subject to gradual population changes, seasonal variations and other temporal trends that are unlikely to be evidence for cyber attacks. Most traditional changepoint detection methods rely on partitioning the passage of time into segments and fitting relatively simple models that assume the data are exchangeable within each segment. These methods fail to capture temporal dynamics and consequently fit many more changepoints than the data warrant. Therefore, there is a need for methods that detect clear discontinuities in the presence of smooth but unpredictable temporal variability. To address this problem, we proposed a novel Bayesian changepoint model that is robust to some forms of m-dependence.
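
The failure mode described above can be reproduced with a small simulation. The sketch below generates data with a smooth linear drift and no discontinuity, then compares a model with no changepoint against a model with one changepoint, where each segment is Gaussian with a constant mean (and hence exchangeable within the segment). The known noise scale sigma, the prior scale tau on the segment mean and the simulated drift are illustrative assumptions; the code shows the problem only, not the robust model that was proposed.

```python
import math

import numpy as np


def log_marginal_gaussian(x, sigma=1.0, tau=10.0):
    """Log marginal likelihood of x_i ~ N(mu, sigma^2) with mu ~ N(0, tau^2)."""
    n = x.size
    s = float(x.sum())
    ss = float(np.dot(x, x))
    quad = ss - (tau ** 2) * s ** 2 / (sigma ** 2 + n * tau ** 2)
    return (-0.5 * n * math.log(2.0 * math.pi * sigma ** 2)
            - 0.5 * math.log(1.0 + n * tau ** 2 / sigma ** 2)
            - 0.5 * quad / sigma ** 2)


# Smooth drift from 0 to 5 over T points, with unit-variance noise
# and no discontinuity anywhere.
rng = np.random.default_rng(1)
T = 200
x = np.linspace(0.0, 5.0, T) + rng.normal(0.0, 1.0, size=T)

log_no_change = log_marginal_gaussian(x)
log_best_split = max(
    log_marginal_gaussian(x[:t]) + log_marginal_gaussian(x[t:])
    for t in range(1, T)
)

# With a uniform prior over the T - 1 possible locations, the evidence for
# "one changepoint" is at least the best split's evidence divided by T - 1,
# so this is a lower bound on the log Bayes factor in favour of a change.
log_bf_lower_bound = log_best_split - math.log(T - 1) - log_no_change
```

A positive lower bound means the piecewise-constant model prefers to declare a changepoint even though the data contain only gradual drift, which is the behaviour that motivates models allowing dependence within segments.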
Exploitation Route: The benefits of the proposed models could be demonstrated for various domains of application.
Sectors: Chemicals, Security and Diplomacy, Other