Profiling criminal activity from digital behavioural traces

Lead Research Organisation: University College London
Department Name: Security and Crime Science

Abstract

Digital devices are now indispensable in almost all areas of daily life, from communication to fitness tracking and from shopping to travel. The artefacts generated by their use, recovered through digital forensic analysis, are likewise invaluable to law enforcement.

However, the volume of data generated makes standard manual analysis techniques difficult, if not impossible, to apply. An emerging research area within digital forensic science is the use of machine learning algorithms to classify forensic artefacts. This research explores whether individuals engaged in particular types of criminal activity can be identified through automated analysis of their digital behavioural traces, based on analysis of past cases.

The research will use digital forensic data already extracted during criminal investigations, and already classified by forensic analysts, to evaluate the performance of various machine learning algorithms and compare it with current methodology. The initial focus is a particular type of cybercrime, child abuse images on the Internet, and whether high-risk offenders (those who commit contact abuse) can be distinguished from those who download or share images. The work will subsequently be extended to other types of criminal activity, such as online fraud.

This research is inherently multidisciplinary. It uses computer science methodology but draws on crime science to understand the mechanism, evaluate its impact and place it in context, and on psychology to understand how indirect behavioural traces may contribute to classification. It falls under the cybersecurity aspect of the EPSRC Global Uncertainties programme, specifically the priority of risk identification, reduction, mitigation and management, by looking at emerging uses of the Internet and the risks associated with them. It also contributes to the ICT priority of achieving an intelligent information infrastructure by tackling the problem of increasing volumes of forensic data and using that data to its fullest potential.

The research will build on a recently completed exploratory study that focused on distinguishing contact from non-contact abusers using a dataset of 45 cases. The new work will repeat that analysis with a larger dataset (initially for the same crime type) and a separate evaluation set. It will also expand the feature engineering to determine which features are most useful, and derive more meaningful features from search queries, by extracting concepts from search terms, and from web browsing history, by categorising the sites visited. Close collaboration with law enforcement will provide more data than has been available to similar studies in the past, allowing more in-depth analysis and ongoing evaluation of results.
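
As an illustration of the kind of feature engineering described above, the following minimal Python sketch turns a list of visited URLs into per-category proportions that could serve as classifier inputs. It is not the project's actual method: the lookup table SITE_CATEGORIES, the placeholder domains, and the function names categorise and category_features are all hypothetical, and a real system would need a far larger curated category list.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical lookup table of placeholder domains (an assumption for
# illustration only; not taken from the project).
SITE_CATEGORIES = {
    "example-shop.test": "shopping",
    "example-mail.test": "communication",
    "example-news.test": "news",
}


def categorise(url):
    """Map a URL to a site category, or 'unknown' if the host is not listed."""
    host = urlparse(url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return SITE_CATEGORIES.get(host, "unknown")


def category_features(urls):
    """Return the proportion of visits that fall into each category."""
    counts = Counter(categorise(u) for u in urls)
    total = sum(counts.values())
    categories = sorted(set(SITE_CATEGORIES.values()) | {"unknown"})
    return {c: (counts[c] / total if total else 0.0) for c in categories}
```

Using proportions rather than raw counts keeps the features comparable between cases with very different amounts of browsing history, which is one plausible design choice among several.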

This research will have a practical benefit by increasing efficiency in the digital forensic workflow, reducing risks to suspects and victims caused by lengthy delays, and supporting standardised quality processes as mandated by the Forensic Science Regulator for digital forensic activity. It will also examine the degree to which computer usage patterns can indicate offline behaviours, and consider the ethical implications of this approach with respect to privacy and fairness.

Studentship Projects

Project Reference   Relationship   Related To     Start        End          Student Name
EP/N509577/1        -              -              01/10/2016   30/09/2021   -
1801909             Studentship    EP/N509577/1   28/09/2015   13/07/2021   Beverley Nutter