Cybercrime pathways on underground gaming forums

Lead Research Organisation: University of Cambridge
Department Name: Computer Science and Technology

Abstract

Cybercrime pathways are the common steps taken by offenders leading up to their first and successive offences. Understanding these pathways helps identify potential interventions to deter future offenders. Reports by the National Crime Agency have identified online gaming as a possible entry point into cybercrime. For example, underground discussion platforms provide a centralised meeting place where gamers can interact with, and be influenced by, cybercriminals. However, there has been little research into the role of gaming as an entry point into cybercrime, particularly on underground hacking forums.
In addition to forums, there are indications that actors are moving towards other discussion platforms, including Discord (which is aimed specifically at gamers). While I am aware that Discord is being used for gaming-related cybercrime discussions, no research has yet been published on this. I will use the CrimeBB dataset, which is available for academic research from the Cambridge Cybercrime Centre. As well as Discord data, CrimeBB contains over 70 million posts written by 1.6 million users, scraped from 16 underground forums and dating back as far as 2002.
Dataset contents vary between forums and across platform types, so different feature sets are available for each. The scale of the datasets, with millions of users interacting over many years, poses further challenges and requires big data approaches to reduce processing time. This research spans disciplines including applied machine learning and statistical techniques, natural language processing (NLP), and studies of criminal activity on these platforms.
First, I will look at how existing techniques used for analysis can be adapted for time-series based analyses. For example, the use of NLP tools to model the progression of cybercrime activity over time will need to account for the changing lexicons used by platform members. Furthermore, offenders are adaptive, modifying their activities in response to new opportunities and business models. A framework will be created to synthesise knowledge of existing analysis techniques and tools applicable to each part of the data analysis pipeline, and identify where new approaches are required, particularly for modelling the data longitudinally. The framework will include a discussion of prediction and validation methods, as a balance between complexity and interpretability is needed to understand results.
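As a rough illustration of what tracking a changing lexicon could involve, the sketch below measures how much the vocabulary shifts from one period to the next using Jaccard distance. It is a minimal example under stated assumptions, not the project's method: the function names, the whitespace tokenisation, and the toy posts are all hypothetical, and real forum data would need proper tokenisation and normalisation.

```python
from collections import defaultdict


def vocab_by_period(posts):
    """Group lowercase whitespace tokens by period label (e.g. year)."""
    vocab = defaultdict(set)
    for period, text in posts:
        vocab[period].update(text.lower().split())
    return dict(vocab)


def lexicon_drift(posts):
    """Return {period: Jaccard distance from the previous period's vocabulary}."""
    vocab = vocab_by_period(posts)
    periods = sorted(vocab)
    drift = {}
    for prev, cur in zip(periods, periods[1:]):
        union = vocab[prev] | vocab[cur]
        shared = vocab[prev] & vocab[cur]
        # 0.0 means identical vocabularies, 1.0 means no shared terms.
        drift[cur] = 1 - len(shared) / len(union) if union else 0.0
    return drift


# Hypothetical toy posts as (year, text) pairs; not taken from CrimeBB.
toy_posts = [
    (2015, "selling game accounts cheap"),
    (2016, "selling game accounts and booter access"),
    (2017, "booter access and stresser service"),
]
drift = lexicon_drift(toy_posts)
```

A higher value for a period suggests that terms learned from earlier data may transfer poorly to it, which is one reason NLP tools would need periodic re-fitting in a longitudinal analysis.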
Second, in addition to modelling the changing cybercrime landscape and lexicons, I will adapt existing tools and create new ones that generalise across platform types. There will be challenges to overcome due to the varying sizes and representations of platform datasets. Tools will be optimised using machine learning and big data approaches to improve processing time and efficiency. Creating useful toolkits is a key challenge, as the results of analysis techniques depend upon the quality of the tools.
Third, I will apply the tools I develop to longitudinally analyse the development of cybercrime pathways on the discussion platforms, to infer behavioural characteristics of groups. Incremental or time-series machine learning models will be applied to model the changing behaviours and social networks of key actors on the forums. This will aid later research into gaming-related pathways, by creating interpretable models for identifying changing trends in the escalation of cybercrime activity.
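To make the idea of following key actors over time more concrete, the sketch below ranks users in each period by the number of replies they receive. This is a simple descriptive stand-in, not an incremental or time-series machine learning model, and everything in it is an assumption for illustration: the `(period, author, target)` reply format, the function name, and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def key_actors_by_period(replies, top_n=1):
    """Rank users in each period by the number of replies they receive."""
    received = defaultdict(Counter)
    for period, author, target in replies:
        if author != target:  # ignore self-replies
            received[period][target] += 1
    return {
        period: [user for user, _ in counts.most_common(top_n)]
        for period, counts in sorted(received.items())
    }


# Hypothetical toy reply edges as (year, author, target); not real forum data.
toy_replies = [
    (2018, "a", "b"), (2018, "c", "b"), (2018, "b", "a"),
    (2019, "a", "c"), (2019, "b", "c"), (2019, "c", "c"),
]
key_actors = key_actors_by_period(toy_replies)
```

Comparing the per-period rankings shows when the most-replied-to user changes, which is the kind of shift a longitudinal model of the social network would need to capture.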
As the research involves the analysis of human behaviours, it will require review by the department's ethics committee. It is not possible to gain informed consent from all members, as contacting them would be considered spamming. However, as this work analyses collective behaviours rather than identifying individuals, it falls outside the requirement for informed consent under the British Society of Criminology's Statement of Ethics.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/N509620/1                                   01/10/2016  30/09/2022
2276284            Studentship   EP/N509620/1  01/10/2019  31/12/2022  Jack Hughes
EP/R513180/1                                   01/10/2018  30/09/2023
2276284            Studentship   EP/R513180/1  01/10/2019  31/12/2022  Jack Hughes