Chain Event Graphs and Applications to Longitudinal Studies

Lead Research Organisation: University of Warwick
Department Name: Statistics

Abstract

Chain Event Graphs (CEGs) are a form of statistical model and deductive reasoning tool, and are a rapidly growing research field with a wide range of applications. This project aims to further develop the existing theory and software associated with CEGs and their dynamic counterparts. It will also explore applications: CEGs are already being used in fields such as public health, forensic science, tourism and criminal radicalisation.

I have already analysed a dataset on the treatment of early epilepsy and single seizures. This work focused on the probability of a tonic-clonic seizure occurring within one year, depending on the individual's baseline covariates and whether or not they received treatment with anti-epileptic drugs. This investigation opened up further areas of research on new approaches to analysis and on potential improvements to existing methods.

One of these new approaches is to incorporate continuous, or potentially infinite discrete, data and response variables. For example, the time between seizures is continuous and has been considered in Dynamic CEGs (DCEGs). The number of seizures suffered in a year will be modelled by assuming that it follows a Poisson distribution. Currently, very little work is being done on incorporating such data beyond including holding times in DCEGs, which is only one possibility.
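As a rough illustration of the Poisson assumption for yearly seizure counts, the sketch below estimates a rate from a handful of counts and derives the probability of at least one seizure in a year. The counts and the function names (`poisson_rate_mle`, `prob_at_least_one`) are hypothetical and chosen only for illustration; this is not the project's method, which would embed such a model within a CEG.

```python
from math import exp, factorial

def poisson_pmf(k, rate):
    """P(N = k) when N follows a Poisson distribution with the given rate."""
    return exp(-rate) * rate ** k / factorial(k)

def poisson_rate_mle(counts):
    """Maximum likelihood estimate of the rate: the sample mean of the counts."""
    return sum(counts) / len(counts)

def prob_at_least_one(rate):
    """P(N >= 1) = 1 - P(N = 0) under the Poisson assumption."""
    return 1.0 - poisson_pmf(0, rate)

# Hypothetical yearly seizure counts for four individuals (illustrative only).
yearly_counts = [0, 1, 2, 1]
rate = poisson_rate_mle(yearly_counts)
p_any = prob_at_least_one(rate)
```

Because the count has no upper bound, it cannot be drawn as a finite set of edges leaving a situation, which is why such responses need separate treatment.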

There is also scope for improving existing model search algorithms. One of the strengths of CEGs compared to other graphical models is their ability to admit representations that are not necessarily variable based, so the model spaces themselves cannot be represented as a product space. This leads to much larger model spaces that can become infeasible to search efficiently. Improvements must be made to the model search algorithms to enable them to search these larger model spaces and to make them easily scalable. These improvements are vital if the theory and usage of DCEGs and Non-Stratified CEGs are to be furthered, as they generally require model spaces with very few restrictions, which can therefore be extremely large. My previous work has shown the limitations of the popular Agglomerative Hierarchical Clustering (AHC) model search algorithm, particularly when dealing with low counts of data, and I plan to investigate these issues further. There is also significant research to be done on the scores used by these search algorithms. Marginal likelihood is the general default in the literature, but other criteria such as BIC can be used; the stagedtrees package in R uses BIC. However, there is very little existing literature on using these alternative scores.
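To make the search procedure concrete, here is a minimal sketch of a greedy agglomerative search scored by marginal likelihood, assuming an independent Dirichlet prior on each stage's edge probabilities and that all situations passed in have the same number of outgoing edges. The function names (`log_marginal`, `ahc`, `bic`), the unit prior and the edge counts are assumptions made for illustration, not the API of any existing package. The `bic` function uses the standard definition (k log n minus twice the maximised log-likelihood, lower is better) as one alternative score.

```python
from itertools import combinations
from math import lgamma, log

def log_marginal(alpha, counts):
    """Log marginal likelihood of one stage under a Dirichlet(alpha) prior."""
    post = [a + n for a, n in zip(alpha, counts)]
    return (lgamma(sum(alpha)) - lgamma(sum(post))
            + sum(lgamma(p) - lgamma(a) for p, a in zip(post, alpha)))

def ahc(stages):
    """Greedily merge the pair of stages with the largest score gain.

    stages: list of (alpha, counts) pairs, one per situation.
    Stops when no merge increases the total log marginal likelihood.
    """
    stages = [(list(a), list(n)) for a, n in stages]
    while len(stages) > 1:
        best_gain, best_pair, best_merged = 0.0, None, None
        for i, j in combinations(range(len(stages)), 2):
            (a1, n1), (a2, n2) = stages[i], stages[j]
            # Merging two stages adds their prior parameters and their counts.
            merged = ([x + y for x, y in zip(a1, a2)],
                      [x + y for x, y in zip(n1, n2)])
            gain = (log_marginal(*merged)
                    - log_marginal(a1, n1) - log_marginal(a2, n2))
            if gain > best_gain:
                best_gain, best_pair, best_merged = gain, (i, j), merged
        if best_pair is None:
            break
        stages = [s for k, s in enumerate(stages) if k not in best_pair]
        stages.append(best_merged)
    return stages

def bic(stage_counts):
    """BIC of a staging: k*log(n) - 2*loglik at the maximum likelihood estimate."""
    loglik = sum(n * log(n / sum(c)) for c in stage_counts for n in c if n > 0)
    k = sum(len(c) - 1 for c in stage_counts)   # free parameters per stage
    n_obs = sum(sum(c) for c in stage_counts)
    return k * log(n_obs) - 2.0 * loglik

# Hypothetical edge counts for three situations with two outgoing edges each.
situations = [([1, 1], [50, 50]), ([1, 1], [48, 52]), ([1, 1], [90, 10])]
result = ahc(situations)
```

With these illustrative counts the two situations with similar proportions are merged into one stage and the third is left on its own. Each pass scores every pair of stages, so the cost grows quadratically with the number of situations, and with sparse counts the prior dominates the score; both are consistent with the scalability and low-count concerns raised above.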

In order to further the development of CEGs, the existing software and packages must be made more user-friendly, particularly for fitting and graphing CEGs and DCEGs. Plotting CEGs with the existing software is time-consuming and difficult, whether for exploring possibilities or for presenting results. In addition, one of the main R packages focused on CEGs, ceg, has several bugs and is not actively maintained. Improving this software would enable wider adoption of CEGs.

Sparse edge counts and missing data also require study, as they cause problems both in the analysis and interpretation of datasets and in the aforementioned model search algorithms. There is also significant scope for developing more robust forms of variable selection. There is very little published theory on variable selection methods for CEGs, and variables are often selected based on existing studies or the advice of domain experts. Existing variable selection methods for other forms of statistical modelling, such as Generalised Linear Models (GLMs), are a natural starting point.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/V520226/1                                   30/09/2020  31/10/2025
2440874            Studentship   EP/V520226/1  04/10/2020  04/10/2024  Conor Hughes