Chain Event Graphs and Applications to Longitudinal Studies

Lead Research Organisation: University of Warwick

Department Name: Statistics

Abstract

Chain Event Graphs (CEGs) are a class of statistical models and deductive reasoning tools based on staged trees, which are coloured probability trees. They form a rapidly growing research field with a wide range of applications. This project aims to develop the existing theory and software associated with CEGs and their dynamic counterparts. Their applications will also be explored; CEGs are already used in fields such as public health, forensic science, tourism and criminal radicalisation.

I have already analysed a dataset on the treatment of early epilepsy and single seizures. This work focused on the probability of a tonic-clonic seizure occurring within 1 year, given the individual's baseline covariates and whether or not they received treatment with anti-epileptic drugs. This investigation opened up further areas of research on new approaches to analysis and on potential improvements to existing methods.

One of these new approaches is to incorporate continuous data, or discrete data with a potentially infinite range, including as response variables. For example, time between seizures is continuous and has been considered in Dynamic CEGs (DCEGs). The number of seizures suffered in a period, such as a year, will be modelled as arising from a Poisson process. Currently, very little work is being done on incorporating such data, apart from the inclusion of holding times in DCEGs, which is only one possibility. I will therefore extend CEGs to a new form called a Poisson CEG (PCEG), in which the response variable is assumed to come from a Poisson process, and I will develop the theory, methods, and applications of this new model.
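As a rough illustration of the count model described above, suppose an individual is observed for a time t and seizures arrive as a Poisson process with rate λ. The symbols N, λ and t are notation assumed here for illustration and are not taken from the project itself. The observed count would then follow:

```latex
N \mid \lambda, t \sim \mathrm{Poisson}(\lambda t),
\qquad
P(N = n \mid \lambda, t) = \frac{e^{-\lambda t} (\lambda t)^{n}}{n!},
\quad n = 0, 1, 2, \dots
```

With t fixed at one year, λ is simply the expected number of seizures per year.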

In many applications where the response variable could be assumed to come from a Poisson process, there are more zero counts than would be expected. This is known as zero-inflation, and can be modelled with a zero-inflated Poisson (ZIP) distribution. The ZIP is centred on the idea that not all zeroes are created equal: some individuals will never have a nonzero count, and are thus considered "risk free", while others may have a zero count but still be "at risk", and would have a nonzero count if observed for long enough. The ZIP aims to estimate the proportion of at-risk individuals and then their underlying rate, which would be underestimated if zero-inflation were not accounted for. I will discuss methods for incorporating zero-inflation into the CEG through the introduction of a latent risk state variable, which extends the PCEG to a Zero-inflated Poisson CEG (ZIPCEG).
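The mixture idea behind the ZIP can be sketched in a few lines of Python. This is a minimal illustration of the standard ZIP distribution only, not of the ZIPCEG or of any package code; the function names and the parameter values (a rate of 2.0 and an at-risk proportion of 0.6) are assumptions chosen for the example.

```python
import math


def zip_pmf(count, rate, at_risk_prob):
    """P(N = count) under a zero-inflated Poisson (illustrative sketch).

    With probability (1 - at_risk_prob) an individual is "risk free" and
    always has a zero count; otherwise the count is Poisson(rate).
    """
    poisson = math.exp(-rate) * rate ** count / math.factorial(count)
    if count == 0:
        # Zeroes come from both risk-free and at-risk individuals.
        return (1.0 - at_risk_prob) + at_risk_prob * poisson
    return at_risk_prob * poisson


def zip_mean(rate, at_risk_prob):
    """Expected count under the ZIP: only at-risk individuals contribute."""
    return at_risk_prob * rate


# Hypothetical parameter values, chosen only for illustration.
rate, at_risk_prob = 2.0, 0.6

# The probabilities sum to (numerically) one over a long enough range.
total = sum(zip_pmf(n, rate, at_risk_prob) for n in range(60))

# A plain Poisson fitted to ZIP data estimates its rate by the overall mean,
# which is smaller than the true rate among at-risk individuals.
naive_rate = zip_mean(rate, at_risk_prob)
```

Here `naive_rate` is the overall mean count, which is what a plain Poisson fit would target; it falls below `rate` whenever some individuals are risk free, which is the underestimation described above.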

As the number of covariates increases, the tree grows, which leads to sparse edge counts in the later parts of the tree, particularly when the overall sample size is insufficient. These sparse edge counts can lead to spurious and unreliable conclusions. I will propose various methods to address and alleviate sparse edge counts, culminating in the novel intermediate CEG, where conditional independence relations are asserted in order to decrease the size of the tree. These methods will be demonstrated using real-world data.

To further the development of CEGs, the existing software and packages in R must be made user-friendly, particularly for fitting and graphing CEGs. One of the main R packages focused on CEGs, ceg, has several bugs and is not actively maintained. I plan to publish my own package, pcegr, which uses the graphing methods of ceg but adds the functionality to fit CEGs, PCEGs and ZIPCEGs, as well as to perform variable discretisation.

Student:

Conor Hughes

Period of Study:

Oct 20 - Oct 24

Funder:

EPSRC

Project Status:

Closed

Project Category:

Studentship

Project Reference:

2440874

Research Topic:

Unclassified

Organisations

University of Warwick (Lead Research Organisation)

People	ORCID iD
Jane Hutton (Primary Supervisor)	http://orcid.org/0000-0003-0963-9997
Conor Hughes (Student)

Studentship Projects

Project Reference	Relationship	Related To	Start	End	Student Name
EP/V520226/1			30/09/2020	31/10/2025
2440874	Studentship	EP/V520226/1	04/10/2020	04/10/2024	Conor Hughes
