Towards a viable Artificial Intelligence companion for the healthcare management of Non-Alcoholic Fatty Liver Disease (NAFLD)

Lead Research Organisation: Newcastle University
Department Name: Sch of Computing


NAFLD is the unhealthy build-up of fat in hepatocytes in individuals with little to no alcohol consumption. NAFLD encompasses a broad spectrum of disease severity: early-stage steatosis is largely benign, whereas later-stage fibrosis and cirrhosis are strongly linked to more severe liver conditions and mortality. A major challenge of NAFLD is that progression from benign to more serious stages is highly variable; it is therefore difficult to assess which characteristics contribute to advanced-stage progression and when it will occur. A further issue is that the gold standard for NAFLD diagnosis is liver biopsy, a highly invasive procedure that is often not performed until the later, irreversible stages of the disease.

The high-level aim of this project is to develop Data Science and Machine Learning methods that support a precision medicine solution to NAFLD, fostering robust early diagnosis, accurate risk stratification and precise prognosis. Researchers have previously made several attempts to develop an AI companion for NAFLD. These have focused strongly on replacing current invasive diagnostic measures with data obtained from routine clinical procedures, and on identifying novel combinations of biomarkers that can replace existing surrogate scores of disease severity. These studies have typically applied machine learning (ML) or deep learning (DL) algorithms to data collected through non-invasive modalities, such as blood serum analysis. Although some have yielded promising results, several limitations exist. Almost all studies have focused on the diagnosis of NAFLD, omitting prognosis and time-to-event analysis. Data Science issues have also limited the findings of these studies. Medical datasets are characteristically scarce and sparse, so the training sets used by ML algorithms are significantly smaller than the original cohort. The limited interpretability of neural networks creates a trade-off between accuracy and understanding: the most accurate algorithms are often the least well understood. There is also a lack of multi-omics data usage and a lack of heterogeneity in the populations studied.
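
The shrinkage of training sets relative to the original cohort can be illustrated with a minimal sketch. It assumes a complete-case approach, in which any patient missing a required measurement is dropped; the toy cohort, the field names (alt, ast, platelets) and the helper complete_cases are hypothetical and are not taken from the registry or from any cited study.

```python
from typing import Optional

# Hypothetical toy cohort: one record per patient, None marks a missing value.
# Field names are illustrative only and do not come from the registry.
Record = dict[str, Optional[float]]


def complete_cases(cohort: list[Record], features: list[str]) -> list[Record]:
    """Keep only patients with a recorded value for every requested feature."""
    return [r for r in cohort if all(r.get(f) is not None for f in features)]


cohort: list[Record] = [
    {"alt": 42.0, "ast": 35.0, "platelets": 210.0},
    {"alt": 88.0, "ast": None, "platelets": 150.0},
    {"alt": None, "ast": 61.0, "platelets": None},
    {"alt": 55.0, "ast": 47.0, "platelets": None},
    {"alt": 31.0, "ast": 29.0, "platelets": 260.0},
]

features = ["alt", "ast", "platelets"]
usable = complete_cases(cohort, features)
retained = len(usable) / len(cohort)

# Each extra required feature can only remove patients, never add them back.
print(f"{len(usable)} of {len(cohort)} patients usable ({retained:.0%})")
```

In this toy example only two of the five patients survive the filter once all three features are required, whereas four would remain if only alt were needed. Imputation is a common alternative to dropping incomplete records, at the cost of introducing its own modelling assumptions.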

With the strengths and limitations of previous work in mind, we currently have four broad objectives. The first is to integrate phenotypic multi-omics data and histological features of hepatic damage to identify subpopulations of patients with specific disease drivers. Secondly, we wish to develop predictive disease models on an individual and sub-population basis to identify the factors that contribute to the longitudinal progression of NAFLD, by combining traditional supervised ML and DL classification methods with survival analysis. Thirdly, we hope to identify patient-specific clinical biomarkers that reflect the grade of steatosis, steatohepatitis (NASH) and the stage of fibrosis; through this we aim to be able to diagnose and prognose individuals via routine blood tests rather than biopsy. Our final objective is to compare topological data analysis and machine learning for the classification of digitized biopsy images, in order to distinguish benign steatosis of the liver from more serious NASH, fibrosis, and cirrhosis.

The project has so far used the European NAFLD registry, the largest NAFLD dataset in Europe. It comprises historic NAFLD cases as well as NAFLD cases acquired since 01/01/18 as part of the IMI-2 funded LITMUS project, an active study recruiting from >25 sites across 13 European countries. Patients within this registry are followed up for up to 10 years from baseline, allowing longitudinal analysis of patients scattered across the NAFLD disease progression spectrum. The data collected include an individual's clinical information, liver histopathology and biopsy samples, amongst others. Multi-omics data from the registry are also used to better understand inter-patient variability in hepatic injury.



Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/R51309X/1                                   30/09/2018  29/09/2023
2639670            Studentship   EP/R51309X/1  30/09/2021  30/03/2025
EP/T517914/1                                   30/09/2020  29/09/2025
2639670            Studentship   EP/T517914/1  30/09/2021  30/03/2025