Learning Sparse Features from 4D fMRI Data for Brain Disease Diagnosis

Lead Research Organisation: University of Sheffield
Department Name: Computer Science


Machine learning endows computers with the ability to learn from data to help solve real-world problems. Due to the growth of big data, machine learning methods have become increasingly important tools in a wide range of applications including bioinformatics, computer vision, economics, and medicine. This project investigates machine learning for extracting useful information from fMRI data to help clinicians make more accurate diagnoses for certain brain diseases and develop more effective treatments for them.

Currently, deep learning is the most popular machine learning method. However, it has highly complex architectures and needs vast amounts of data to learn a huge number of parameters. This leads to difficulties when the number of data examples available (n) is very small compared to the number of features in each data example (p), which is the "large p, small n" problem. Indeed, Geoff Hinton, the godfather of deep learning, said recently: "One problem we still haven't solved is getting neural nets to generalise well from small amounts of data".

Most existing solutions for the "large p, small n" problem represent data as vectors. With growing data dimensionality, such vector-based methods become inadequate for severe "large p, small n" problems, e.g., machine learning on fMRI data. fMRI data are sequences of 3D volumes, i.e., 4D data. They are noisy, big, and multidimensional, making comprehensive manual analysis infeasible and machine learning challenging. A typical whole-brain fMRI scan sequence has tens of millions of features (voxel measurements), with a file size over 100MB. For such data, even a simple linear basis needs tens of millions of parameters (deep learning will need far more), but in practice we often have sequences for only dozens of individuals in a particular fMRI study due to the high cost of scanning.

Therefore, we aim to develop a new machine learning method for severe cases of "large p, small n" for multidimensional data such as whole-brain fMRI. We will take a tensor-based approach, where a tensor refers to a multidimensional array. Tensor-based methods have a much smaller number of parameters than vector-based ones. For the typical whole-brain fMRI data above, a tensor-based multilinear basis needs only a few hundred parameters, several orders of magnitude fewer than those needed by a vector-based, linear basis. We will generalise the state-of-the-art sparse feature learning methods for vector input to tensor-based ones for tensor input.
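
The parameter counts above can be checked with a short calculation. The sketch below is illustrative only: the scan dimensions (64 x 64 x 40 voxels, 150 volumes) are an assumption chosen to give "tens of millions" of features, and the function names are hypothetical. It compares one linear basis vector over the vectorised scan with one rank-one multilinear basis that has a separate weight vector per tensor mode.

```python
from math import prod

# Hypothetical scan dimensions (an assumption, not taken from the project):
# 64 x 64 x 40 voxels per volume, 150 volumes in the sequence.
shape = (64, 64, 40, 150)


def linear_basis_params(shape):
    """Vector-based: one weight per voxel measurement in the flattened scan."""
    return prod(shape)


def multilinear_basis_params(shape):
    """Tensor-based: one weight vector per mode (rank-one multilinear basis)."""
    return sum(shape)


vector_params = linear_basis_params(shape)
tensor_params = multilinear_basis_params(shape)

print(vector_params)  # 24576000
print(tensor_params)  # 318
```

Under these assumed dimensions the multilinear basis uses roughly five orders of magnitude fewer parameters, which is consistent with the "few hundred" versus "tens of millions" comparison in the text.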

This will be the first study to learn sparse features directly from tensor representations of multidimensional data in a scalable and interpretable way. We will apply our algorithms to a large fMRI dataset on attention deficit hyperactivity disorder (ADHD) to accomplish two major tasks: prediction and interpretation. Firstly, we will detect ADHD and classify its subtypes via a small number of automatically selected voxels. Secondly, collaborating with a brain imaging expert, we will analyse the connectivity of brain regions corresponding to selected voxels to interpret the classification results, gain insights, and identify biomarkers to assist clinicians in further diagnosis and treatment. Our results will be fully reproducible, with the dataset in the public domain and our software to be released as open source. The success of this project will advance the state of the art in machine learning and provide a new enabling software tool for applications with severe "large p, small n" problems such as medical imaging with high-cost scanners (e.g., MRI or 3D mammography machines) and translational bioinformatics with big genomic data.

Planned Impact

This research will contribute to strengthening the UK's world-leading position in both machine learning and brain imaging, with impact spanning four main areas.

A. Economy: Machine learning is an increasingly important economic driving force. With big data as the "fuel", machine learning is the engine turning big data into great power in artificial intelligence applications for the next industrial revolution. This project focuses on expensive but limited "fuel", with which state-of-the-art machine learning methods have difficulties. Its success will lead to significant cost savings in data collection such as fMRI scans (about 400 GBP per scan) and other expensive medical devices. Successful machine learning from a smaller amount of data will greatly help with the analysis of rare diseases or disorders such as prion disease, where it is impossible to get large numbers of subjects. It will also enable data exploitation for new data analytic problems such as genomic data analysis in translational bioinformatics for personalised medicines. On the other hand, this project will help brain disease diagnosis, which will lead to significant healthcare cost savings and reduce many other direct or indirect costs. For example, autism and dementia alone cost the UK economy 32 and 17 billion GBP per year, respectively, and they both can benefit from this project. In the long term, this project will also benefit pharmaceutical companies with an interest in developing drugs for brain diseases.

B. Society: Machine learning and artificial intelligence have significantly impacted our society in the past few decades. Nowadays, our everyday life has been significantly changed by, and heavily depends on, machine learning, from shopping, socialising, and entertainment to healthcare, banking, and job hunting. This project will advance machine learning technology and thus further drive the impact of machine learning on improving our quality of life, with healthcare being the most direct area. It targets brain diseases, such as autism and dementia, which are major societal challenges affecting children and the elderly respectively. It will help clinicians better detect and classify such brain diseases, understand their causes and different patterns, and eventually find effective treatments for their patients. In general, it will also help the diagnosis and treatment of rare or new diseases where only a limited number of data examples are available.

C. Knowledge: This project will advance the state-of-the-art research in tensor-based machine learning and whole-brain fMRI data analysis, which are both areas of great importance. The findings of this project will also impact closely related research fields including bioinformatics, computer vision, NLP, speech and language modelling, and even mathematics and statistics. Furthermore, while focusing on brain imaging applications, the method to be developed is general in nature, like Lasso or Elastic Net, so it will benefit researchers in many other areas of science and engineering in solving problems of a similar nature. This project is itself interdisciplinary, which will foster more interdisciplinary work along this direction and beyond.

D. People: This project will have a positive impact on the careers of the PI, RA, and collaborator. We will all gain additional knowledge and experience in machine learning, tensor analysis, and brain imaging. Such exposure will help us build a larger network of potential collaborators for future funding proposals. It will advance the RA's career with additional skills and benefit the PI's students via small projects on related problems. Despite his recent success with Hong Kong grants, the PI has not yet been involved in UK-funded projects. Having an EPSRC first grant is a very important step in his academic career, helping him consolidate his new lecturer position and raise his international profile in machine learning.
Description 1. Sturm: A New Sparse Tubal-Regularized Multilinear Regression Method for fMRI

While functional magnetic resonance imaging (fMRI) is important for healthcare/neuroscience applications, it is challenging to classify or interpret due to its multi-dimensional structure, high dimensionality, and small number of samples available. Recent tensor-based sparse multilinear regression methods are emerging as promising solutions for fMRI, yet existing works rely on unfolding/folding operations and a tensor rank relaxation with limited tightness. The newly proposed tensor singular value decomposition (t-SVD) sheds light on new directions. In this work, we study t-SVD for sparse multilinear regression and propose a Sparse tubal-regularized multilinear regression (Sturm) method for fMRI. Specifically, the Sturm model performs multilinear regression with two regularization terms: a tubal tensor nuclear norm based on t-SVD and a standard L1 norm. We further derive the algorithm under the alternating direction method of multipliers framework. We perform experiments on four classification problems, including both resting-state fMRI for disease diagnosis and task-based fMRI for neural decoding. The results show the superior performance of Sturm in classifying fMRI using just a small number of voxels.
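
As a reading aid, the two-term regularization described above can be written schematically as follows. This is a sketch under assumptions rather than the published formulation: the notation (M training samples, input tensors X_i, labels y_i, coefficient tensor W, loss ell, and weights tau and lambda) is illustrative and not taken from the text, and the exact loss and scaling used in the paper may differ.

```latex
\min_{\mathcal{W}} \;
  \frac{1}{M} \sum_{i=1}^{M}
    \ell\bigl( y_i, \langle \mathcal{X}_i, \mathcal{W} \rangle \bigr)
  \; + \; \tau \, \lVert \mathcal{W} \rVert_{\mathrm{TNN}}
  \; + \; \lambda \, \lVert \mathcal{W} \rVert_{1}
```

Here the TNN term stands for the tubal tensor nuclear norm defined through the t-SVD, which encourages low tubal rank, while the L1 term encourages a sparse coefficient tensor so that only a small number of voxels carry non-zero weight.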

Published paper: https://link.springer.com/chapter/10.1007/978-3-030-32692-0_30

2. DawfMRI: Improving Whole-Brain Neural Decoding of fMRI with Domain Adaptation

In neural decoding, there has been a growing interest in machine learning on whole-brain functional magnetic resonance imaging (fMRI). However, the size discrepancy between the feature space and the training set poses serious challenges. Simply increasing the number of training examples is infeasible and costly. In this paper, we proposed a domain adaptation framework for whole-brain fMRI (DawfMRI) to improve whole-brain neural decoding on target data by leveraging pre-existing source data. DawfMRI consists of three steps: 1) feature extraction from whole-brain fMRI, 2) source and target feature adaptation, and 3) source and target classifier adaptation. We evaluated its eight possible variations, including two non-adaptation and six adaptation algorithms, using a collection of seven task-based fMRI datasets (129 unique subjects and 11 cognitive tasks in total) from the OpenNeuro project. The results demonstrated that an appropriate source domain can help improve neural decoding accuracy for challenging classification tasks. The best-case improvement is 8.94 percentage points (from 78.64% to 87.58%). Moreover, we discovered a plausible relationship between psychological similarity and adaptation effectiveness. Finally, visualizing and interpreting voxel weights showed that the adaptation can provide additional insights into neural decoding.

Published paper: https://link.springer.com/chapter/10.1007/978-3-030-32692-0_31

3. Side Information Dependence as a Regularizer for Analyzing Human Brain Conditions across Cognitive Experiments

The increasing number of public neuroimaging datasets opens the door to analyzing homogeneous human brain conditions across datasets by transfer learning (TL). However, neuroimaging data are high-dimensional and noisy, and have small sample sizes. It is challenging to learn a robust model for data across different cognitive experiments and subjects. A recent TL approach minimizes domain dependence to learn common cross-domain features, via the Hilbert-Schmidt Independence Criterion (HSIC). Inspired by this approach and the multisource TL theory, we propose a Side Information Dependence Regularization (SIDeR) learning framework for TL in brain condition decoding. Specifically, SIDeR simultaneously minimizes the empirical risk and the statistical dependence on the domain side information, to reduce the theoretical generalization error bound. We construct 17 brain decoding TL tasks using public neuroimaging data for evaluation. Comprehensive experiments validate the superiority of SIDeR over ten competing methods, particularly an average improvement of 15.6% on the TL tasks with multi-source experiments.
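
To make the dependence term concrete, the sketch below computes the standard biased empirical HSIC estimate, tr(KHLH) / (n - 1)^2, between a feature and one-hot domain side information. It is a minimal illustration under assumptions, not the SIDeR implementation: the linear kernel, the toy data, and all function names are hypothetical, and the optimisation that SIDeR performs is not shown.

```python
def linear_kernel(rows):
    """Gram matrix of inner products between sample vectors."""
    return [[sum(a * b for a, b in zip(r, s)) for s in rows] for r in rows]


def center(K):
    """Double-centre a Gram matrix, i.e. compute H K H with H = I - (1/n) 11^T."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    col_means = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row_means) / n
    return [[K[i][j] - row_means[i] - col_means[j] + total
             for j in range(n)] for i in range(n)]


def hsic(K, L):
    """Biased empirical HSIC: tr(K H L H) / (n - 1)^2 = tr((H K H) L) / (n - 1)^2."""
    n = len(K)
    Kc = center(K)
    trace = sum(Kc[i][j] * L[j][i] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


# Toy data (assumed): four samples, the first two from domain A, the rest from B.
domains = [[1, 0], [1, 0], [0, 1], [0, 1]]   # one-hot domain side information
feat_dependent = [[1.0], [1.0], [-1.0], [-1.0]]    # value tracks the domain
feat_independent = [[1.0], [-1.0], [1.0], [-1.0]]  # balanced within each domain

L = linear_kernel(domains)
hsic_dependent = hsic(linear_kernel(feat_dependent), L)
hsic_independent = hsic(linear_kernel(feat_independent), L)
print(hsic_dependent, hsic_independent)
```

In this toy case the feature that tracks the domain gets a clearly positive HSIC value, while the feature that is balanced within each domain gets a value of zero. A regularizer that penalises this quantity therefore favours features that carry little information about which domain a sample came from.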

Published in AAAI2020: http://eprints.whiterose.ac.uk/154983/

4. Improving multi-site autism classification based on site-dependence minimisation and second-order functional connectivity

Autism spectrum disorder (ASD) has no objective diagnosis method despite having a high prevalence. Machine learning has been widely used to develop classification models for ASD using neuroimaging data. Recently, studies have shifted towards using large multi-site neuroimaging datasets to boost the clinical applicability and statistical power of results. However, the classification performance is hindered by the heterogeneous nature of agglomerative datasets. In this paper, we propose new methods for multi-site autism classification using the Autism Brain Imaging Data Exchange (ABIDE) dataset. We first propose a new second-order measure of functional connectivity (FC), named Tangent Pearson embedding, to extract better features for classification. Then we assess the statistical dependence between acquisition sites and FC features, and apply a domain adaptation approach to minimise the site dependence of FC features to improve classification. Our analysis shows that 1) statistical dependence between site and FC features is statistically significant at the 5% level, and 2) extracting second-order features from neuroimaging data and minimising their site dependence can improve over state-of-the-art classification results on the ABIDE dataset, achieving a classification accuracy of 73%.
Preprint: https://www.biorxiv.org/content/10.1101/2020.02.01.930073v1

5. A machine learning cardiac magnetic resonance approach to extract disease features and automate pulmonary arterial hypertension diagnosis
Aims: Pulmonary arterial hypertension (PAH) is a progressive condition with high mortality. Quantitative cardiovascular magnetic resonance (CMR) imaging metrics in PAH target individual cardiac structures and have diagnostic and prognostic utility but are challenging to acquire. The primary aim of this study was to develop and test a tensor-based machine learning approach to holistically identify diagnostic features in PAH using CMR, and secondarily, visualize and interpret key discriminative features associated with PAH.
Methods and results: Consecutive treatment-naive patients with PAH or no evidence of pulmonary hypertension (PH), undergoing CMR and right heart catheterization within 48 h, were identified from the ASPIRE registry. A tensor-based machine learning approach, multilinear subspace learning, was developed and the diagnostic accuracy of this approach was compared with standard CMR measurements. Two hundred and twenty patients were identified: 150 with PAH and 70 with no PH. The diagnostic accuracy of the approach was high as assessed by area under the curve at receiver operating characteristic analysis (P < 0.001): 0.92 for PAH, slightly higher than standard CMR metrics. Moreover, establishing the diagnosis using the approach was less time-consuming, being achieved within 10 s. Learnt features were visualized in feature maps with correspondence to cardiac phases, confirming known and also identifying potentially new diagnostic features in PAH.
Conclusion: A tensor-based machine learning approach has been developed and applied to CMR. High diagnostic accuracy has been shown for PAH diagnosis and new learnt features were visualized with diagnostic potential.

Published: https://doi.org/10.1093/ehjci/jeaa001
Exploitation Route 1. Sturm (Sparse Tubal-Regularized Multilinear Regression): This work paves the way for further research on learning with sparsity and brain imaging. The proposed method can inspire further extensions and find potential applications in areas including machine learning and computer vision. It provides additional tools to neuroscientists and clinicians in analysing brain imaging data. We are also exploring its applications in other medical imaging domains.

2. DawfMRI (domain adaptation framework for whole-brain fMRI): This approach offers an important alternative for dealing with the limited number of samples in brain imaging. It can lead to further methodological development in transfer learning for fMRI data and enable fuller utilisation of existing data resources. This will release the true potential of public data collection efforts, such as the UK Biobank, in advancing healthcare, neuroscience, and other scientific domains.

3 & 4. Domain-independent machine learning: This is an important tool for integrating data obtained from different sources where there are domain differences. It also allows us to incorporate domain/prior knowledge into our model to learn better features.

5. Tensor-based MRI machine learning pipeline: This pipeline has been extended to cardiac MRI. We have filed a patent, published a paper in a medical journal and obtained a grant of £639,873. This also helped the PI to grow his network, with several other projects in discussion.
Sectors Digital/Communication/Information Technologies (including Software),Healthcare,Pharmaceuticals and Medical Biotechnology

URL https://link.springer.com/chapter/10.1007/978-3-030-32692-0_30
Description MSc projects
Geographic Reach Local/Municipal/Regional 
Policy Influence Type Influenced training of practitioners or researchers
Impact Trained postgraduate students on AI for Medical Imaging.
Description Amazon Research Award 2018
Amount $71,000 (USD)
Funding ID N.A. 
Organisation Amazon.com 
Sector Private
Country United States
Start 04/2019 
End 03/2020
Description Developing a Machine Learning Tool to Improve Prognostic and Treatment Response Assessment on Cardiac MRI Data
Amount £639,873 (GBP)
Funding ID 159851 
Organisation Wellcome Trust 
Sector Charity/Non Profit
Country United Kingdom
Start 07/2019 
End 06/2021
Description Chris Cox (Department of Psychology, Louisiana State University) 
Organisation Louisiana State University
Country United States 
Sector Academic/University 
PI Contribution We have one manuscript released on bioRxiv and another submitted. Our collaboration is still in progress.
Collaborator Contribution Brain imaging expertise. Domain knowledge that we lack. Dr Chris Cox is now with the Department of Psychology, Louisiana State University.
Impact Two papers published: MLMI19 and AAAI20.
Start Year 2018
Description Gaolang Gong - Beijing Normal University 
Organisation Beijing Normal University
Country China 
Sector Academic/University 
PI Contribution We develop novel machine learning algorithms.
Collaborator Contribution Prof. Gong helps us design the experiments and interpret the results.
Impact Paper preprint at https://biorxiv.org/cgi/content/short/2020.02.01.930073v1
Start Year 2019
Description Jun Ma - Amazon, US 
Organisation Amazon.com
Country United States 
Sector Private 
PI Contribution We develop new machine learning algorithms.
Collaborator Contribution Jun provides supervision to my PhD student and helps us steer towards important directions. He also offers technical advice.
Impact One paper submitted, now under review. Co-organised a challenge.
Start Year 2019
Title Tensor based feature learning analysis approach to interrogate medical images 
Description Tensor based machine learning for interrogating medical images. 
IP Reference PCT/GB2019/053014 
Protection Patent application published
Year Protection Granted
Licensed No
Impact Improve diagnosis and prognosis of cardiovascular diseases and other diseases that can be helped with medical imaging.
Description Interpretable Machine Learning 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact This event had about 200 delegates from academia, industry, hospitals, etc., and sparked interesting discussions with clinicians in STH during and after the talk.
Year(s) Of Engagement Activity 2019
URL https://insigneo.org/wp-content/uploads/Agenda-v15-Posters-Map-BackPage-underline-pdf-x-version.pdf
Description Interpretable Machine Learning - Insigneo showcase 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact I was invited to give a talk at this event.
Year(s) Of Engagement Activity 2019
URL https://insigneo.org/is2019/