An Active Learning Approach to Network Inference

Lead Research Organisation: University of Bristol
Department Name: Engineering Mathematics and Technology

Abstract

Understanding pathway structures is crucial to understanding the functional organisation of genes and proteins, and abnormally functioning pathways underlie many human diseases. Given the extent of noise in biological datasets and the limited amount of data available, unsupervised determination of network topology is a substantially under-determined inference problem. In this project we therefore focus on network completion, a supervised learning task involving multiple types of data. Starting from a set of known links and non-links, we construct a classifier which predicts and ranks further possible functional links for experimental validation. A parallel experimental programme will follow an active learning approach to network inference: predicted functional links will be investigated in the laboratory, and the results will, in turn, be used to improve the predictor. To provide a focus for the accompanying laboratory work by the medical researchers associated with the project, our principal aim will be the discovery and validation of pathway structures associated with hypertension. Hypertension is the most common cause of preventable disease in the developed world.
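
The predict, rank, validate and retrain cycle described above can be sketched as follows. This is a minimal illustration, not the project's classifier: the single RBF kernel (standing in for the multiple-kernel setting), the scoring rule, the two-dimensional feature vectors, the link names and the `oracle` function (standing in for laboratory validation) are all assumptions made for the example.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """RBF similarity between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def score(candidate, links, non_links):
    """Mean similarity to known links minus mean similarity to known non-links."""
    pos = sum(rbf_kernel(candidate, x) for x in links) / len(links)
    neg = sum(rbf_kernel(candidate, x) for x in non_links) / len(non_links)
    return pos - neg

def rank_candidates(candidates, links, non_links):
    """Return candidate names ordered from most to least link-like."""
    return sorted(candidates,
                  key=lambda name: score(candidates[name], links, non_links),
                  reverse=True)

def active_learning_round(candidates, links, non_links, oracle, batch=1):
    """Validate the top-ranked candidates and fold the answers into the training set."""
    queried = rank_candidates(candidates, links, non_links)[:batch]
    for name in queried:
        features = candidates.pop(name)
        (links if oracle(name) else non_links).append(features)
    return queried

# Hypothetical toy data: one feature vector per gene pair.
links = [(1.0, 1.0), (0.9, 0.8)]          # known functional links
non_links = [(0.0, 0.0), (0.1, 0.2)]      # known non-links
candidates = {"A-B": (0.95, 0.9), "C-D": (0.05, 0.1), "E-F": (0.5, 0.5)}

initial_ranking = rank_candidates(candidates, links, non_links)

# Stand-in for the experiment: here only the pair "A-B" is confirmed as a link.
queried = active_learning_round(candidates, links, non_links,
                                oracle=lambda name: name == "A-B")
```

Each round shrinks the candidate pool and enlarges the labelled set, so the next ranking is computed from more evidence. A real system would replace the similarity score with a trained classifier and might query uncertain candidates rather than only the top-ranked ones.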

Planned Impact

The project has a wide range of potential beneficiaries. The proposed innovations in machine learning, in multiple kernel learning and active learning, could be applied across many disciplines, from recognising hand-written digits to face identification, text categorisation, bioinformatics and database marketing. The proposed contributions therefore have potential applications well beyond the indicated biomedical application. Other topics we cover are also of interest within machine learning and its application domains, such as network inference, data cleaning, outlier detection and learning with label noise. The project would also be of interest to bioinformatics and medical researchers working on network inference. The inference of transcriptional regulatory networks is very important for advancing our understanding of the complex regulatory mechanisms within cellular systems: many diseases are associated with abnormally functioning pathways. Finally, the project would be of interest to medical researchers with an interest in the hypertensive state. Though the emphasis of the project is algorithmic innovation and a novel approach to network inference, we have necessarily had to focus on a specific biomedical context. Professor Murphy's prime interest is the hypertensive state, so the associated biological context is the understanding of the cellular regulatory networks associated with this condition. This is, in itself, a very important goal, given that hypertension is a significant contributor to premature mortality and morbidity and that, for the vast majority of cases (about 95%), the cause is unknown (essential hypertension). Hypertension is a major economic burden for society since it is a major cause of stroke, heart failure, renal failure, blindness and myocardial infarction.

Publications

 
Description With Prof. David Murphy's group (University of Bristol), we used a graphical lasso algorithm to search for the main regulators of hypertension, a condition that is prevalent but remains poorly understood. Using control versus disease-trait expression array data, we estimated a graph in which nodes represent genes; a node with large fan-out may indicate that the expressed product of the gene has a significant regulatory influence. This analysis identified CAPRIN2 as a gene of interest. As CI, with David Murphy as PI, we have recently been awarded 1.3 million from BBSRC (BB/R016879/) for further investigation of this topic.
Exploitation Route We are investigating this topic further under the BBSRC grant (BB/R016879/).
Sectors Healthcare, Pharmaceuticals and Medical Biotechnology
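
The fan-out criterion mentioned in the entry above can be illustrated with a short sketch. It assumes a sparse precision (inverse covariance) matrix has already been estimated, for example by a graphical lasso; the matrix values, the gene names `g1` to `g4` and the tolerance are hypothetical and are not results from the study. Because a graphical lasso yields an undirected graph, fan-out is taken here to mean node degree.

```python
def fan_out(precision, genes, tol=1e-8):
    """Count, for each gene, the non-zero off-diagonal entries in its row.

    A non-zero entry precision[i][j] is read as an edge between genes i and j.
    """
    degrees = {}
    for i, gene in enumerate(genes):
        degrees[gene] = sum(
            1 for j, value in enumerate(precision[i])
            if j != i and abs(value) > tol
        )
    return degrees

# Hypothetical 4-gene precision matrix (symmetric, zeros mean "no edge").
genes = ["g1", "g2", "g3", "g4"]
precision = [
    [ 2.0, -0.6, -0.4, -0.3],
    [-0.6,  1.5,  0.0,  0.0],
    [-0.4,  0.0,  1.2,  0.0],
    [-0.3,  0.0,  0.0,  1.1],
]

degrees = fan_out(precision, genes)

# Genes ordered from highest to lowest degree; the first is the candidate hub.
hubs = sorted(genes, key=lambda g: degrees[g], reverse=True)
```

In this toy matrix `g1` is connected to every other gene, so it ranks first as a candidate regulator. High degree is only suggestive of regulatory influence and would need experimental follow-up.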

 
Description Su-Yi Loh, Thomas Jahans-Price, Michael Greenwood, Mingkwan Greenwood, See-Ziau Hoe, Agnieszka Konopacka, Colin Campbell, David Murphy, and Charles Hindmarch. Unsupervised network analysis of the plastic supraoptic nucleus transcriptome predicts Caprin-2 regulatory interactions. eNeuro 0243-17 (2017). We discovered a main regulator (CAPRIN2) of hypertension in humans using a network inference method based on the graphical lasso algorithm.
First Year Of Impact 2017
Sector Healthcare, Pharmaceuticals and Medical Biotechnology
 
Title Sequence variant predictor (human disease) 
Description Available at http://fathmm.biocompute.org.uk. 
Type Of Material Model of mechanisms or symptoms - human 
Year Produced 2014 
Provided To Others? Yes  
Impact Plugin for highly used software tools such as Ensembl variant effect predictor VEP and COSMIC (Sanger Centre) 
URL http://fathmm.biocompute.org.uk