Investigating information flow through complex biological networks

Lead Research Organisation: University of Birmingham
Department Name: School of Computer Science


The aim of this research project is to model complex systems observed in nature using annotated graph representations. We shall then develop novel methods for low-dimensional non-Euclidean embedding of such systems that aim to preserve both structural and attribute information. We hope that these embeddings will provide insight into the systems under study by uncovering a hierarchy in their formation. We also aim to use the embeddings to better predict both missing links in the observed structure of the system and missing labels for nodes within the graph.

Specifically, we will use methods derived from natural language processing to embed nodes close to similar nodes while keeping them far from dissimilar nodes. Ours will be the first work to adapt these techniques to embed into a Riemannian manifold in Minkowski spacetime using attributes as well as topological structure.
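As a minimal illustration of what "close" and "far" can mean in such a space, the sketch below assumes the hyperboloid model of hyperbolic space, which is one common Riemannian manifold sitting inside Minkowski spacetime; the project description does not say which manifold will be used. The function names and the example coordinates are hypothetical and chosen only for illustration.

```python
import numpy as np

def minkowski_dot(u, v):
    """Minkowski bilinear form, time-like coordinate first: -u0*v0 + sum_i ui*vi."""
    return -u[0] * v[0] + np.dot(u[1:], v[1:])

def lift_to_hyperboloid(x):
    """Map a point x in R^n onto the hyperboloid {p : <p, p> = -1, p0 > 0}."""
    x = np.asarray(x, dtype=float)
    return np.concatenate(([np.sqrt(1.0 + np.dot(x, x))], x))

def hyperbolic_distance(u, v):
    """Geodesic distance between two hyperboloid points: arccosh(-<u, v>)."""
    # Clipping guards against arguments slightly below 1 caused by rounding.
    return np.arccosh(np.clip(-minkowski_dot(u, v), 1.0, None))

# Hypothetical embeddings: an anchor node, a similar node, a dissimilar node.
anchor = lift_to_hyperboloid([0.1, 0.2])
similar = lift_to_hyperboloid([0.15, 0.25])
dissimilar = lift_to_hyperboloid([2.0, -1.5])

d_similar = hyperbolic_distance(anchor, similar)
d_dissimilar = hyperbolic_distance(anchor, dissimilar)
```

Under this assumption, a training objective would aim to make `d_similar` small and `d_dissimilar` large for node pairs judged similar or dissimilar by their attributes and topology.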

Following this, we shall research the use of features that can be extracted from these systems: groups of highly similar subnetworks of nodes that form dense regions of the embedding space. We hypothesise that these subnetworks will be useful for classifying samples, a task that would otherwise be difficult because of the enormous search space of the original network. We shall use the inherently hierarchical nature of the embedding space to convert these subnetworks into decision trees. Using networks to guide the construction of Random Forests is a very new development (Dutkowski and Ideker, 2011). However, our work will be the first to directly incorporate prior knowledge into the forest construction, via the annotation of nodes with attributes, and also the first method to use network representation learning techniques in this process.

Dutkowski, J., & Ideker, T. (2011). Protein networks as logic functions in development and cancer. PLoS Computational Biology, 7(9), e1002180.



Studentship Projects

Project Reference | Relationship | Related To   | Start      | End        | Student Name
EP/N509590/1      |              |              | 01/10/2016 | 30/09/2021 |
1816042           | Studentship  | EP/N509590/1 | 26/09/2016 | 25/09/2019 | David McDonald