Gene coexpression network for the study of Rhizobium leguminosarum

Lead Research Organisation: University of Oxford
Department Name: Statistics

Abstract

Description and impact:
Rhizobium leguminosarum is a bacterium that, in association with legumes, fixes atmospheric nitrogen. The ammonia salts produced by the bacteria are consumed by the plants they are associated with.
Understanding this process, and all the genes and proteins involved, is fundamental to improving crop growth. This improvement will lead to more sustainable agriculture and will reduce fertilizer inputs as well as CO2 and N2O emissions.
Aims and objectives:
The objective of the project is to carry out a statistical and network analysis of cutting-edge gene
expression and global mutational data. The network analysis will enable genes needed for superior symbiotic performance to be identified and introduced into commercial inocula. These inocula are used both in the UK and around the world, enabling substantial yield gains and contributing to sustainable agriculture with reduced fertilizer inputs as well as reduced CO2 and N2O emissions.
Novelty of the research methodology
To create the gene coexpression network of the bacteria, we are using gene expression data (microarrays) from the bacteria under different growth conditions. These data are extremely rich but noisy. To extract as much precise information as possible from the original data, we have employed different data-preprocessing techniques. Examples include different normalization procedures (e.g. quantile normalization) and the removal of the lowest expressed values from each microarray. In addition, we have studied the effect of excluding different "genes" from the analysis, such as pseudogenes and genes that were not included across all the microarrays.
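As an illustration of the quantile normalization step mentioned above, here is a minimal sketch in Python with NumPy. It assumes a genes-by-samples matrix in which each column is one microarray; the function name, the toy values and the tie-breaking rule (order of appearance) are assumptions made for this example, not details of the project's pipeline.

```python
import numpy as np

def quantile_normalize(expr):
    """Quantile-normalize a genes x samples matrix (one microarray per column)."""
    expr = np.asarray(expr, dtype=float)
    # Rank of every value within its own column (ties broken by order of appearance).
    order = np.argsort(expr, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable")
    # Reference distribution: the mean of the sorted columns, one value per rank.
    reference = np.sort(expr, axis=0).mean(axis=1)
    # Replace each value by the reference value that has the same rank.
    return reference[ranks]

# Toy data: 4 genes measured on 3 microarrays (made-up values).
expr = np.array([[5.0, 4.0, 3.0],
                 [2.0, 1.0, 4.0],
                 [3.0, 4.0, 6.0],
                 [4.0, 2.0, 8.0]])
normalized = quantile_normalize(expr)
print(normalized)
```

After this step every microarray has the same empirical distribution of values, so the samples can be compared without array-wide intensity differences dominating the correlations.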
The main idea of the analysis is to calculate the correlation between the expression of each pair of studied genes. We then impose a threshold to select only the strongest relationships. We have employed different well-known correlation measures, such as the Pearson, Kendall and Spearman correlations. To select which method works best, we use Monte Carlo-based methods. We use biological information available in databases such as KEGG, BioCyc, OperonDB and STRING to select biologically related groups of genes. We compute the number of edges within each group and compare that value with the result of taking random genes from the network.
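The thresholding and Monte Carlo comparison described above can be sketched as follows. This is a hedged illustration, not the project's code: it uses Pearson correlation only, the threshold of 0.8, the synthetic data, the function names and the add-one p-value are all assumptions, and it reads "edges" as edges among the genes of one group.

```python
import numpy as np

def coexpression_adjacency(expr, threshold):
    """Connect genes whose absolute Pearson correlation exceeds `threshold`."""
    corr = np.corrcoef(expr)        # rows are genes, columns are samples
    adj = np.abs(corr) > threshold
    np.fill_diagonal(adj, False)    # no self-loops
    return adj

def edges_within(adj, genes):
    """Number of edges among the given genes (each edge counted once)."""
    sub = adj[np.ix_(genes, genes)]
    return int(sub.sum()) // 2

def monte_carlo_p_value(adj, group, n_draws=1000, seed=0):
    """Compare a group's edge count with random gene sets of the same size."""
    rng = np.random.default_rng(seed)
    observed = edges_within(adj, group)
    n_genes = adj.shape[0]
    null = np.array([
        edges_within(adj, rng.choice(n_genes, size=len(group), replace=False))
        for _ in range(n_draws)
    ])
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_draws)
    return observed, p_value

# Synthetic example: 40 genes, 30 samples, the first 5 genes share a signal.
rng = np.random.default_rng(42)
n_genes, n_samples = 40, 30
expr = rng.normal(size=(n_genes, n_samples))
shared = rng.normal(size=n_samples)
expr[:5] = shared + 0.1 * rng.normal(size=(5, n_samples))

adj = coexpression_adjacency(expr, threshold=0.8)
group = np.arange(5)                # stand-in for a pathway or operon gene set
observed, p_value = monte_carlo_p_value(adj, group)
print(observed, p_value)
```

Running the same comparison for networks built with different correlation measures gives one way to judge which measure recovers the known biological groups best.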
Lastly, to identify interesting groups of genes, we perform community detection on the network (Louvain method, configuration model). We have been able to detect clusters of genes involved in the same biological process (e.g. metabolic pathways). The long-term objective is to use this approach to find the genes in the network that are related to the nitrogen fixation process.
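The Louvain method searches for a partition that maximizes modularity, a score that compares the observed edges inside each community with the number expected under a configuration-model null (a random graph with the same degrees). The sketch below does not implement Louvain itself; it only evaluates the modularity of a given partition, so the quantity being optimized is concrete. The function name and the toy graph are assumptions for illustration.

```python
import numpy as np

def modularity(adj, labels):
    """Modularity of a partition of an undirected, unweighted graph.

    Q = (1 / 2m) * sum_ij (A_ij - k_i * k_j / 2m) * [c_i == c_j]
    where k_i * k_j / 2m is the configuration-model expectation.
    """
    adj = np.asarray(adj, dtype=float)
    labels = np.asarray(labels)
    degrees = adj.sum(axis=1)
    two_m = degrees.sum()                       # twice the number of edges
    expected = np.outer(degrees, degrees) / two_m
    same = labels[:, None] == labels[None, :]   # True where both nodes share a community
    return float(((adj - expected) * same).sum() / two_m)

# Toy graph: two triangles (nodes 0-2 and 3-5) joined by the single edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

q_split = modularity(adj, [0, 0, 0, 1, 1, 1])   # one community per triangle
q_single = modularity(adj, [0, 0, 0, 0, 0, 0])  # everything in one community
print(q_split, q_single)
```

Splitting the graph into its two triangles scores higher than keeping all nodes together, which is the kind of partition a modularity-maximizing method is designed to find.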
Alignment to EPSRC's strategies and research areas:
- Living With Environmental Change (LWEC): Using Rhizobium inocula as fertilizers would reduce fertilizer inputs as well as CO2 and N2O emissions, which would help to address environmental change.
- Mathematical sciences: To develop the interaction network, we are using and developing different mathematical and statistical tools that would also be useful for equivalent studies.
Companies or collaborators involved:
Nottingham-based Legume Technology Ltd, Units 3C & 3D, Eastbridgford Business Park, Kneeton Rd, Eastbridgford, Notts, NG13 8PJ. Legume Technology Ltd will provide advice from the beginning of the project onwards.

Publications

Studentship Projects

Project Reference | Relationship | Related To | Start | End | Student Name
EP/R512333/1 | | | 01/10/2017 | 30/09/2021 |
1950255 | Studentship | EP/R512333/1 | 01/10/2017 | 30/09/2021 | Javier Pardo Diaz
 
Description We are aiming to generate a gene coexpression network for Rhizobium leguminosarum. In this network, the nodes are the genes, and two genes are connected if they are coexpressed across the different samples in our input. The input is a collection of microarrays which measure the expression of each gene. The state-of-the-art method for generating gene coexpression networks is based on the Pearson correlation. We have found that distance correlation gives better results than Pearson correlation when generating gene coexpression networks. The networks based on distance correlation are more stable and capture more biological information.

We have constructed a gene coexpression network for Rhizobium leguminosarum using distance correlation. Using this network we have been able to identify groups of genes that are involved in the same molecular processes.
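To make the contrast between the two measures concrete, the following is a minimal sketch of the sample distance correlation between two expression profiles, written with NumPy. It uses the standard double-centred distance matrix definition; the function name and the toy vectors are assumptions for illustration, and this is not the project's R implementation.

```python
import numpy as np

def distance_correlation(x, y):
    """Sample distance correlation between two 1-D vectors of equal length."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise absolute differences within each vector.
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])
    # Double centring: subtract row and column means, add back the grand mean.
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()
    dcov2 = (A * B).mean()          # squared distance covariance
    dvar2_x = (A * A).mean()        # squared distance variance of x
    dvar2_y = (B * B).mean()        # squared distance variance of y
    denom = np.sqrt(dvar2_x * dvar2_y)
    if denom == 0:
        return 0.0                  # at least one vector is constant
    return float(np.sqrt(max(dcov2, 0.0) / denom))

x = np.linspace(-1.0, 1.0, 21)
linear = 2.0 * x + 1.0              # perfectly linear relationship
quadratic = x ** 2                  # dependent on x, but not linearly

pearson_quadratic = np.corrcoef(x, quadratic)[0, 1]
dcor_linear = distance_correlation(x, linear)
dcor_quadratic = distance_correlation(x, quadratic)
print(pearson_quadratic, dcor_linear, dcor_quadratic)
```

In this toy case the Pearson correlation of the quadratic pair is essentially zero even though the second profile is fully determined by the first, while the distance correlation is clearly positive. Sensitivity to such non-linear dependence is the general property that distinguishes distance correlation from Pearson correlation.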
Exploitation Route We are preparing an R package which allows other users to generate their own gene coexpression networks using our methodology.
We are also planning to submit the current results and the code used so that they are open to the scientific community.
Sectors Agriculture, Food and Drink; Digital/Communication/Information Technologies (including Software); Environment; Healthcare; Pharmaceuticals and Medical Biotechnology

 
Description 21st Congress on Nitrogen Fixation - 10th-15th Oct 2019, Wuhan, China - Philip Poole 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Phil gave a talk at this international conference. He received many questions about his work and spent time exchanging ideas with colleagues in this research area.
Year(s) Of Engagement Activity 2019
URL http://2019icnf.csp.escience.cn/dct/page/65580
 
Description ComplexNetworks 2019 Conference Poster presentation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Poster presentation at the 8th conference on Complex Networks and their Applications
Year(s) Of Engagement Activity 2019
URL https://www.complexnetworks.org/
 
Description ISMB/ECCB 2019 Conference Poster presentation and short talk 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation of a poster and a short talk at the ISMB/ECCB 2019 Conference
Year(s) Of Engagement Activity 2019
URL https://www.iscb.org/ismbeccb2019
 
Description International COSTNET18 Conference Poster presentation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Study participants or study members
Results and Impact Presentation of a poster in the COSTNET18 Conference
Year(s) Of Engagement Activity 2018
URL http://costnet18.wzim.sggw.pl/
 
Description International COSTNET19 Conference Poster presentation and short talk 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Poster presentation and short talk given at the COSTNET19 Conference
Year(s) Of Engagement Activity 2019
URL https://costnetbilbao.wordpress.com/
 
Description Keble College Graduate Discussion Evening 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact Presentation about my research to members of Keble College
Year(s) Of Engagement Activity 2020
 
Description Oxford Networks Seminar talk 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact Talk at the Oxford Networks Seminar
Year(s) Of Engagement Activity 2019
URL https://www.maths.ox.ac.uk/groups/networks/networks-seminar
 
Description VII International Symposium SRUK/CERU Poster presentation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Poster presentation at the VII International Symposium SRUK/CERU
Year(s) Of Engagement Activity 2019
URL https://sruk2019.com/