2020BBSRC-NSF/BIO: Linking Mass Spectrometry Computational Ecosystems to Enhance Biological Insights of Publicly-Available Data

Lead Research Organisation: University of Liverpool
Department Name: Biochemistry & Systems Biology

Abstract

All biological systems are composed of important chemicals which help the systems grow, respond to the environment and communicate. The genome is composed of DNA which carries genetic information required for reproduction, growth and development and whose composition remains relatively static throughout a lifetime. The metabolome is composed of metabolites derived from food (e.g. sugars) and the environment (e.g. prescribed drugs) and whose composition is dynamic in relation to which metabolites are present and at what concentration. The scientific research community studies metabolites to understand how we metabolise food and drugs, how we respond dynamically to the environment and also to identify metabolites which are important for a biological system to function. Many of these investigations are discovery studies which apply a scientific technique called metabolomics that can detect hundreds or low thousands of different metabolites in a biological sample. These studies discover new metabolites which then have to be analysed to understand their chemical structure which is essential for biological interpretation.
Many metabolomic studies performed across the world are being released to the scientific community so that these data can be re-used and re-analysed to derive new biological information. The data from these studies are stored in data repositories, two examples of which are MetaboLights in the UK and GNPS in the USA. However, many of the metabolites detected and reported in these data repositories do not have a chemical identity assigned to them. There is therefore an essential requirement to assign chemical structures to as many metabolites as possible so that the data can be reused and biological information derived from them. The planned project will further develop computational approaches and apply them to all data deposited in MetaboLights and GNPS. This will allow more metabolites to be assigned a chemical identity and will increase confidence that the correct structure has been assigned. When completed, the volume and quality of biological information available in the deposited datasets will be much greater and will allow new research questions to be asked and answered without the need to collect new data.

Technical Summary

The chemicals present in a biological system play many important roles, including in reproduction, growth and survival. DNA contained in the genome provides a recipe for reproduction and is relatively static across an organism's lifetime. In comparison, metabolites contained in the metabolome are very dynamic in their presence and concentration in response to perturbations from within the organism or to external stimuli. Many studies investigating the dynamics of metabolites apply a discovery-based approach called metabolic phenotyping (or metabolomics), in which a chemical assay is applied to collect as great a volume of biological information as possible. However, the chemical structure or biological identity of many metabolites is not known prior to data collection and has to be derived from the data collected. This process of metabolite annotation is a significant hurdle because we do not yet have reference metabolomes, and many metabolites are unavailable as the chemical standards from which libraries for metabolite identification can be built. Many of the data deposited in metabolomics data repositories do not have an assigned chemical structure or metabolite name, and therefore biological knowledge cannot be derived from these data. In the proposed research we will develop new open access computational tools and integrate these with existing open access tools to significantly increase the number of metabolites identified in two data repositories, MetaboLights in the UK and GNPS in the USA. The project will also establish common data standards for capturing and curating metabolomics data and for data exchange between the repositories. These approaches will increase the confidence in the annotations provided and greatly enhance the reusability of these data by the global scientific community.

Publications

 
Title Additional file 2 of Single sample pathway analysis in metabolomics: performance evaluation and application 
Description Additional file 2. Excel spreadsheet containing results of statistical testing of benchmarking results. Mann Whitney U tests were performed for each pairwise combination of methods tested in the effect size simulation (corresponds to Fig. 5 in the main text). P-values were adjusted using Bonferroni FWER correction. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL https://springernature.figshare.com/articles/dataset/Additional_file_2_of_Single_sample_pathway_anal...
 
Description Hot Topics debate on metabolite annotation - co-chair 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The MEBO Hot Topics debate focused on current state-of-the-art methods for metabolite identification in metabolomics and included four keynote speakers and a 2-hour debate. I acted as co-chair and organiser of the debate.
Year(s) Of Engagement Activity 2023
 
Description Keynote presentation - Scottish Metabolomics Network 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Keynote presentation discussing metabolite annotation in metabolomics, including outputs from current grants
Year(s) Of Engagement Activity 2022
URL https://static1.squarespace.com/static/57df9bff46c3c466ad42bb3c/t/635bfe71e1936e1e0c99827a/166697330...
 
Description Oral presentation at the International Metabolomics Society 2023 conference 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact A 20-minute oral presentation was given to approximately 150 attendees of the conference hosted in Niagara Falls, Canada. This was a scientific presentation to researchers from across the world to publicise the availability of the computational resources for both LC-MS and NMR data. A significant number of questions and enquiries arose from the presentation.
Year(s) Of Engagement Activity 2023
URL https://www.metabolomics2023.org/
 
Description Pint of Science - Liverpool 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact I gave a 20-minute talk on the role of metabolomics in biological research
Year(s) Of Engagement Activity 2022
 
Description Poster - Metabolomics 2022 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Poster presentation at Metabolomics 2022 describing both grants
Year(s) Of Engagement Activity 2022
URL https://www.metabolomics2022.org/
 
Description Scientific meeting - presentation (Thermo Scientific invited speaker) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Invited 25-minute presentation on metabolite identification at a Thermo Scientific customer science day
Year(s) Of Engagement Activity 2022