2020BBSRC-NSF/BIO: Linking Mass Spectrometry Computational Ecosystems to Enhance Biological Insights of Publicly-Available Data

Lead Research Organisation: University of Liverpool

Department Name: Biochemistry & Systems Biology

Abstract

All biological systems are composed of important chemicals which help the systems grow, respond to the environment and communicate. The genome is composed of DNA which carries genetic information required for reproduction, growth and development and whose composition remains relatively static throughout a lifetime. The metabolome is composed of metabolites derived from food (e.g. sugars) and the environment (e.g. prescribed drugs) and whose composition is dynamic in relation to which metabolites are present and at what concentration. The scientific research community studies metabolites to understand how we metabolise food and drugs, how we respond dynamically to the environment and also to identify metabolites which are important for a biological system to function. Many of these investigations are discovery studies which apply a scientific technique called metabolomics that can detect hundreds or low thousands of different metabolites in a biological sample. These studies discover new metabolites which then have to be analysed to understand their chemical structure which is essential for biological interpretation.
Many metabolomic studies performed across the world are being released to the scientific community so that these data can be re-used and re-analysed to derive new biological information. The data from these studies are stored in data repositories and two examples of these are MetaboLights in the UK and GNPS in the USA. However, many of the metabolites detected and reported in these data repositories do not have a chemical identity assigned to them. Therefore there is an essential requirement to assign chemical structures to as many metabolites as is possible so that the data can be reused and biological information derived from the data. The planned project will further develop and apply computational approaches to all data deposited in MetaboLights and GNPS. This will allow more metabolites to be assigned a chemical identity and for the confidence that the correct structure has been assigned to be increased. When completed, the volume and quality of biological information available in the deposited datasets will be much greater and will allow new research questions to be asked and answered without the need to collect new data.

Technical Summary

The chemicals present in a biological system play many important roles, including in reproduction, growth and survival. DNA contained in the genome provides a recipe for reproduction and is relatively static across an organism's lifetime. In comparison, metabolites contained in the metabolome are very dynamic in their presence and concentration in response to perturbations from within the organism or to external stimuli. Many studies that investigate the dynamics of metabolites apply a discovery-based approach called metabolic phenotyping (or metabolomics), in which a chemical assay is applied to collect as great a volume of biological information as possible. However, the chemical structure or biological identity of many metabolites is not known prior to data collection and has to be derived from the data collected. This process of metabolite annotation is a significant hurdle because we do not yet have reference metabolomes, and many metabolites are unavailable as the chemical standards from which libraries for metabolite identification are built. Many of the data deposited in metabolomics data repositories do not have an assigned chemical structure or metabolite name, and therefore biological knowledge cannot be derived from these data. In the proposed research we will develop new open access computational tools and integrate these with existing open access tools to significantly increase the number of metabolites identified in two data repositories, MetaboLights in the UK and GNPS in the USA. The project will also establish common data standards for capturing and curating metabolomics data and for data exchange between the repositories. These approaches will increase the confidence in the annotations provided and greatly enhance the reusability of these data by the global scientific community.

Funded Value:

£638,975

Funded Period:

Jul 22 - Jul 25

Funder:

BBSRC

Project Status:

Active

Project Category:

Research Grant

Project Reference:

BB/W002345/1

Principal Investigator:

Warwick Dunn

Research Subject:

Tools, technologies & methods (99%)

Research Topic:

Bioinformatics (33%)

Technology and method dev (33%)

Tools for the biosciences (33%)

Organisations

University of Liverpool (Lead Research Organisation)

People	ORCID iD
Warwick Dunn (Principal Investigator)	http://orcid.org/0000-0001-6924-0027
Timothy Ebbels (Co-Investigator)

Publications

Ebbels TMD (2023) Recent advances in mass spectrometry-based computational metabolomics. in Current opinion in chemical biology

Harrieder EM (2022) Critical assessment of chromatographic metadata in publicly available metabolomics data repositories. in Metabolomics : Official journal of the Metabolomic Society

Kirwan JA (2022) Quality assurance and quality control reporting in untargeted metabolic phenotyping: mQACC recommendations for analytical quality management. in Metabolomics : Official journal of the Metabolomic Society

Muhamadali H (2023) Unlocking the secrets of the microbiome: exploring the dynamic microbial interplay with humans through metabolomics and their manipulation for synthetic biology applications. in The Biochemical journal

Wieder C (2022) Single sample pathway analysis in metabolomics: performance evaluation and application. in BMC bioinformatics

Wieder C (2024) PathIntegrate: Multivariate modelling approaches for pathway-based multi-omics data integration. in PLoS computational biology

Winder CL (2022) Providing metabolomics education and training: pedagogy and considerations. in Metabolomics : Official journal of the Metabolomic Society

Research Databases and Models

Title	Additional file 2 of Single sample pathway analysis in metabolomics: performance evaluation and application
Description	Additional file 2. Excel spreadsheet containing results of statistical testing of benchmarking results. Mann Whitney U tests were performed for each pairwise combination of methods tested in the effect size simulation (corresponds to Fig. 5 in the main text). P-values were adjusted using Bonferroni FWER correction.
Type Of Material	Database/Collection of data
Year Produced	2022
Provided To Others?	Yes
URL	https://springernature.figshare.com/articles/dataset/Additional_file_2_of_Single_sample_pathway_anal...

Engagement Activities

Description	Hot Topics debate on metabolite annotation - co-chair
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	The MEBO Hot Topics debate focused on current state-of-the-art methods for metabolite identification in metabolomics and included four keynote speakers and a 2-hour debate. I acted as co-chair and organiser of the debate.
Year(s) Of Engagement Activity	2023


Description	Keynote presentation - Scottish Metabolomics Network
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Professional Practitioners
Results and Impact	Keynote presentation to discuss metabolite annotation in metabolomics, including outputs from current grants.
Year(s) Of Engagement Activity	2022
URL	https://static1.squarespace.com/static/57df9bff46c3c466ad42bb3c/t/635bfe71e1936e1e0c99827a/166697330...


Description	Oral presentation at the International Metabolomics Society 2023 conference
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Other audiences
Results and Impact	A 20-minute oral presentation was given to approximately 150 attendees of the conference hosted in Niagara Falls, Canada. This was a scientific presentation to researchers from across the world to publicise the availability of the computational resources for both LC-MS and NMR data. A significant number of questions and enquiries arose from the presentation.
Year(s) Of Engagement Activity	2023
URL	https://www.metabolomics2023.org/


Description	Pint of Science - Liverpool
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	Regional
Primary Audience	Public/other audiences
Results and Impact	I provided a 20-minute discussion on the role of metabolomics in biological research.
Year(s) Of Engagement Activity	2022


Description	Poster - Metabolomics 2022
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	Poster presentation at Metabolomics 2022 describing both grants.
Year(s) Of Engagement Activity	2022
URL	https://www.metabolomics2022.org/


Description	Scientific meeting - presentation (Thermo Scientific invited speaker)
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Professional Practitioners
Results and Impact	Invited 25-minute presentation on metabolite identification at a Thermo Scientific customer science day.
Year(s) Of Engagement Activity	2022
