The application of time domain processes for the improvement of data quality and enhanced pattern recognition in NMR based metabolomics

Lead Research Organisation: University of Cambridge
Department Name: Biochemistry

Abstract

Metabolomics is one of the emerging approaches in Life Sciences used in a Systems Biology framework to globally profile the changes in metabolism which accompany a given genetic modification, drug intervention or environmental stimulation. The use of 1H Nuclear Magnetic Resonance (NMR) spectroscopy for this purpose is particularly attractive, being relatively cheap on a per-sample basis as well as high-throughput. This has made it a particularly useful functional genomic tool for monitoring toxicological changes during the drug safety assessment process, where drugs are typically given at doses well below the LD50 level (the lethal dose for 50% of a population). While the acquisition of solution state NMR spectra can be automated, there are several issues that prevent the analysis of these spectra from being fully automated. The quality of the spectra obtained from biofluids such as urine and blood plasma, for example, is prone to a number of impairments, especially following drug toxicity. Resonances from drug metabolites may be present in the spectra, obscuring the resonances from lower concentration endogenous metabolites. Furthermore, lipid particles such as LDL, VLDL and HDL in blood plasma and protein in urine may also obscure metabolite resonances. Therefore, the intervention of the analyst is normally required, decreasing sample throughput and introducing variation into the pattern recognition analysis which is ultimately used to determine a metabolic profile associated with a given toxicological intervention. This project aims to develop tools to remove broad components, phase distortions and baseline offset differences from NMR spectra using computational methods that are user friendly and automated. The use of time-domain analysis based on the Continuous Wavelet Transform (CWT) and a Bayesian modelling framework will be investigated, as both have shown promising results in our previous work.
In particular, a modification of the Continuous Wavelet Transform known as a Single Voice CWT will be examined as it allows effective separation and subtraction of resonances with different line widths. Bayesian modelling and Reversible Jump Markov Chain Monte Carlo simulation will facilitate the probabilistic estimation of the number of resonances in the signal as well as the estimation of the parameters of the resonances, which are critical for quantification of NMR experiment results. Development of these techniques will lead to an automated time-domain based tool for NMR spectral improvement and quantification. Finally, in conjunction with the metabolic profiling group at GlaxoSmithKline, the effect of these tools on the pattern recognition tasks that are common in drug toxicity studies will be assessed. The tools being developed will initially be tailored specifically to improving toxicity prediction in the drug discovery process through enhanced metabolomic/metabonomic studies, but will also have a far broader application in metabolomics and bioinformatics. Thus, the project will potentially assist in a broad range of metabolomic projects including those in the pharmaceutical, food and medical industries.
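The separation of resonances by line width rests on a simple time-domain property: a Lorentzian resonance with full width at half maximum w (in Hz) decays in the free induction decay (FID) as exp(-pi * w * t), so broad macromolecular signals die away much faster than sharp metabolite signals. The sketch below only illustrates that property; it is not the Single Voice CWT itself, and the amplitudes, frequencies, line widths and dwell time are assumed values chosen for the example.

```python
import cmath
import math


def envelope(width_hz, t):
    """Decay envelope exp(-pi * w * t) of a Lorentzian line of FWHM w (Hz)."""
    return math.exp(-math.pi * width_hz * t)


def fid(times, resonances):
    """Noise-free FID: a sum of damped complex exponentials.

    Each resonance is a tuple (amplitude, frequency_hz, width_hz).
    """
    return [
        sum(a * cmath.exp((2j * math.pi * f - math.pi * w) * t)
            for a, f, w in resonances)
        for t in times
    ]


# Assumed, illustrative values: a sharp metabolite line and a broad
# macromolecule line with five times the amplitude.
NARROW = (1.0, 100.0, 1.0)
BROAD = (5.0, 120.0, 50.0)

times = [n / 1000.0 for n in range(200)]  # assumed 1 ms dwell time, 200 points
signal = fid(times, [NARROW, BROAD])

# Compare how much of each component is left at a few time points.
for t in (0.0, 0.01, 0.05):
    narrow_left = NARROW[0] * envelope(NARROW[2], t)
    broad_left = BROAD[0] * envelope(BROAD[2], t)
    print(f"t={t:.2f} s  narrow={narrow_left:.4f}  broad={broad_left:.4f}")
```

With these assumed values the broad line starts five times larger than the narrow one, but by 50 ms its envelope has fallen below 0.1% of its initial value while the narrow line retains more than 85%. This difference in decay rate is what a time-domain method can exploit to isolate and subtract broad components.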

Technical Summary

Statistical analysis of NMR spectra of biological fluids for metabolomics requires a consistent representation of the data from sample to sample. This can be very challenging when variations in the physical properties and chemical composition of the biofluids cause baseline and phase distortions and allow broad macromolecule resonances to obscure low concentration metabolites. In this project we will address key processing issues important in the development of high throughput NMR based urinary analysis for the drug safety assessment process in the pharmaceutical industry. In collaboration with the Metabolic Profiling Group at GlaxoSmithKline, a number of time domain approaches will be further refined to address the key issues of: 1. the improved detection of resonances from low molecular weight metabolites which are often obscured by protein and other macromolecules; 2. the use of time encoded information in the subsequent pattern recognition to improve biomarker discovery. In this proposal a number of time domain processes will be implemented to assess their efficacy in NMR based metabolomics. This will include the application of the single voice continuous wavelet transform (CWT) to target and remove resonances from residual water, urea, drug metabolites or the drug vehicle. The use of Bayesian and Markov Chain Monte Carlo (MCMC) methods of processing NMR spectra in the time domain will also be evaluated. A challenging problem is to detect the number of resonances in an automated way. This will be addressed by applying the Reversible Jump MCMC technique to calculate a posterior probability density for a spectrum model, including probabilistic estimation of the number of resonances, which will lead to fully automated quantification of NMR spectra. The aim of this proposal is to provide robust and user friendly methods for high throughput NMR based metabolomics for the pharmaceutical industry and functional genomics.
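To make the idea of Bayesian parameter estimation in the time domain concrete, the following is a deliberately simplified sketch, not the project's method. It assumes exactly one resonance, Gaussian noise of known standard deviation and flat priors, replaces the complex amplitude by its least-squares value, and scans a small grid of candidate frequencies and decay rates instead of running MCMC. Reversible Jump MCMC would additionally treat the number of resonances as unknown and sample it. All names and numerical values here are hypothetical choices for illustration.

```python
import cmath
import math
import random


def basis(times, freq_hz, decay_rate):
    """Unit-amplitude damped complex exponential sampled at the given times."""
    return [cmath.exp((2j * math.pi * freq_hz - decay_rate) * t) for t in times]


def log_posterior(data, times, freq_hz, decay_rate, sigma):
    """Log posterior up to a constant, assuming flat priors.

    The complex amplitude is set to its least-squares estimate, so this is
    the profiled Gaussian log-likelihood of one damped exponential.
    """
    g = basis(times, freq_hz, decay_rate)
    norm = sum(abs(v) ** 2 for v in g)
    amp = sum(v.conjugate() * d for v, d in zip(g, data)) / norm
    rss = sum(abs(d - amp * v) ** 2 for v, d in zip(g, data))
    return -rss / (2.0 * sigma ** 2)


def grid_map(data, times, freqs, decays, sigma):
    """Return the (frequency, decay rate) grid point with the highest score."""
    scored = [(log_posterior(data, times, f, r, sigma), f, r)
              for f in freqs for r in decays]
    _, best_f, best_r = max(scored)
    return best_f, best_r


# Simulated FID with assumed values: one resonance at 40 Hz, decay rate 10 1/s.
random.seed(0)
dwell = 1.0 / 256.0
times = [n * dwell for n in range(128)]
true_f, true_r, sigma = 40.0, 10.0, 0.02
data = [
    cmath.exp((2j * math.pi * true_f - true_r) * t)
    + complex(random.gauss(0.0, sigma), random.gauss(0.0, sigma))
    for t in times
]

freqs = [float(f) for f in range(30, 51)]  # candidate frequencies in Hz
decays = [5.0, 10.0, 20.0]                 # candidate decay rates in 1/s
f_map, r_map = grid_map(data, times, freqs, decays, sigma)
print("most probable grid point:", f_map, "Hz,", r_map, "1/s")
```

A grid scan is used here only because it is deterministic and easy to read. It scales poorly as resonances are added, which is one reason sampling methods such as MCMC are attractive for spectra containing many resonances.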
 
Description We developed a new computer algorithm to process NMR spectra using an approach called Bayesian statistics. This is much better than the classical Fourier Transform and could revolutionise how we process our NMR spectra.
Exploitation Route We are currently exploring whether this could be included in a Galaxy workflow for metabolomic experiments.
Sectors Education, Healthcare, Pharmaceuticals and Medical Biotechnology

 
Description AstraZeneca CASE studentship
Amount £40,000 (GBP)
Organisation AstraZeneca 
Sector Private
Country United Kingdom
Start 10/2016 
End 09/2019
 
Description Technology Development Grant, MetaboFlow - the development of standardised workflows for processing metabolomics data to aid reproducible data sharing and big data initiatives
Amount £900,000 (GBP)
Funding ID 202952/B/16/Z 
Organisation Wellcome Trust 
Sector Charity/Non Profit
Country United Kingdom
Start 12/2016 
End 11/2019
 
Description 2nd Metabolomics Sardinian Scientific School 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact The 2nd Metabolomics Sardinian Scientific School was aimed at postgraduate students new to the field of metabolomics. We gave seminars and workshops on various tools and techniques in metabolomics.
Year(s) Of Engagement Activity 2016
 
Description Cambridge Science Week 2017 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact We took part in Cambridge Science Week and put on a display on personalised medicine and health using advanced biochemical techniques.
Year(s) Of Engagement Activity 2017
 
Description Sardinian summer school: Metabolomics and more. 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Sardinian summer school to spread the use of tools in metabolomics and lipidomics.
Year(s) Of Engagement Activity 2017
 
Description Work experience for two school boys 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Schools
Results and Impact Work experience for two school boys: they spent a week in my lab following members around to get some experience of what it's like being a scientist.
Year(s) Of Engagement Activity 2016