The application of time domain processes for the improvement of data quality and enhanced pattern recognition in NMR based metabolomics
Lead Research Organisation:
University of Cambridge
Department Name: Biochemistry
Abstract
Metabolomics is one of the emerging approaches in Life Sciences used in a Systems Biology framework to globally profile the changes in metabolism which accompany a given genetic modification, drug intervention or environmental stimulation. The use of 1H Nuclear Magnetic Resonance (NMR) spectroscopy for this purpose is particularly attractive, being relatively cheap on a per sample basis as well as high-throughput. This has made it a particularly useful functional genomic tool for monitoring toxicological changes during the drug safety assessment process, where drugs are typically dosed at levels well below the LD50 (the lethal dose for 50% of a population). While the acquisition of solution state NMR spectra can be automated, there are several issues that prevent the analysis of these spectra being fully automated. The quality of the spectra obtained from biofluids such as urine and blood plasma, for example, is prone to a number of impairments, especially following drug toxicity. Resonances from drug metabolites may be present in the spectra, obscuring the resonances from lower concentration endogenous metabolites. Furthermore, lipid particles such as LDL, VLDL and HDL in blood plasma and protein in urine may also obscure metabolite resonances. Therefore, the intervention of the analyst is normally required, decreasing sample throughput and introducing variation into the pattern recognition analysis which is used to finally determine a metabolic profile associated with a given toxicological intervention. This project aims to develop tools to remove broad components, phase distortions and baseline offset differences from NMR spectra using computational methods that are user friendly and automated. The use of time-domain analysis based on the Continuous Wavelet Transform (CWT) and a Bayesian modelling framework will be investigated, as both have shown promising results in our previous work.
In particular, a modification of the Continuous Wavelet Transform known as a Single Voice CWT will be examined as it allows effective separation and subtraction of resonances with different line widths. Bayesian modelling and Reversible Jump Markov Chain Monte Carlo simulation will facilitate the probabilistic estimation of the number of resonances in the signal as well as the estimation of the parameters of the resonances, which are critical for quantification of the NMR experiment results. Development of these techniques will lead us to an automated time-domain based tool for NMR spectral improvement and quantification. Finally, in conjunction with the metabolic profiling group at GlaxoSmithKline, we will assess the effect of these tools on the pattern recognition tasks that are common in drug toxicity studies. The tools being developed will initially be tailored specifically to improving toxicity prediction in the drug discovery process through enhanced metabolomic/metabonomic studies, but will also have a far broader application in metabolomics and bioinformatics. Thus, the project will potentially assist in a broad range of metabolomic projects including those in the pharmaceutical, food and medical industries.
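The reason line width can be exploited in the time domain can be illustrated with a minimal sketch. It is not the Single Voice CWT described above; it only shows the underlying property that a broad resonance decays much faster in the free induction decay (FID) than a narrow one. The function names, the dwell time and the resonance parameters are assumptions chosen for illustration and do not come from the project.

```python
import numpy as np

def damped_sinusoid(t, amplitude, freq_hz, t2_s):
    """One resonance in the time domain: a complex, exponentially damped sinusoid."""
    return amplitude * np.exp(-t / t2_s) * np.exp(2j * np.pi * freq_hz * t)

def lorentzian_fwhm_hz(t2_s):
    """Full width at half maximum (Hz) of the Lorentzian line for decay constant T2."""
    return 1.0 / (np.pi * t2_s)

# Hypothetical acquisition: 4096 points sampled every 1 ms.
dwell_s = 1e-3
t = np.arange(4096) * dwell_s

# Hypothetical resonances: a weak, narrow metabolite line and a strong, broad
# macromolecule line close to it in frequency.
narrow = damped_sinusoid(t, amplitude=1.0, freq_hz=120.0, t2_s=0.5)
broad = damped_sinusoid(t, amplitude=20.0, freq_hz=100.0, t2_s=0.005)
fid = narrow + broad

def envelope_ratio(index):
    """Magnitude of the broad component relative to the narrow one at sample `index`."""
    return abs(broad[index]) / abs(narrow[index])

print(f"narrow line width: {lorentzian_fwhm_hz(0.5):.2f} Hz")
print(f"broad line width:  {lorentzian_fwhm_hz(0.005):.2f} Hz")
print(f"broad/narrow at t = 0 ms:  {envelope_ratio(0):.3g}")
print(f"broad/narrow at t = 50 ms: {envelope_ratio(50):.3g}")
```

In this toy signal the broad component dominates the first few points of the FID and is negligible after roughly 50 ms, so the two line widths occupy different regions of the time axis. A wavelet-based method can exploit this kind of separation more selectively than simply discarding early points.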
Technical Summary
Statistical analysis of NMR spectra of biological fluids for metabolomics requires a consistent representation of the data from sample to sample. This can be very challenging when variations occur in physical properties and the chemical composition of the biofluids that result in baseline and phase distortions and the obscuring of low concentration metabolites by broad macromolecule resonances. In this project we will address key processing issues important in the development of high throughput NMR based urinary analysis for the drug safety assessment process in the pharmaceutical industry. In collaboration with the Metabolic Profiling Group at GlaxoSmithKline, a number of time domain approaches will be further refined to address the key issues of: 1. the improved detection of resonances from low molecular weight metabolites which are often obscured by protein and other macromolecules; 2. the use of time encoded information in the subsequent pattern recognition to improve biomarker discovery. In this proposal a number of time domain processes will be implemented to assess their efficacy in NMR based metabolomics. This will include the application of the single voice continuous wavelet transform (CWT) to target and remove resonances from residual water, urea, drug metabolites or the drug vehicle. The use of Bayesian and Markov Chain Monte Carlo (MCMC) methods of processing NMR spectra in the time domain will also be evaluated. A challenging problem is to detect the number of resonances in an automated way. This will be addressed by applying a Reversible Jump MCMC technique to calculate a posterior probability density for a spectrum model, including probabilistic estimation of the number of resonances, which should lead to fully automated quantification of NMR spectra. The aim of this proposal is to provide robust and user friendly methods for high throughput NMR based metabolomics for the pharmaceutical industry and functional genomics.
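The kind of spectrum model referred to above can be written down as a sketch. The summary does not give the model, so the form below is an assumption: the standard sum of K damped complex sinusoids in Gaussian noise, with the symbols (A_k, phi_k, f_k, T_{2,k}, sigma) introduced here for illustration only.

```latex
% Assumed signal model: K resonances sampled at times t_n, n = 1, ..., N
y(t_n) = \sum_{k=1}^{K} A_k \, e^{i\phi_k} \, e^{-t_n / T_{2,k}} \, e^{i 2\pi f_k t_n} + \varepsilon_n,
\qquad \varepsilon_n \sim \mathcal{CN}(0, \sigma^2)

% Joint posterior over the number of resonances K and their parameters
% \theta_K = \{A_k, \phi_k, f_k, T_{2,k}\}_{k=1}^{K}
p(K, \theta_K \mid y) \;\propto\; p(y \mid K, \theta_K) \, p(\theta_K \mid K) \, p(K)
```

Because the dimension of \theta_K changes with K, a sampler for this posterior has to move between parameter spaces of different size, which is the role of the reversible jump step in Reversible Jump MCMC.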
Publications
Rubtsov DV
(2007)
Time-domain Bayesian detection and estimation of noisy damped sinusoidal signals applied to NMR spectroscopy.
in Journal of magnetic resonance (San Diego, Calif. : 1997)
Gulston MK
(2008)
A combined metabolomic and proteomic investigation of the effects of a failure to express dystrophin in the mouse heart.
in Journal of proteome research
Brockmöller SF
(2012)
Integration of metabolomics and expression of glycerol-3-phosphate acyltransferase (GPAM) in breast cancer-link to patient survival, hormone receptor status, and metabolic profiling.
in Journal of proteome research
Nabeebaccus AA
(2017)
Nox4 reprograms cardiac substrate metabolism via protein O-GlcNAcylation to enhance stress adaptation.
in JCI insight
Schober D
(2018)
nmrML: A Community Supported Open Data Standard for the Description, Storage, and Exchange of NMR Data.
in Analytical chemistry
Description | We developed a new computer algorithm to process NMR spectra using an approach called Bayesian statistics. This is much better than the classical Fourier Transform and could revolutionise how we process our NMR spectra. |
Exploitation Route | We are currently exploring whether this could be included in a Galaxy workflow for metabolomic experiments. |
Sectors | Education Healthcare Pharmaceuticals and Medical Biotechnology |
Description | AstraZeneca CASE studentship |
Amount | £40,000 (GBP) |
Organisation | AstraZeneca |
Sector | Private |
Country | United Kingdom |
Start | 09/2016 |
End | 09/2019 |
Description | Technology Development Grant, MetaboFlow - the development of standardised workflows for processing metabolomics data to aid reproducible data sharing and big data initiatives |
Amount | £900,000 (GBP) |
Funding ID | 202952/B/16/Z |
Organisation | Wellcome Trust |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 12/2016 |
End | 11/2019 |
Description | 2nd Metabolomics Sardinian Scientific School |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | 2nd Metabolomics Sardinian Scientific School was aimed at post-grad students new to the field of metabolomics. We gave seminars and workshops in various tools and techniques in metabolomics. |
Year(s) Of Engagement Activity | 2016 |
Description | Cambridge Science Week 2017 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Public/other audiences |
Results and Impact | Took part in Cambridge Science Week and put on a display on personalised medicine and health using advanced biochemical techniques. |
Year(s) Of Engagement Activity | 2017 |
Description | Sardinian summer school: Metabolomics and more. |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Sardinian summer school to spread the use of tools in metabolomics and lipidomics. |
Year(s) Of Engagement Activity | 2017 |
Description | Work experience for two school boys |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Schools |
Results and Impact | Work experience for two school boys: two school boys spent a week in my lab following members around to get some experience of what it's like being a scientist. |
Year(s) Of Engagement Activity | 2016 |