High-throughput Differential Expression Proteomics
Lead Research Organisation:
Imperial College London
Department Name: Institute of Biomedical Engineering
Abstract
In 2001, a major milestone was reached with the publication of the draft sequence of the human genome. It has now become apparent that there are far fewer protein-coding genes in the human genome than proteins in the human proteome. Whilst the genome is relatively stable, each tissue exhibits radically different protein expression that also changes dynamically over its life cycle and with environmental stimulus. Proteomics is therefore playing a major role in elucidating the functional role of many novel genes and their products, as well as in understanding their involvement in biologically relevant phenotypes both in normal cellular processes and in disease. Differential proteomics has become a vital tool in the development of earlier and more accurate screening and diagnostic tests for the detection and treatment of disease. Protein biomarkers are discovered by determining protein expression changes that are unique to the early progression of a disease state. These biomarkers can then be targeted in the development of non-invasive diagnosis, or used as indicators of the efficacy of new medications in drug discovery. The high-throughput discovery of protein biomarkers and the screening of all human proteins to ascertain their functions and interactions are the two major biology-driven challenges in proteomics today.
These large-scale challenges are too great for the resources of a single laboratory, so open international collaborations are essential and are being championed by the Human Proteome Organisation (HUPO - http://www.hupo.org/). HUPO is an international consortium that promotes the development and awareness of proteomics research and facilitates scientific collaborations between HUPO members and its initiatives. One such initiative is the Brain Proteome Project (BPP - http://www.hbpp.org/).
The aims of the BPP are:
- To analyse the brain proteome of human and mouse models in healthy, neurodiseased and aged states, with emphasis on Alzheimer's and Parkinson's diseases.
- To advance knowledge of neurodiseases and aging for developing new diagnostic approaches and medications.
- To make neuroproteomic research and its results available to the scientific community and society.
The brain is the most complex tissue of higher organisms, so elucidating its protein complement is a significant challenge at the upper limit of current proteome analysis technologies. The UK is playing a major role in HUPO, significantly through the HUPO Proteomic Standards Initiative (PSI - http://psidev.sourceforge.net/) hosted by the European Bioinformatics Institute, Hinxton, Cambridge. However, the UK is under-represented in the BPP and notably in proteome informatics research as a whole. The two greatest technical barriers to large-scale proteomic analyses are:
- The need for considerable expert manual interaction in differential expression proteomics. With conventional techniques, errors propagate down the pipeline, so considerable expert manual validation is also required, which adds significant subjectivity.
- Marked protocol variation in proteomic workflows between laboratories, leading to heterogeneity of results and therefore making results integration and cross-validation difficult.
To lift these barriers, the proposed fellowship aims to underpin proteomics research with an automated proteome informatics pipeline that:
- Integrates the statistical power of multiple replicated experiments in order to elucidate all information, so that the accuracy of differential analysis and expression quantification increases to a level where full automation is possible and subjectivity is removed.
- Builds up a statistical formation model of differential expression proteomics from a history of proteomics experiments, to compare and contrast the sensitivity of subtly different proteomic sample preparation, separation and identification protocols for use in subsequent experiment design.
Publications
Dowsey AW
(2008)
Automated image alignment for 2D gel electrophoresis in a high-throughput proteomics pipeline.
in Bioinformatics (Oxford, England)
Chen SS
(2011)
Cardiovascular magnetic resonance tagging of the right ventricular free wall for the assessment of long axis myocardial function in congenital heart disease.
in Journal of cardiovascular magnetic resonance : official journal of the Society for Cardiovascular Magnetic Resonance
Hoogland C
(2010)
Guidelines for reporting the use of gel image informatics in proteomics.
in Nature biotechnology
Dowsey A
(2008)
The Future of Large-Scale Collaborative Proteomics
in Proceedings of the IEEE
Dowsey AW
(2010)
Image analysis tools and emerging algorithms for expression proteomics.
in Proteomics
Zhang Y
(2015)
Streaming visualisation of quantitative mass spectrometry data based on a novel raw signal decomposition method.
in Proteomics
Dowsey A
(2010)
Proteome Bioinformatics
Description | There is currently a total disconnect between mass spectrometry (MS) expression quantification and downstream goals such as identification, differential analysis and pathway modelling. There is substantial complexity in raw MS data, but it is viewed as confounding rather than as a wealth of information to be harnessed. The established approach is reductionist, converting the raw data into a symbolic representation of peaks at the earliest stage, thus propagating errors and failing to present statistical evidence. In this Fellowship I have brought detailed knowledge-based Bayesian methodology right to the raw MS data acquisition stage of the bioinformatics pipeline for the first time. The resulting seaMass framework is the first method to harness a holistic formation model of biological knowledge and physical modelling to describe the formation of raw mass spectra. Because the framework learns the range of isotope distributions possible, it is 15 times more accurate than the ubiquitous averagine model, enabling coincident peptides to be quantified for the first time. With novel use of the appropriate Poisson noise model, I demonstrated accurate separation of mixtures by their morphological diversity. This is also the cornerstone for solving a single Bayesian model that borrows strength across peak shape/skew, periodic chemical baseline and the isotope distribution range at every charge state, leading to a step-change in performance: peptide quantification despite periodic baseline contamination, and detection of biologically relevant signals barely discernible from noise. Furthermore, seaMass shows great potential for broader applications: the capability to directly integrate prior knowledge and thus borrow strength across all facets of MS; direct applicability to metabolomics and other MS modalities; and the ability to handle the additional complexities of translational and clinical application.
This promise was presented in a subsequently successful application for an MRC Methodology Programme New Investigator Research Grant (NIRG), MR/L011093/1 (2014-2017). The Fellowship has also enabled the forging of significant international collaborations. With a visiting position in Prof. Mike Dunn's facility at University College Dublin and collaboration with Prof. Frederique Lisacek (Swiss Institute of Bioinformatics), we composed a wide-ranging review of informatics for proteomics and a book chapter. Moreover, six months as a visiting researcher in the Texas Medical Centre enabled a close working environment with cutting-edge clinical biochemistry practitioners. In particular, the Fellowship kick-started a long-term synergy with MD Anderson Biostatistician Prof. Jeffrey Morris, whose signal-based Functional Mixed Modelling (FMM) approach is a direct complement to the seaMass framework. This collaboration would go on to bear fruit thanks to BBSRC award BB/K004158/1 (2013-2014). |
Exploitation Route | seaMass significantly improves a fundamental step in the interpretation of mass spectrometry data, which is used pervasively in industry as well as academia. With collaborations in proteomics, metabolomics and translational medicine at the Centre of Advanced Discovery and Experimental Therapeutics (CADET), University of Manchester, we are applying seaMass to advanced proteomic and metabolomic workflows. This will provide an exemplar and comprehensive validation for subsequent dissemination to the omics community at large. |
Sectors | Agriculture, Food and Drink; Environment; Healthcare; Pharmaceuticals and Medical Biotechnology |
URL | http://www.seamass.net/ |
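The description above credits a Poisson noise model with separating coincident peptides. As a rough illustration of that general idea only, the sketch below unmixes two overlapping isotope-like patterns by maximising a Poisson likelihood with the classic EM (Richardson-Lucy style) multiplicative update. It is not the seaMass algorithm: the function name `poisson_unmix`, the fixed templates and the toy counts are all assumptions made for this example, whereas seaMass is described as learning the range of isotope distributions rather than fixing them.

```python
# Hypothetical illustration only: NOT the seaMass implementation.
# Assumes each peptide contributes a known, fixed intensity template and
# that bin counts follow a Poisson distribution around the mixture.

def poisson_unmix(observed, templates, iterations=200):
    """Estimate non-negative abundances x so that
    observed[i] ~ Poisson(sum_k x[k] * templates[k][i]).

    Uses the EM multiplicative update for Poisson linear models,
    which keeps every abundance non-negative by construction.
    """
    n_bins = len(observed)
    n_comp = len(templates)
    totals = [sum(t) for t in templates]  # normalising term per template
    x = [1.0] * n_comp                    # strictly positive starting point
    for _ in range(iterations):
        # Model prediction in each bin under the current abundances.
        predicted = [sum(x[k] * templates[k][i] for k in range(n_comp))
                     for i in range(n_bins)]
        new_x = []
        for k in range(n_comp):
            t = templates[k]
            # Template-weighted ratio of observed to predicted counts.
            ratio = sum(t[i] * observed[i] / predicted[i]
                        for i in range(n_bins) if predicted[i] > 0)
            new_x.append(x[k] * ratio / totals[k])
        x = new_x
    return x


# Toy example: two patterns offset by one bin, so their peaks coincide.
templates = [
    [0.5, 0.3, 0.15, 0.05, 0.0],
    [0.0, 0.5, 0.3, 0.15, 0.05],
]
# Noise-free mixture built from abundances 1000 and 400 (assumed values).
observed = [500.0, 500.0, 270.0, 110.0, 20.0]

abundances = poisson_unmix(observed, templates)
print([round(a, 3) for a in abundances])
```

On this noise-free toy input the estimates should approach the abundances used to build the mixture (1000 and 400). With real, noisy counts the estimates would only be approximate, and the templates themselves would need to be estimated.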
Description | This Fellowship provided the basic research underpinning my subsequent promotion to Lecturer and my current BBSRC and MRC research programme (BB/K004158/1, BB/K016733/1, BB/L018616/1, BB/L018462/1, MR/L011093/1). Work towards economic and societal impact is ongoing. |
Description | Investing in Success |
Amount | £3,500 (GBP) |
Organisation | University of Manchester |
Sector | Academic/University |
Country | United Kingdom |
Start | 04/2012 |
End | 06/2012 |
Description | University of Liverpool EPSRC Impact Accelerator |
Amount | £21,844 (GBP) |
Organisation | University of Liverpool |
Sector | Academic/University |
Country | United Kingdom |
Start | 03/2016 |
End | 06/2016 |
Title | The Peptide Simplex |
Description | A new type of feature detection in mass spectra which is able to detect and quantify overlapping features as well as those barely discernible above the noise floor. |
Type Of Material | Computer model/algorithm |
Year Produced | 2010 |
Provided To Others? | Yes |
Impact | Provided the proof-of-concept basis for grant awards BB/K004158/1, BB/L018616/1 and MR/L011093/1. |
URL | http://www.cadetbioinformatics.org/research/ms/peptide-simplex/ |
Description | Prof Jeffrey Morris |
Organisation | University of Texas |
Department | M. D. Anderson Cancer Center |
Country | United States |
Sector | Academic/University |
PI Contribution | Translation of Prof Morris' Wavelet Functional Mixed Model methodology to the proteomics LC-MS (Liquid Chromatography - Mass Spectrometry) field. |
Collaborator Contribution | Access to Prof Morris' expertise and unpublished methodology in order to create our novel differential analysis workflow for raw LC-MS data. |
Impact | Two publications [Liao et al, IEEE ISBI 2014; Dowsey et al Proteomics, 2010, 4226-57] plus a successful submission to the September 2014 BBSRC Bilateral NSF/BIO-BBSRC responsive mode call [BB/M024954/1]. |
Start Year | 2009 |
Title | seaMass |
Description | The seaMass software is our open source dissemination route for the LC-MS (Liquid Chromatography - Mass Spectrometry) analysis algorithms developed by our group, including signal restoration and visualisation. |
Type Of Technology | Software |
Year Produced | 2014 |
Open Source License? | Yes |
Impact | The software has only recently been released, but there is strong interest in its incorporation into the ProteoSuite consortium's BBSRC BBR funded user-centric proteomics software (http://www.proteosuite.org/?q=aboutus). |
URL | http://www.biospi.org/research/ms/seamass/ |