Rapid proteome profiling using positional signature peptides
Lead Research Organisation: University of Manchester
Department Name: Life Sciences
Abstract
The proteome is the entire complement of proteins expressed by a cell in a particular state. The 'protein world' is a challenging area to study, and to conquer it we currently use a 'divide and conquer' strategy, breaking the proteome into smaller fragments that are amenable to analysis. But if we increase the intrinsic complexity of the protein world by fragmentation, aren't we making it even more difficult to analyse? At first glance, yes, but suppose we could capture just one information-rich fragment that was able to report on the parent protein: recovering information would become 50 times easier! This strategy for proteome simplification is exactly what we propose. We have worked out a way to reduce the complexity of a proteome (perhaps 500,000 fragments) to a highly information-rich subset (perhaps 5,000 fragments) using novel chemical tricks. We now need to develop and refine our approach, and build the matching informatics tools to make the most effective use of this limited set of fragments, in which a wealth of information is buried. An approach such as this would be of great value to the community of scientists who study proteomes. The methodology is simple to deliver, cheap, and requires no sophisticated instrumentation beyond that found in a typical proteomics laboratory. It has the potential to revolutionise the way in which we study proteomes.
Technical Summary
Global analysis of proteomes is a necessity for systems biology and for high-throughput proteome screening. The most promising and readily available methods are based on peptide-level analysis, but the conversion of a proteome from protein space to peptide space increases the analyte complexity by a factor of 40-50. This complexity continues to dog mass spectrometry-based approaches and presents a real bottleneck for truly genome-wide proteomics in a single experiment. We have recently published a novel approach to proteome simplification in peptide space, through selective isolation and recovery of N-terminal peptides (positional signature peptides, PSPs). This reduces analyte complexity to one peptide per protein, which greatly simplifies the protein identification problem (currently a bane of MudPIT-style proteomics), since the unique peptides are usually diagnostic for the parent proteins. It also increases the depth to which a proteome can be characterised. Simplifying the analytical challenge to one peptide per protein brings other aspects within reach, including absolute quantification and determination of intracellular stability. However, it is now very clear that existing proteome software tools, predicated on the presence of more than one peptide for each protein, do not perform well with PSPs. Indeed, our preliminary studies suggest that many more true PSPs are buried in the spectra that are readily obtained, and that improved informatic approaches can take this technology forward and deeper into the global proteome. In this proposal, bespoke software and database searching strategies will be developed in tandem with refined experimental procedures, and the software will be made available as open source tools. These will then be applied to demonstrator projects on the E. coli and serum proteomes to show how the approach may be exploited for rapid proteome profiling, with all data made available via local databases and external repositories.
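The reduction from many peptides per protein to one positional signature peptide can be illustrated with a small in silico digest. This is a minimal sketch only. It assumes trypsin-like cleavage (after K or R, but not before P), which is not stated in the summary above. The function names and the two toy sequences are hypothetical, and N-terminal processing such as initiator methionine removal is ignored.

```python
def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (assumed trypsin-like rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal peptide that does not end in K or R.
        peptides.append(sequence[start:])
    return peptides


def n_terminal_peptide(sequence):
    """Return the first peptide of the digest, standing in for the PSP."""
    return tryptic_digest(sequence)[0]


# Hypothetical toy proteome; these are not real protein sequences.
toy_proteome = {
    "protA": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
    "protB": "MSDNGRPLLKAAGWTRVEK",
}

# Peptide space: every peptide from every protein.
all_peptides = [p for seq in toy_proteome.values() for p in tryptic_digest(seq)]

# PSP space: exactly one N-terminal peptide per protein.
psps = {name: n_terminal_peptide(seq) for name, seq in toy_proteome.items()}

print(len(all_peptides), "peptides in total;", len(psps), "PSPs")
```

In this toy case the analyte count falls from the full peptide list to one entry per protein, which is the simplification the summary describes. A real search strategy would also have to handle missed cleavages and modified N-termini.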
Publications
Wright JC (2010) Cross species proteomics. In Methods in molecular biology (Clifton, N.J.)
Wedge DC (2011) FDRAnalysis: a tool for the integrated analysis of tandem mass spectrometry identification results from multiple search engines. In Journal of proteome research
Jones AR (2009) Improving sensitivity in proteome studies by analysis of false discovery rates for multiple search engines. In Proteomics
Jones AR (2012) The mzIdentML data standard for mass spectrometry-based proteomics results. In Molecular & cellular proteomics : MCP
Hubbard, Simon; Jones, Andy (2009) Proteome Bioinformatics
Hubbard SJ (2010) Computational approaches to peptide identification via tandem MS. In Methods in molecular biology (Clifton, N.J.)
Eyers CE (2011) CONSeQuence: prediction of reference peptides for absolute quantitative proteomics using consensus machine learning approaches. In Molecular & cellular proteomics : MCP
Eisenacher M (2009) Getting a grip on proteomics data - Proteomics Data Collection (ProDaC). In Proteomics
Blakeley P (2010) Investigating protein isoforms via proteomics: a feasibility study. In Proteomics
Description | We improved some proteomics informatics methods for integration of multiple search engines, and improved methods/data standards for quantitative proteomics and submission to repositories. |
Exploitation Route | Through publications, the informatics findings can be used for further development of the field and provide insights into database design. |
Sectors | Digital/Communication/Information Technologies (including Software), Healthcare, Pharmaceuticals and Medical Biotechnology |
Description | The technology we investigated has led on to further research in quantitative and qualitative proteomics in our labs, and the general technique of N-terminal peptide enrichment is still in use, though other labs have developed it in different directions. |