Global quantification of the yeast proteome

Lead Research Organisation: University of Liverpool
Department Name: Veterinary Preclinical Science

Abstract

An inventory of the proteins in a cell

A traditional approach to understanding the living cell is to reduce cell complexity to individual parts. We now recognize that this is no longer enough; we need to take a global view of the cell and study it as an integrated system. This 'systems level' approach requires new technologies, and has been led by the ability to measure all of the messenger RNA molecules, the working copies of the genetic blueprint, in a single experiment. But these mRNA molecules are intermediaries for the true cellular machines, the proteins, and arguably we ought to be studying the latter. However, for many reasons there are no equivalent approaches for large-scale quantitative measurement of all proteins in a cell. Yet if we are to understand the cell as a complex, dynamic system in which protein levels go up and down, and to understand the complicated interplay between proteins, we need to have an 'inventory of parts'. We have devised a new technology that is able to measure, very accurately, the number of molecules of each protein, and we wish to take on the challenge of building a protein inventory for the best-studied cell, that of baker's yeast. To supplement these data, we also know how to measure how rapidly the parts (proteins) of the cell are made and recycled. This will be the first inventory that leaves the yeast proteins unaltered while they are being measured, and it will be of great value to the biological community. To do this we need to roll this technique out on a grand scale to attempt to quantify over 4000 proteins, requiring a long-term project with expertise in yeast biology, protein chemistry, mass spectrometry and bioinformatics. The protein parts are mostly assembled into machines that do the work of the cell. 
By understanding how such complex machinery is made, we will begin to understand how the cell balances flexibility of response (in time, and in terms of types of machine) with quality control, manufacturing principles and energy costs. Once generated, we will make all our data available to the biological community, both from our own website and through international repositories.

Technical Summary

System level analysis of the cell requires statistically confident knowledge of the amounts of each protein in the cell. The gold standard approach to protein quantification is based on stable isotope internal standards and mass spectrometric determination of the analyte signal relative to that of the standard. For even a simple proteome, such as Saccharomyces cerevisiae, this is a daunting challenge, but, following our discovery and development of QconCAT technology, is now feasible. QconCATs are artificial proteins that comprise concatamers of proteolytic peptides, each of which is an internal standard for quantification of an analyte protein. We will design and build approximately 200 QconCATs to quantify at least 4000 yeast proteins in a demanding study between two Universities with a long track record of collaboration and innovation. We will also conduct robust quality control measures, ensuring very low technical variance, as well as using biological replicates to assess biological variability on a per protein basis. In addition to quantification of each of these proteins, we will use incomplete metabolic labelling to assess the rate at which each protein is turned over (synthesised and degraded) in the cell. These two parameters (quantity and degradation rate) complete the 'state equation' for protein expression, linking transcript level and translational activity and permitting development of a new model of global protein expression. We will generate a wholly new data set that can be used by biologists across the yeast community, and which will inform and develop new systems level analyses of this important model organism. Joint with BB/G009058/1.
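
The arithmetic behind internal-standard quantification and the 'state equation' can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's analysis pipeline: the function names and example numbers are hypothetical, and the sketch assumes one standard peptide per protein, complete digestion, equal instrument response for the labelled standard and the unlabelled analyte, simple first-order loss of label, and a cell at steady state.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def copies_per_cell(analyte_to_standard_ratio, standard_fmol, cell_count):
    """Estimate protein copies per cell from one internal-standard peptide.

    analyte_to_standard_ratio: analyte signal divided by labelled-standard signal
    standard_fmol: femtomoles of labelled standard added to the sample (assumed known)
    cell_count: number of cells that contributed protein to the sample
    """
    if analyte_to_standard_ratio < 0 or standard_fmol <= 0 or cell_count <= 0:
        raise ValueError("ratio must be >= 0; standard amount and cell count must be > 0")
    analyte_mol = analyte_to_standard_ratio * standard_fmol * 1e-15
    return analyte_mol * AVOGADRO / cell_count


def degradation_rate(fraction_label_remaining, hours):
    """First-order rate constant per hour, assuming f(t) = exp(-k * t)."""
    if not 0 < fraction_label_remaining < 1 or hours <= 0:
        raise ValueError("fraction must lie strictly between 0 and 1; hours must be > 0")
    return -math.log(fraction_label_remaining) / hours


def steady_state_synthesis(copies, k_deg):
    """Molecules synthesised per cell per hour if synthesis balances degradation."""
    return copies * k_deg


if __name__ == "__main__":
    # Hypothetical example: ratio 0.5, 100 fmol of standard, ten million cells.
    copies = copies_per_cell(0.5, 100.0, 1e7)
    k = degradation_rate(0.5, 10.0)  # half of the label lost in 10 hours
    print(f"{copies:.0f} copies per cell, k_deg = {k:.4f} per hour")
    print(f"synthesis = {steady_state_synthesis(copies, k):.0f} molecules per cell per hour")
```

In this simplified picture, abundance and degradation rate together fix the synthesis flux, which is the sense in which the two measured parameters complete the description of protein expression. Dilution by cell growth is ignored here and would need to be added for dividing cells.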

Publications

 
Description We lack the ability to count the number of copies of proteins in a cell in a large scale, efficient manner.

Programme still underway. We are preparing a series of outputs that will demonstrate a high quality, quantified yeast data set that can be used as a gold standard for method development, to inform future systems models and to build a model of proteome dynamics and proteostasis.

We have developed technology to deliver absolute quantification of the proteins present in a eukaryotic cell, using yeast as an example system. The method is called "QconCAT" and we have now successfully (and directly) quantified over 1100 proteins. Matched numbers are also being generated for protein turnover. Having reacquired some of the mass spectrometry data on a more modern instrument, we are now able to offer more definitive conclusions on the effects of epitope tagging on the general proteome, by measuring the changes in intracellular protein abundance in yeast for the tagged protein (on-target) and the rest of the proteome (off-target). This shows that there are often large changes to the proteome, which has implications for biotech researchers using epitope-tagged proteins for various applications (such as protein-protein interaction mapping and protein quantification). Moreover, the proteome data have been compared with transcriptome profiling, creating a stronger view of the dynamics of proteostasis.

The QconCAT approach is now well established, and there have been over 1250 citations for the original work. It is broadly used by the scientific community.

Defining intracellular protein concentration is critical in molecular systems biology. Although strategies for determining relative protein changes are available, defining robust absolute values in copies per cell has proven significantly more challenging. Here we present a reference data set quantifying over 1800 Saccharomyces cerevisiae proteins by direct means using protein-specific stable-isotope labeled internal standards and selected reaction monitoring (SRM) mass spectrometry, far exceeding any previous study. This was achieved by careful design of over 100 QconCAT recombinant proteins as standards, defining 1167 proteins in terms of copies per cell and upper limits on a further 668, with robust CVs routinely less than 20%. The selected reaction monitoring-derived proteome is compared with existing quantitative data sets, highlighting the disparities between methodologies. Coupled with a quantification of the transcriptome by RNA-seq taken from the same cells, these data support revised estimates of several fundamental molecular parameters: a total protein count of ~100 million molecules-per-cell, a median of ~1000 proteins-per-transcript, and a linear model of protein translation explaining 70% of the variance in translation rate. This work contributes a "gold-standard" reference yeast proteome (including 532 values based on high quality, dual peptide quantification) that can be widely used in systems models and for other comparative studies.
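
Two of the summary statistics quoted above, the coefficient of variation (CV) across replicates and the median number of proteins per transcript, are simple to compute. The sketch below is illustrative only: the function names, gene labels and numbers are hypothetical and are not taken from the reference data set.

```python
import statistics


def cv_percent(replicate_values):
    """Coefficient of variation (sample standard deviation / mean) as a percentage.

    Needs at least two replicate values, because a sample standard deviation
    is undefined for a single measurement.
    """
    mean = statistics.mean(replicate_values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * statistics.stdev(replicate_values) / mean


def median_proteins_per_transcript(protein_copies, transcript_copies):
    """Median protein-to-mRNA copy ratio over genes measured at both levels."""
    ratios = [
        protein_copies[gene] / transcript_copies[gene]
        for gene in protein_copies
        if transcript_copies.get(gene, 0) > 0
    ]
    return statistics.median(ratios)


if __name__ == "__main__":
    # Hypothetical copies-per-cell values from three biological replicates.
    print(cv_percent([90.0, 100.0, 110.0]))

    # Hypothetical per-gene protein and transcript copy numbers.
    proteins = {"geneA": 1000.0, "geneB": 50000.0, "geneC": 2000.0}
    transcripts = {"geneA": 2.0, "geneB": 10.0, "geneC": 2.0}
    print(median_proteins_per_transcript(proteins, transcripts))
```

Genes without a measured transcript level are skipped rather than treated as zero, since a missing value carries no information about the ratio.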

The approach was also used to focus on specific proteins, notably the chaperone network in yeast. Chaperones are fundamental to regulating the heat shock response, mediating protein recovery from thermally induced misfolding and aggregation. Using the QconCAT strategy and selected reaction monitoring (SRM) for absolute protein quantification, we have determined copy per cell values for 49 key chaperones in Saccharomyces cerevisiae under conditions of normal growth and heat shock. This work extends a previous chemostat quantification study by including up to five Q-peptides per protein to improve confidence in protein quantification. In contrast to the global proteome profile of S. cerevisiae in response to heat shock, which remains largely unchanged as determined by label-free quantification, many of the chaperones are upregulated, with an average two-fold increase in protein abundance. Interestingly, eight of the significantly upregulated chaperones are direct gene targets of heat shock transcription factor-1. By performing absolute quantification of chaperones under heat stress for the first time, we were able to evaluate the individual protein-level response. Furthermore, these SRM data were used to calibrate label-free quantification values for the proteome in absolute terms, thus improving relative quantification between the two conditions. This study significantly enhances the largely transcriptomic data available in the field and illustrates a more nuanced response at the protein level.
Exploitation Route This is a clear demonstration of the utility of the QconCAT technology.
All of the genes we designed are available to the broader community.
Sectors Agriculture, Food and Drink,Manufacturing, including Industrial Biotechnology,Pharmaceuticals and Medical Biotechnology

 
Description This is a five-year programme that has a simply stated but complex aim of quantifying an entire proteome of a key organism of biotechnological and biological significance - the yeast Saccharomyces cerevisiae. It is a basic science programme that will impact on multiple research areas, in industry and academia. Our complete set of accurately quantified yeast proteins has been used in several studies that have aimed to develop a clearer model of protein abundance and its control in the cell. The QconCAT approach has subsequently been developed further, to include DOSCATs and the MEERCAT approach for high-level multiplexing.
First Year Of Impact 2017
Sector Agriculture, Food and Drink,Pharmaceuticals and Medical Biotechnology
 
Description Cafe Scientifique Liverpool 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Public/other audiences
Results and Impact Talk, with Prof Jane Hurst, about how proteins are made, and how their shape dictates function.
Year(s) Of Engagement Activity 2015
 
Description Cafe Scientifique, Glasgow 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Public/other audiences
Results and Impact It's all about the actors, darling!

I had the pleasure of delivering a Café Scientifique discussion in Glasgow in May. This is a very long running forum, and has been run by Professor Mandy McLean and her colleagues since 2004 (Prof McLean received an MBE in 2010 for her public engagement activities - who says outreach doesn't get recognised?).

The title of my talk was 'The cell, a factory run by actors'. The format was a 20-minute introduction (no slides, no projector - a refreshing change that would do us all good from time to time - I prepared so much better without the crutch of slides to prop me up). I introduced myself as a protein chemist, discussed proteins I had discovered in my career and the fun to be had in naming them (the darcin story always gets a good reception), and then explained how 'proteins' (the term coined by Berzelius in 1838 from the Greek πρώτειος ('proteios'), meaning 'of the first rank') had been embedded in literature and the arts (think Kurt Vonnegut in 'Cat's Cradle' and 'Space Oddity', just for starters). I also quoted one of my favourite books, 'For the Love of Enzymes: The Odyssey of a Biochemist', in which the author says, paraphrasing, "DNA and RNA provide the script, but proteins are the actors".

I then directed the audience (between 30 and 40 people) to the first question on my short pub quiz. They were challenged to calculate how many possible proteins of 300 amino acids there could be (prefaced by a chat about polymers, building blocks and the significance of precise order). The audience did it! As Douglas Adams might have said, the answer is 'a hugely, mindbogglingly big number' and far exceeds the number of atoms in the Universe. We then addressed the logical outcome, that the evolution of life on this planet very quickly got locked into a tiny little corner of the hyperdimensional space of 'all possible proteins', and that 'out there' in that hyperdimensional space, there were perfect antimicrobials, cures for all diseases, proteins that could support green chemistry, proteins that were as clear and sparkled like diamonds. If only we knew how to get to them (and synthetic biology is not the answer).
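
For anyone who wants to check the quiz answer, a short Python sketch does the arithmetic. It assumes the 20 standard amino acids and uses the commonly quoted rough estimate of about 10^80 atoms in the observable universe; neither figure is spelled out above, and the function name is purely illustrative.

```python
import math


def count_sequences(length, alphabet_size=20):
    """Number of distinct linear sequences of a given length.

    Each position can hold any of `alphabet_size` building blocks and
    order matters, so the count is alphabet_size ** length.
    """
    if length < 0 or alphabet_size < 1:
        raise ValueError("length must be >= 0 and alphabet_size >= 1")
    return alphabet_size ** length


if __name__ == "__main__":
    total = count_sequences(300)
    # Python integers have arbitrary precision, so the count is exact.
    print("decimal digits:", len(str(total)))
    print("log10 of the count:", round(300 * math.log10(20), 1))
    # Rough, commonly quoted estimate (an assumption, not a measured value).
    atoms_in_universe = 10 ** 80
    print("exceeds atoms in the universe:", total > atoms_in_universe)
```

The count has several hundred decimal digits, which is why it dwarfs any estimate of the number of atoms available to build the proteins from.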

The second part of my short introduction talked about complexity. We discussed the size of a yeast cell (100 cells end to end in one millimetre) and the complexity of this cell compared to an Airbus A380 - the yeast cell has about 60 million molecules, the A380 has only 4 million parts. This led to the final part of my introduction: how do you manage this complexity, controlling the number of each protein a cell needs, and changing those numbers to respond to demand (stimuli)? Is the cell a 'just in time' manufacturer, a 'just in case' manufacturer or a 'rapid recycler'? 

The following discussion (90 minutes, with a very welcome bar break in the middle) was fabulous. This was a switched-on audience - we ranged from pre-biotic evolution through panspermia to insulin secretion as a 'just in time' process. Inevitably, the darcin story took us back to a detailed discussion about the sex life of the mouse, a chance to amuse, inform and extol the value of multidisciplinary collaboration with Jane Hurst and colleagues in Leahurst as behavioural ecologists par excellence. Inevitably we addressed the issue of how we do this work and the need for protein chemists to be 'technologically overstimulated' - especially for the big projects like our recently published study that quantified the number, in copies per cell, of over 2,000 yeast proteins.

I loved every minute of it. The Q&A session was chaired by Vanessa Collingridge, broadcaster and writer, who was terrific.

To any of my colleagues who are thinking about this, my advice is 'go for it!'. Leave the slides behind, don't overplay the minutiae, and enjoy two hours in the company of an interested and intelligent audience who challenge you to jump around in your favourite playground - the subject that brings you in to work every morning with a bounce!

Rob
Year(s) Of Engagement Activity 2016
 
Description STEM 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Schools
Results and Impact Built a strong relationship with Winstanley College, leading to joint grants from RSC and RSoc.
Year(s) Of Engagement Activity 2014,2015,2016