Exploring the hidden small proteome of a unicellular eukaryote

Lead Research Organisation: University of Cambridge
Department Name: Biochemistry

Abstract

Our bodies are made of very different types of cells: Skin cells are flat and protect our body, while brain cells have cables that pass messages around. Despite being so different, all our cells carry exactly the same information in their genes. What makes them special is what information they use, that is, which genes they switch on and off.

The information on how to make a cell is stored in the form of a DNA molecule. However, this information cannot be read directly: it first needs to be copied into another molecule called messenger RNA (mRNA), from which it can be 'translated' into a protein. Proteins are the components that directly build the cell and make it function. Cells also produce other RNAs that are not translated to make proteins (non-coding RNAs, or ncRNAs), which have other roles in the cell.

The identity of a protein can be predicted from the sequence of the RNA. Moreover, proteins can also be identified directly using specialized techniques. However, both approaches are very inefficient at identifying very small proteins. Thus, these proteins have been largely ignored by researchers, even though there are examples of small proteins with key biological functions.

A new experimental method has been recently developed that allows the detection of every RNA region that is actively translated in a cell. From these data, all proteins can be predicted regardless of their size. The method is called 'ribosome-profiling' after the ribosome, which is the cellular machine that carries out translation. The application of this approach to several organisms has revealed the existence of hundreds of previously unknown predicted short proteins. Many of these translated regions were in RNAs that were not thought to be translated (ncRNAs). In some organisms, these short proteins may represent as much as 20% of all previously known proteins.

Our aims are to identify small proteins systematically and to understand how they work. One way to study a complicated process of the human body is to use a model organism: this is a simpler creature, but similar enough to allow us to learn about ourselves. To study these questions we will use a simple yeast (made of a single cell) that can adopt different forms. We will use different methods to identify all small proteins produced by these cells. We will then remove individual proteins and study how this affects how cells grow and reproduce.

We expect this information will be useful to understand how human cells behave and, eventually, help us devise cures for disease.

Technical Summary

The prediction and experimental identification of small proteins (<100 amino acids) are challenging. Thus, they have frequently been overlooked and most remain unidentified. However, there is evidence from single-celled and multicellular eukaryotes that small peptides (as short as 11 amino acids) have essential biological roles.

A recently developed approach, called ribosome profiling, has revolutionized the study of small proteins. This method allows the systematic identification of translated regions regardless of their length. The application of this technique to yeast and animals revealed hundreds of translated short open reading frames (sORFs), located in 5' leader sequences, genes annotated as non-coding RNAs and novel transcripts. If the proteins translated from these loci (sORF-encoded peptides, or SEPs) are stable, this would represent a major increase to the complexity of eukaryotic proteomes. For example, our ribosome profiling of the fission yeast Schizosaccharomyces pombe revealed >900 translated sORFs of 20 codons or longer, which correspond to ~20% of the known proteome of this well-studied organism (Duncan and Mata, Nature Structural & Molecular Biology 2014).

Very little is known about the expression and function of SEPs. We will use the model organism S. pombe to study these questions. We will initially create a comprehensive list of sORFs by performing ribosome profiling in additional conditions, and complement these studies with targeted mass spectrometric approaches to identify stable SEPs. We will then systematically study the phenotype of cells in which sORFs have been inactivated or overexpressed. Finally, we will select a small number of SEPs for detailed characterisation.

These studies will provide an overview of the expression and functional importance of this novel class of proteins. We expect that this work will allow the discovery of general principles that may be applicable to human cells.

Planned Impact

This project will contribute to the training of researchers in key areas of research, and will result in knowledge that may have long-term implications in various areas of medical research as described below.

The biotechnology and pharmaceutical industries are potential beneficiaries of this project, both through the training of highly qualified researchers (point 1) and the knowledge and expertise it will generate (points 2 and 3). In addition, the project may contribute to fighting human disease, which would benefit the general public (points 2 and 3). The 'Pathways to Impact' document discusses in detail how we will ensure that the potential beneficiaries of this project will be reached.

[1] Training and capacity building in functional genomics / systems biology. This project will provide an excellent opportunity for the training of the postdoctoral researcher in the analysis of large-scale datasets, both proteomic and genomic. This will be done through the work carried out in the laboratory, as well as through courses and interactions with members of the Cambridge Systems Biology Centre. The provision of scientists trained in these multidisciplinary approaches will be beneficial for UK industry, especially the biotechnology and pharmaceutical sectors. This is also a key objective of the BBSRC strategic plan 'Exploiting new ways of working', which aims to 'enhance skills and capacity to exploit new tools and approaches e.g. through training for researchers'.

[2] General understanding of human disease: Although very little is known about the function of small peptides, there are examples that demonstrate their importance for key biological processes. For example, 30-amino-acid peptides regulate calcium uptake in the heart and have been implicated in cardiac pathologies. This proposal aims to identify general principles of how small peptides regulate biological functions, which may be applicable to human cells.

[3] As described in more detail in the 'Academic Beneficiaries' section, recent work has shown similarities between pathogens of the Pneumocystis genus and fission yeast (in particular, in their meiotic pathways). These organisms cause pneumonia in patients with weakened immune systems (premature babies, AIDS and cancer patients). As Pneumocystis cannot be cultured in vitro, there is a need for model systems that allow the study of their basic biology. Therefore, our results on the function of small proteins might be useful to understand the biology of these pathogens and develop treatments against their infection. To make sure this information reaches the Pneumocystis research community, we will highlight these similarities in peer-reviewed publications, our website and relevant scientific conferences.
 
Description Our bodies are made of very different types of cells: Skin cells are flat and protect our body, while brain cells have cables that pass messages around. Despite being so different, all our cells carry exactly the same information in their genes. What makes them special is what information they use, that is, which genes they switch on and off.
The information on how to make a cell is stored in the form of a DNA molecule. However, this information cannot be read directly: it first needs to be copied into another molecule called messenger RNA (mRNA), from which it can be 'translated' into a protein. Proteins are the components that directly build the cell and make it function. It is very difficult to identify small proteins experimentally, and thus these proteins have been largely ignored by researchers, even though there are examples of small proteins with key biological functions.
A new experimental method has been recently developed that allows the detection of every RNA region that is actively translated in a cell. From these data, all proteins can be predicted regardless of their size. The method is called 'ribosome-profiling' after the ribosome, which is the cellular machine that carries out translation. The application of this approach to several organisms has revealed the existence of hundreds of previously unknown predicted short proteins.
Recent reports have suggested that certain drugs commonly used during this kind of experiment (especially a drug called cycloheximide, or CHX) can lead to an overestimation of the amount of these small peptides. We investigated this potential problem in detail, by performing experiments with and without CHX. Our results were mixed, identifying areas that are affected by this drug, as well as others that are not influenced. These results are important for us and other groups, as they will allow us to design better experiments in the future.
We also tried to identify small proteins that are produced by cells in response to adverse environmental conditions (stress). This led to the identification of a new key protein that regulates the production of mRNAs (see above) in response to lack of amino acids (which are the chemical building blocks of proteins).
Exploitation Route Although it is still early days, we expect that the main impact of the grant will be academic. Moreover, we expect that the methods we develop will be of use to the biotechnology industry. Indeed, we have had strong interest from industry in applying some of these approaches to the study of plant pathogens.

We are also contributing to the competitiveness of the UK by training a postdoctoral researcher in cutting-edge research methods in biotechnology.
Sectors Agriculture, Food and Drink; Manufacturing, including Industrial Biotechnology; Pharmaceuticals and Medical Biotechnology

 
Description Analysis of RNA-seq data of PP2A study
Organisation University of Oslo
Department Biotechnology Centre of Oslo
Country Norway 
Sector Academic/University 
PI Contribution I analysed RNA-seq data for a study that used fission yeast to study cellular differentiation and stress responses.
Collaborator Contribution My collaborators performed the experiments for the study.
Impact Publication in peer-reviewed journal (Current Biology)
Start Year 2017
 
Description Fil1 study 
Organisation University College London
Country United Kingdom 
Sector Academic/University 
PI Contribution This was a study from our laboratory (including design, performance and analysis of most experiments)
Collaborator Contribution Our collaborators performed essential experiments (ChIP-seq) that contributed to our study
Impact Publication in peer-reviewed journal (PNAS, Proceedings of the National Academy of Sciences of the USA)
Start Year 2017
 
Description Cambridge Festival of Ideas - demonstrator - member of the team 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Public/other audiences
Results and Impact >100 members of the public attended event - increased public awareness of scientific research
Year(s) Of Engagement Activity 2017,2018
 
Description Cambridge Festival of Ideas - demonstrator - member of the team 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Public/other audiences
Results and Impact >100 members of the public attended event - increased public awareness of scientific research
Year(s) Of Engagement Activity 2016
 
Description Cambridge Science Festival - demonstrator - member of the team 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Public/other audiences
Results and Impact Stimulated questions and discussion

Designed and ran activities for the Departmental Open Day and other events
Year(s) Of Engagement Activity 2011,2012,2013,2015,2017