Structure and folding of proteins with identical tandemly arrayed domains

Lead Research Organisation: University of York
Department Name: Biology

Abstract

Proteins carry out a vast array of essential processes in the human body. To do this, the thousands of different proteins adopt different shapes (folds) that are appropriate to their task. However, it is very easy for these shapes to become disrupted, and such "misfolded" proteins tend to stick together and play an important role in many diseases (such as Alzheimer's disease) that are particularly associated with ageing. So correct protein folding, and the prevention of misfolding, are very important for health. In the proposed research, we aim to study a newly discovered protein fold that is found in a protein, SasG, from Staphylococcus aureus. It is interesting because, given what we currently know about protein folding, SasG should be very unstable, but it isn't, and should also misfold, but it doesn't. When molecules are discovered that challenge our current understanding of an important process, such as protein folding, it is very exciting because by studying them we are likely to significantly improve our knowledge of that process. That is, the fact that SasG appears to "break the rules" of protein stability and folding means we don't fully understand the rules. What we aim to do is to understand why SasG is stable, how it folds up to its normal shape and how it avoids misfolding. By studying how proteins fold, why they misfold and how some proteins avoid misfolding, we will make an important contribution to a wide range of research. We might also help in the future development of therapeutic strategies to reduce the protein misfolding that leads to disease. SasG is a very repetitive, elongated protein; such proteins have potential uses in nanotechnology. Thus we will also perform experiments that will help establish its potential. Finally, as SasG has a novel structure, we will solve the structure of another protein with characteristics related to SasG, to see if it also has an unusual structure that appears to break the rules.

The strategy we will use is straightforward, so is likely to work. We will change many features of SasG, one at a time, to see if we can make it unstable or cause it to misfold. This will then tell us which features are important for stability and correct folding. We will use a combination of both well-established and state-of-the-art techniques to perform the experiments. The well-established techniques involve the use of increasing concentrations of a chemical that makes proteins unfold; the more unstable the protein, the less chemical required for unfolding. Related experiments allow us to extract the rates of folding and unfolding. Other more recent, but still well-established, techniques involve pulling on the protein to measure the force required for unfolding. In the state-of-the-art techniques we will put two labels on the protein that are able to "communicate" depending on how far apart they are. We place them so they will be too far apart to "talk" when the protein is correctly folded but close enough when the protein is misfolded. The use of lasers allows the communication between the labels to be detected on single molecules.
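As a rough illustration of the two measurement ideas described above, the sketch below uses standard textbook relations rather than anything specific to this project: a two-state chemical denaturation model, in which the free energy of unfolding falls linearly with denaturant concentration, and the Förster relation, in which energy transfer between two labels falls off with the sixth power of their separation. The function names and all numerical values are hypothetical and chosen only for illustration; they are not measurements on SasG.

```python
import math

R_GAS = 8.314e-3  # gas constant in kJ/(mol*K)


def fraction_unfolded(denaturant_molar, dg_water, m_value, temp_kelvin=298.15):
    """Two-state model: dG_unfolding = dg_water - m_value * [denaturant].

    dg_water is the unfolding free energy in water (kJ/mol) and m_value is
    its dependence on denaturant concentration (kJ/mol/M). Both are
    hypothetical inputs here.
    """
    dg = dg_water - m_value * denaturant_molar
    return 1.0 / (1.0 + math.exp(dg / (R_GAS * temp_kelvin)))


def midpoint(dg_water, m_value):
    """Denaturant concentration at which half of the molecules are unfolded."""
    return dg_water / m_value


def fret_efficiency(distance_nm, r0_nm):
    """Forster relation: E = 1 / (1 + (r / R0)^6), with R0 the Forster distance."""
    return 1.0 / (1.0 + (distance_nm / r0_nm) ** 6)


if __name__ == "__main__":
    # A less stable variant (smaller dg_water) unfolds at lower denaturant.
    for label, dg_water in [("stable", 30.0), ("destabilised", 15.0)]:
        print(label, "midpoint (M):", midpoint(dg_water, m_value=5.0))

    # Labels far apart (folded) transfer little energy; close labels transfer a lot.
    for r in (2.5, 5.0, 10.0):
        print("distance", r, "nm -> efficiency", round(fret_efficiency(r, r0_nm=5.0), 3))
```

In this picture, a mutation that lowers the unfolding free energy in water shifts the unfolding midpoint to a lower denaturant concentration, and a pair of labels placed further apart than the Förster distance in the correctly folded protein gives a high transfer signal only when a misfolding event brings them together.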

The work will be carried out in the labs of Dr Jennifer Potts at the University of York (where the structures of SasG were recently determined) and of Prof Jane Clarke, at the University of Cambridge, who is an expert on protein folding and misfolding and has recently developed the sensitive methods to detect misfolding events.

Technical Summary

Very recent work from the Clarke lab used single-molecule techniques to detect misfolding events in a synthetic construct containing adjacent domains of the same fold and identical sequence. These misfolding events disappeared when the sequence identity between domains was reduced to ~40%. Many multi-domain proteins contain strings of domains with the same fold and these misfolding events are thus the likely explanation for the apparent evolutionary pressure to maintain sequence diversity in adjacent domains in almost all such proteins.
Very recent work in the Potts lab has solved the structure of a native protein with virtually identical tandemly arrayed domains. In SasG, a Staphylococcus aureus surface protein involved in bacterial biofilm formation, domain identity is driven at the DNA level and appears to be a mechanism facilitating antigenic variation. The protein domains lack the normally expected compact hydrophobic core and we propose this novel structure has evolved to avoid misfolding events when the presence of identical domains is otherwise advantageous. The overall aim of the research is to use SasG to enhance our understanding of how proteins fold and, importantly, how they avoid misfolding.

The specific objectives of the proposed research are to:
1. Identify the sources of thermodynamic stability and determine the folding mechanism of the single-layer beta-sheet SasG domains.
2. Test a number of different SasG constructs for rare misfolding events using a recently-developed single molecule FRET technique. Are specific residues important for resisting misfolding?
3. Understand the role of SasG domains and specific residues in resisting mechanical unfolding.
4. Determine the structure of a different bacterial protein with identical tandemly arrayed domains (to establish if it also has an unusual fold).
5. Lay the groundwork for potential applications of designed SasG constructs of specific length in nanotechnology.

Planned Impact

This research proposal seeks to investigate a novel multidomain protein whose structure challenges many of our ideas about how proteins fold and how they avoid misfolding. Our hypothesis is that it is the extraordinary non-globular structure that protects the proteins against misfolding. SasG, the protein of interest, is one of a series of extra-cellular bacterial proteins that contain nearly identical duplicated repeats. We are also going to investigate the structure of a second such protein where no structural information is available at all.

One group of beneficiaries is the community of protein folding researchers, in particular those who use computational methods. Our studies of these proteins will allow us to reassess our understanding of how proteins fold. SasG is predicted to be unfolded by most algorithms that predict disorder, yet it is clearly not. In the "omics" era, computational methods have become particularly important for analysis of, and prediction from, the wealth of genomic and structural data. In the fields of bioinformatics and structure prediction, force fields and algorithms are based on experimental data, so studies of novel folds such as that of SasG are essential for furthering our understanding.

The two post-docs to be employed on this project are also beneficiaries. The strong collaboration between our two groups, which provide different yet complementary sets of skills, represents an outstanding opportunity for two young scientists to be trained in world-class laboratories. They will then be in an excellent position to continue their careers, whether in academia or, very likely, in industry, where there is a real national demand for highly skilled scientists with the skill set we are offering. Of the many students and post-docs to have passed through the Clarke laboratory, for instance, a significant proportion are now employed in the biotechnology industry, some in small start-up companies, and others in companies which have now been taken into the mainstream pharmaceutical industry sector (such as Medimmune).

Although this is not explicitly a project which aims to have a technological outcome, it may have some future applications, resulting in some economic or well-being benefits. The proteins we are investigating are involved in biofilm formation. Biofilms are of significant economic importance both in terms of medicine (where they are detrimental as they are involved in bacterial infection) and in the industrial context (where they may be either detrimental or helpful). We are not directly investigating biofilm formation, yet a fundamental understanding of the biophysical and mechanical properties of the proteins which mediate biofilm formation may be valuable to underpin future research in this area. The use of biological molecules in nanotechnology is in its infancy. We will investigate whether SasG, a highly soluble, rigid molecule, may be suitable to provide molecules on the nanometre length scale that are tuneable both in terms of size and in terms of sites which can be functionalised.

Publications

Devine P (2017) Investigating the Structural Compaction of Biomolecules Upon Transition to the Gas-Phase Using ESI-TWIMS-MS in Journal of the American Society for Mass Spectrometry

Gruszka DT (2012) Staphylococcal biofilm-forming protein has a contiguous rod-like structure. in Proceedings of the National Academy of Sciences of the United States of America

Gruszka DT (2016) Disorder drives cooperative folding in a multidomain protein. in Proceedings of the National Academy of Sciences of the United States of America

Whelan F (2021) Periscope Proteins are variable-length regulators of bacterial cell surface interactions. in Proceedings of the National Academy of Sciences of the United States of America

Whelan F (2019) Defining the remarkable structural malleability of a bacterial surface protein Rib domain implicated in infection. in Proceedings of the National Academy of Sciences of the United States of America

 
Description Bacteria such as Staphylococcus aureus use proteins on their surface to adhere together in colonies called biofilms. These are a particular problem in infections of medical devices. We investigated how the structure of these proteins is related to their function. We have demonstrated, using a wide variety of biophysical techniques, that the surface protein forms a single-chain protein rod on the 100 nm scale and that this rod is unusually mechanically strong. Our work has also revealed how this structure is formed. Our results suggest that the rod formation of the protein is important in projecting another region of the protein (that binds to human cells) away from the surface of the bacterium.
Exploitation Route Our work demonstrates that protein rods of tuneable length can be formed. These might have applications in nanotechnology, protein engineering and synthetic biology. Our work also aids the understanding of the role of these proteins in biofilm formation and host colonisation.
Sectors Manufacturing, including Industrial Biotechnology, Pharmaceuticals and Medical Biotechnology

 
Description Impact is still developing. Two postdoctoral researchers have received excellent training in a wide range of biophysical techniques. They have attended international conferences, raising the profile of this UK-based research. The work also led to a conference organised in the UK by the PI and a US-based collaborator. It took place in 2015; several international scientists spoke and feedback was excellent.
First Year Of Impact 2015
Sector Education
Impact Types Economic

 
Description A cryoEM facility
Amount £1,600,000 (GBP)
Funding ID 206161/Z/17/Z 
Organisation Wellcome Trust 
Department Wellcome Trust Bloomsbury Centre
Sector Charity/Non Profit
Country United Kingdom
Start 08/2017 
End 07/2022
 
Description BBSRC ALERT14
Amount £319,000 (GBP)
Funding ID BB/M012697/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 01/2015 
End 01/2016
 
Description BBSRC IAA
Amount £19,907 (GBP)
Funding ID BB/S506795/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 08/2018 
End 01/2019
 
Description BHF Non-clinical PhD studentship
Amount £107,000 (GBP)
Funding ID FS/14/72/31067 
Organisation British Heart Foundation (BHF) 
Sector Charity/Non Profit
Country United Kingdom
Start 02/2016 
End 01/2019
 
Description BHF Project Grant
Amount £235,981 (GBP)
Funding ID PG/16/5/31912 
Organisation British Heart Foundation (BHF) 
Sector Charity/Non Profit
Country United Kingdom
Start 09/2016 
End 08/2019
 
Description British Heart Foundation Non-Clinical PhD Studentship
Amount £107,157 (GBP)
Funding ID FS/17/11/32688 
Organisation British Heart Foundation (BHF) 
Sector Charity/Non Profit
Country United Kingdom
Start 10/2017 
End 09/2020
 
Description MRC Discovery Award
Amount £679,802 (GBP)
Funding ID MC_PC_15073 
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 03/2016 
End 09/2017
 
Description Multi-user equipment grant
Amount £475,000 (GBP)
Organisation Wellcome Trust 
Sector Charity/Non Profit
Country United Kingdom
Start 09/2013 
End 08/2018
 
Description Repetitive proteins_cam 
Organisation University of Cambridge
Country United Kingdom 
Sector Academic/University 
PI Contribution My laboratory performed the structural biology on the project.
Collaborator Contribution The collaborator's laboratory performed the protein folding studies.
Impact This is a multi-disciplinary collaboration: protein structure, protein folding, single molecule fluorescence, atomic force microscopy.
Start Year 2011
 
Description Repetitive proteins_leeds 
Organisation University of Leeds
Country United Kingdom 
Sector Academic/University 
PI Contribution My laboratory performed the structural biology on the project.
Collaborator Contribution The collaborator's laboratory performed atomic force microscopy experiments.
Impact The project is multi-disciplinary; structural biology, protein folding, single molecule fluorescence, atomic force microscopy.
Start Year 2012
 
Description repeats_ebi 
Organisation EMBL European Bioinformatics Institute (EMBL - EBI)
Country United Kingdom 
Sector Academic/University 
PI Contribution I have developed a hypothesis that is now being investigated.
Collaborator Contribution Investigating hypotheses based on studies of individual proteins using bioinformatics analyses of large databases.
Impact Multidisciplinary; biophysics and bioinformatics
Start Year 2016
 
Description repeats_sheff 
Organisation University of Sheffield
Country United Kingdom 
Sector Academic/University 
PI Contribution I am co-supervising a PhD student (based in Sheffield) who is working on applications of repeat proteins.
Collaborator Contribution The partner was awarded the studentship and is contributing the ideas and potential applications for the molecular system discovered in my lab.
Impact No outputs to date. Multidisciplinary: structural biology, electronics and nanoscience.
Start Year 2016
 
Description repeats_yale 
Organisation Yale University
Country United States 
Sector Academic/University 
PI Contribution Structural biology
Collaborator Contribution Protein engineering expertise
Impact We are joint organisers of a conference in the UK in 2015
Start Year 2013