Lead Research Organisation: University of York
Department Name: Biology


Proteins are polymers of amino acids that underpin most of the critical biological processes in nature, including in the human body. Proteins can also be produced to perform specific tasks in medicine and biotechnology; frequently this task is to bind to another type of molecule in a highly specific manner. Antibodies are a good example; they are a natural part of the human immune system but are also used routinely in clinical and biotechnological applications. A well-known example in biotechnology is the home pregnancy testing kit; most of these use an antibody that binds to a molecule found in the urine of women who are pregnant.

Antibodies are complex molecules and they are challenging to produce. Significant effort and expense have therefore been focused on other types of proteins that could perform the task of binding to a specific molecule but with fewer production and operational drawbacks than antibodies. The first requirement of such molecules is that they can be engineered or evolved to recognize and bind a specific target molecule. This has already been achieved for many examples of proteins that mimic antibody function. A second advantageous property is resistance to aggregation. Aggregation occurs when proteins "clump" together irreversibly; it causes substantial losses in the production of proteins for biotechnology and medicine.

Serendipitously, we have discovered a family of proteins that are highly resistant to aggregation. Remarkably, the first member of this family that we have tested is also highly amenable to chemical engineering; that is, it can be modified using synthetic chemistry approaches. One drawback of using proteins in biotechnology is that their composition is limited to the chemistry of the 20 naturally occurring proteinogenic amino acids. Chemical modification of proteins in the laboratory offers scientists the opportunity to introduce new chemical, and therefore functional, properties to proteins of interest, which can greatly enhance their applications as research, clinical and biotech tools.

Our proposal gathers an interdisciplinary team with expertise in protein biochemistry, chemical biology and engineering. We have designed a series of experiments that will demonstrate exciting ways to exploit the remarkable properties of the protein we have identified. The data we generate will help to show that this new protein can be modified and used for a wide range of applications in bioscience.

The application is high risk because we propose a very stringent and unusual range of experiments for these biological molecules; most proteins would not remain functional under the conditions we will be using. But the application also has high potential for reward because the experiments proposed are for the first of a new family of proteins we have discovered with similar properties (particularly the ability to unfold and refold at high concentration without aggregating); thus we think this work will lead to a very large number of proteins with broad new functionality to transform bioscience and biotechnology.

Technical Summary

Previous studies have shown that multi-domain proteins are prone to aggregation if they contain tandemly-arrayed domains with high sequence identity. Thus, in an apparent evolutionary strategy to minimise inter-domain aggregation, most adjacent domains in multi-domain proteins have less than 50% sequence identity. However, we noticed that many bacterial surface proteins have highly identical tandemly-arrayed protein domains. We realised that these proteins were likely to have evolved aggregation resistance, and our preliminary data show that this is the case. We also realised that we could exploit the properties of these domains for use as engineered binding scaffolds. Importantly, their natural resistance to aggregation would provide a key advantage over other scaffolds. To be useful as a binding scaffold, the domains need to be highly engineerable or evolvable. We report preliminary data that demonstrate the extreme engineerability of the first domain we tested. Importantly, aggregation resistance is maintained following significant modification of the native primary sequence.

Three high risk, high reward Work Packages will investigate how the properties of this domain could be exploited to create novel functional molecules for transformative bioscience and biotechnology applications. The ability of the domain to be labelled in the unfolded state enables a broad range of chemical functionalisation and "caging" of functional groups to create responsive molecules. The ability to array domains without enhancing misfolding or inducing aggregation will enable production of protein scaffolds with high binding avidity and/or specificity. Lastly, efficient refolding of arrayed and functionalised domains on a surface (when combined with WPs 1 and 2) would enable production of regenerable and responsive biosensors.

We have discovered many other aggregation-resistant domains; thus this proposal will act as a template for future broad applications of these proteins.

Planned Impact

Our Work Packages are designed to provide stringent tests of our ability to exploit a novel aggregation-resistant and highly engineerable protein domain to create molecules for novel applications in bioscience.

The proteins we are working on will also have impact in basic bioscience in several areas, including:

1) As a chemically modifiable tool for imaging and other cell-based assays.
Assays would include contexts where the ability to detect and illuminate the presence of many different molecules in a highly specific manner is key.
We believe our work will demonstrate that the SHIRT domain can be chemically modified in the unfolded state and then refolded to reform a functional domain. This capacity will enable a wide range of previously inaccessible experiments, including the ability to generate a response to a molecule rather than just detecting its presence.

2) For coupling to regenerable surfaces for highly specific protein-based biodetection of target ligands.
Applications of SHIRT domains potentially include robust, regenerable biosensors for the ex vivo monitoring of, for example, substances in urine or blood (including metabolites, hormones or pathogens), and for monitoring water quality and food-borne pathogens. We will also test our ability to "cage" fluorophores or other functional groups to create molecules that can respond to (including revealing the presence of) a particular molecule in ex vivo or in vitro applications. The molecules developed in this project might be particularly useful for biosensors that are used in harsh environments (for example, those lacking refrigeration), which would interest researchers from low- and middle-income countries as well as those who work in challenging field environments.

Outside of basic bioscience, we envisage the work will impact companies that produce biosensors and imaging molecules and their employees. Other potential impact would be felt by the general public, who would benefit from increased sensitivity and specificity of non-antibody scaffolds in a variety of monitoring applications including water and food safety. Lastly, it is important to note that the use of non-antibody scaffolds to replace antibodies is aligned with the NC3Rs objectives of replacing, reducing and refining the use of animals for scientific purposes.

The researcher employed on the application will learn new skills and will be trained in research project management in an interdisciplinary environment. Such training could provide economic benefit as they pursue their future career in industry or academia. We will communicate our work to the general public through events such as the Festival of Ideas and Open Lectures. The work is an attractively simple example of how work in one field (understanding the structure of bacterial proteins in infection) can lead, serendipitously, to advances in an unexpected area (protein folding and aggregation), and thus will reinforce the importance of basic science in parallel with more applied approaches.

