PhosphoX-db: A web-based bioinformatics platform for studying non-canonical phosphorylation

Lead Research Organisation: University of Liverpool
Department Name: Institute of Integrative Biology

Abstract

The key molecules that mediate cellular functions are proteins, which are involved in all processes of life, including the mechanisms by which cells respond to extracellular stimuli. One of the primary methods by which signals are passed between and within cells is via chemical modification of proteins, so-called post-translational modifications (PTMs). These PTMs affect how a protein performs its function, for example by acting as an on-off switch or by enabling (or blocking) interactions with other proteins. Perhaps the most important PTM is the reversible addition of chemical phosphate groups, termed phosphorylation, which is predicted to occur on around half of all human proteins. Changes or mistakes in protein phosphorylation are implicated in many diseases, particularly cancer, where altered phosphorylation of key 'signalling' proteins is a major contributing factor to cancer progression. The enzymes responsible for protein phosphorylation are called kinases, which themselves are often activated by the addition of phosphate, in a 'domino-effect' type cascade mechanism. Ultimately, these phosphorylation 'cascades' result in the activation of proteins called transcription factors, which switch on the expression of many new genes, thus making new proteins so the cell can respond to the signal received. As a result, the majority of current cancer drugs on the market or in development target kinases directly or indirectly.

For nearly half a century, we have known that three of the 20 amino acids that make up proteins are phosphorylated in vertebrates. Excitingly, we have now discovered that an additional 5 amino acids are extensively modified by phosphorylation in human cells; we call the proteins carrying these modifications pX proteins. These novel findings completely change our understanding of the landscape of this type of regulatory protein modification in human cells. Critically, the biological functions and regulatory mechanisms that underlie the addition and removal of this 'missing' phosphate have not yet been explained. In this project, we will develop a web-based database to make the evidence for these new types of phosphorylation readily available to other researchers. We will develop software tools that run within the database to allow us to hypothesise what functions these pX proteins perform in cells, and to allow us and other researchers to start exploring which kinases may be responsible for pX phosphorylation. These tools will also allow other researchers to apply these new findings to their own research questions, and help formulate hypotheses to be followed up in new studies.

Technical Summary

Phosphorylation is one of the most studied post-translational modifications. It has long been assumed that, unlike in bacteria, plants and fungi, vertebrate phosphorylation occurs almost exclusively on three residues: serine (S), threonine (T) and tyrosine (Y) - collectively "pSTY sites". In the past, there have been occasional reports of phosphorylation on histidine and very rarely on five other amino acids (cysteine, lysine, aspartate, arginine and glutamate) - collectively "pX sites". However, pX modifications are highly susceptible to hydrolysis at non-physiological pH and/or elevated temperature, so most of the biochemical techniques traditionally employed are unsuitable for pX site identification. We have recently developed an unbiased strategy for phosphopeptide characterisation, revealing widespread phosphorylation on pX sites and completely changing our understanding of phosphorylation-mediated signalling in vertebrates. Other labs have also recently provided evidence for large-scale pHis protein phosphorylation.

In this project, we will develop bioinformatics tools, delivered through a new web-based database called PhosphoX-db, enabling researchers to explore pX-modified proteins identified in research from our groups and others. We will provide short linear motifs for pX sites, profile longer-range domains and explore the conservation of sites (pX and pSTY) within multiple sequence alignments. We are also pioneering a computational approach to predict the kinases and phosphatases responsible for pX phosphorylation, through analysis of very large-scale protein interaction evidence. We will develop tools for statistical analysis of evidence for pX sites from mass spectrometry data, as well as spectrum visualisation and data provenance records, to give researchers the utmost confidence in these new findings. The results will be made available in a variety of formats to assist in the integration of pX signalling into established pathways for pSTY kinase signalling.
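
As an illustration of what a short linear motif summary for pX sites involves, the sketch below extracts fixed-width sequence windows around modified residues and tabulates per-position residue frequencies. It is a minimal example under stated assumptions, not the PhosphoX-db implementation: the function names (`site_window`, `position_frequencies`), the flank width and the toy sequences are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def site_window(sequence, position, flank=3, pad="-"):
    """Return the residues around a 1-based site position.

    Windows that run past either terminus are padded with `pad`
    so that every window has length 2 * flank + 1.
    """
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    idx = position - 1
    left = sequence[max(0, idx - flank):idx]
    right = sequence[idx + 1:idx + 1 + flank]
    return left.rjust(flank, pad) + sequence[idx] + right.ljust(flank, pad)


def position_frequencies(windows, pad="-"):
    """Per-column residue frequencies across equally sized windows.

    Padding characters are ignored, so each column is normalised
    over the real residues observed at that offset.
    """
    if not windows:
        return []
    width = len(windows[0])
    if any(len(w) != width for w in windows):
        raise ValueError("all windows must have the same length")
    freqs = []
    for col in range(width):
        counts = Counter(w[col] for w in windows if w[col] != pad)
        total = sum(counts.values())
        freqs.append({aa: n / total for aa, n in counts.items()} if total else {})
    return freqs


# Toy, made-up sequences with a hypothetical modified histidine in each.
sites = [("MKHAAGR", 3), ("AGHKLLE", 3)]
windows = [site_window(seq, pos, flank=2) for seq, pos in sites]
profile = position_frequencies(windows)
```

A real motif analysis would also compare these frequencies against a background distribution (for example, all unmodified residues of the same type) before calling any position enriched.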

Planned Impact

There is great potential for wide-scale impact in disease research, particularly cancer. Malfunctioning signalling is implicated in many cancers, and the discovery of new roles for kinases and/or phosphatases is likely to lead to some progressing as potential drug targets. The creation of PhosphoX-db will greatly speed up the transfer of knowledge about non-canonical signalling to pharmaceutical companies. While the data and primary bioinformatics tools within PhosphoX-db will all be open access and free of charge to all, we will hold discussions with pharmaceutical companies about the provision of bespoke analyses or tools, and the protection of IP, as a potential route to economic impact and future sustainability for the site.

The PDRA working on the project will benefit from exposure to a cutting-edge area of science and will ultimately interact with lots of database users from diverse fields, including academic groups and Pharma, making new international connections.

Description We have contributed data analysis and novel informatics pipelines to a major study identifying widespread non-canonical phosphorylation. The main paper associated with this award has now been published in The EMBO Journal, revealing widespread non-canonical phosphorylation, i.e. phosphorylation in mammals beyond pSTY. We have developed in-house web tools, which will shortly be released publicly, for exploring the data more widely.
Exploitation Route The widespread discovery of non-canonical phosphorylation has potential for wide impact on cell signalling research and fundamental bioscience understanding.
Sectors Digital/Communication/Information Technologies (including Software), Healthcare, Pharmaceuticals and Medical Biotechnology

 
Description BBSRC-NSF/BIO PTMeXchange: Globally harmonized re-analysis and sharing of data on post-translational modifications
Amount £310,483 (GBP)
Funding ID BB/S017054/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 10/2019 
End 09/2022