PTM-AI: Improving the detection and functional characterization of post-translational modifications
Lead Research Organisation:
EMBL - European Bioinformatics Institute
Department Name: OMICs
Abstract
Cells respond to stress via rapid signalling through post-translational modifications (PTMs) of proteins. Protein phosphorylation is by far the most studied PTM, although other PTMs are receiving increasing attention. Mass spectrometry (MS)-based proteomics techniques are becoming increasingly central to the life sciences and to personalised medicine studies, and are the most widely used experimental approach for studying PTMs.
PTM-enriched proteomics datasets are complex to analyse. A significant fraction of the generated mass spectra still cannot be assigned to a peptide sequence and therefore remains unidentified. As a result, the data generated in these studies cannot yet be used to their full potential, and novel analysis approaches for proteomics datasets are needed. Beyond data analysis, a common challenge is to extract biologically and functionally relevant information from the proteomics results, such as a list of detected PTMs (e.g. phosphosites). Currently it is hard to prioritise the detected PTMs for downstream analysis, which can involve expensive follow-on studies.
Artificial Intelligence (AI) approaches, including Machine Learning (ML) and Deep Learning (DL), are revolutionising proteomics, enabling improvements in many steps of the proteomics analysis workflow. These developments have largely been enabled by the wide availability of datasets in the public domain. The PRIDE database (European Bioinformatics Institute, EMBL-EBI, UK) is the world-leading proteomics data repository, accounting for >80% of stored datasets worldwide. UniProt (EMBL-EBI) is the most widely used protein knowledgebase and is increasingly incorporating PTM data, including information about their functional relevance.
In this proposal, called PTM-AI, we will use AI to further leverage the large volume of public proteomics datasets to improve the detection and functional characterization of PTMs. PTM-AI includes the teams in charge of the world-leading resources PRIDE and UniProt, and two international groups active in AI approaches for proteomics: the Beltrao (Switzerland) and Renard/Schlaffner (Germany) groups.