PTM-AI: Improving the detection and functional characterization of post-translational modifications
Lead Research Organisation:
European Bioinformatics Institute
Department Name: OMICs
Abstract
Cells respond to stress via rapid signalling through post-translational modifications (PTMs) of proteins. Protein phosphorylation is by far the most studied PTM, although other PTMs are increasingly being studied as well. Mass spectrometry (MS)-based proteomics techniques are becoming increasingly central to life sciences and personalised medicine studies, and are the most widely used experimental approach for studying PTMs.
PTM-enriched proteomics datasets are complex to analyse. A significant fraction of the generated mass spectra still cannot be assigned to a peptide sequence and therefore remains unidentified. As a result, the data generated in these studies cannot yet be used to their full potential, and novel analysis approaches for proteomics datasets are needed. Beyond data analysis, a common challenge is to extract biologically and functionally relevant information from the proteomics results, which include, for example, a list of detected PTMs (e.g. phosphosites). It is currently hard to prioritise the detected PTMs for downstream analysis, which can involve expensive follow-on studies.
Artificial Intelligence (AI) approaches, including Machine Learning (ML) and Deep Learning (DL), are revolutionising proteomics, enabling improvements in many steps of the proteomics analysis workflow. These developments have largely been enabled by the wide availability of datasets in the public domain. The PRIDE database (European Bioinformatics Institute, EMBL-EBI, UK) is the world-leading proteomics data repository, accounting for >80% of stored datasets worldwide. UniProt (EMBL-EBI) is the most used protein knowledgebase and is increasingly incorporating PTM data, including information about the functional relevance of PTMs.
In this proposal, called PTM-AI, we will use AI to further leverage the large volume of public proteomics datasets to improve the detection and functional characterization of PTMs. PTM-AI includes the teams in charge of the world-leading resources PRIDE and UniProt, and two international groups active in AI approaches for proteomics: the Beltrao group (Switzerland) and the Renard/Schlaffner groups (Germany).
Publications
Omenn GS
(2024)
The 2024 Report on the Human Proteome from the HUPO Human Proteome Project.
in Journal of proteome research
| Title | Availability of PTM (post translational modification) data in UniProt |
| Description | Via the PTMeXchange Consortium we aim to link PTM data as shown in UniProt to the original mass spectrometry (MS) evidence in proteomics data repositories such as PRIDE. Since the original PTMeXchange grant finished, we have continued to integrate PTM data in UniProt through the PTM-AI grant. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2023 |
| Provided To Others? | Yes |
| Impact | Scientists can now start to access reliable PTM data (phosphorylation, and soon also ubiquitination, acetylation, SUMOylation and methylation) in UniProt. |
| URL | https://www.uniprot.org/ |
