Applying Machine Learning to Computational Preprocessing of Mass Spectrometry Data
Lead Research Organisation: University of Glasgow
Department Name: School of Computing Science
Abstract
This project investigates the use of Machine Learning (ML) approaches for real-time peak detection and processing in LC/MS data.
People | ORCID iD |
---|---|
Simon Rogers (Primary Supervisor) | |
Ross McBride (Student) | |
Publications


Wandy J (2022) ViMMS 2.0: A framework to develop, test and optimise fragmentation strategies in LC-MS metabolomics, in Journal of Open Source Software
Studentship Projects
Project Reference | Relationship | Related To | Start | End | Student Name |
---|---|---|---|---|---|
EP/R513222/1 | | | 30/09/2018 | 29/09/2023 | |
2441557 | Studentship | EP/R513222/1 | 30/09/2020 | 31/03/2024 | Ross McBride |
EP/T517896/1 | | | 30/09/2020 | 29/09/2025 | |
2441557 | Studentship | EP/T517896/1 | 30/09/2020 | 31/03/2024 | Ross McBride |
Description | This research developed new mass spectrometry fragmentation strategies. Mass spectrometry is a widely used technique in chemical analysis, deployed in many laboratories, and the fragmentation strategy controls what data the instrument collects from an injected sample. Based on these data, we can attempt to identify the chemicals in that sample. Biological samples in particular are complex and chemically diverse, so the fragmentation strategy must make appropriate decisions to collect the most useful data. Any improvement in this area implies potential improvements for all downstream biochemical applications that require accurate annotation of the molecules in a sample. New fragmentation strategies based on RoI (Region of Interest) area overlap and on revisiting areas of increased intensity were developed. They were shown to increase the number of items of interest (as defined by external, standard software) for which pertinent data could be acquired, and to acquire those data at better acquisition times, improving output data quality. This work is described in "TopNEXt: Automatic DDA Exclusion Framework for Multi-Sample Mass Spectrometry Experiments" (publicly available preprint, under review). To verify these claims, the methods were implemented in the open-source Virtual Metabolomics Mass Spectrometer framework (ViMMS), and extensions to it were developed, including a revised evaluation component that reports how successful a method is. These developments contributed to "Simulated-to-real Benchmarking of Acquisition Methods in Metabolomics" (published), which compares different families of existing fragmentation strategies. The results and software are openly available and can in principle be used by any lab with a compatible mass spectrometer. |
Exploitation Route | Source code and method details are public, so they can be used for further development and benchmarking of fragmentation strategies. Alternatively, the results could be used as-is by any lab with a compatible mass spectrometer (currently Thermo Fisher IAPI instruments; the range of compatible instruments may expand in future). Mass spectrometry has been used in metabolomics for disease diagnosis, cancer research and nutrition research, for example, and better data quality could help these and other applications to advance. |
Sectors | Agriculture, Food and Drink; Chemicals; Pharmaceuticals and Medical Biotechnology |
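
The idea of a fragmentation strategy that excludes already-fragmented regions but revisits them when intensity increases can be illustrated with a minimal Python sketch. This is not the TopNEXt or ViMMS implementation: it swaps in a simple point-in-box exclusion test in place of RoI area overlap, and every name and parameter here (`Exclusion`, `select_precursors`, `mz_tol`, `rt_window`, `revisit_factor`) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Exclusion:
    """A rectangular (m/z, retention time) region that was already fragmented."""
    mz_lo: float
    mz_hi: float
    rt_lo: float
    rt_hi: float
    intensity: float  # precursor intensity at the time of fragmentation

    def covers(self, mz: float, rt: float) -> bool:
        return self.mz_lo <= mz <= self.mz_hi and self.rt_lo <= rt <= self.rt_hi


def select_precursors(peaks, rt, exclusions, n=2,
                      mz_tol=0.01, rt_window=15.0, revisit_factor=3.0):
    """Choose up to n precursors from one MS1 scan (illustrative Top-N rule).

    peaks is a list of (mz, intensity) pairs observed at retention time rt.
    A peak inside an active exclusion box is skipped unless its intensity is
    at least revisit_factor times the highest intensity at which that region
    was previously fragmented.
    """
    chosen = []
    for mz, intensity in sorted(peaks, key=lambda p: p[1], reverse=True):
        if len(chosen) == n:
            break
        hits = [e for e in exclusions if e.covers(mz, rt)]
        if hits and intensity < revisit_factor * max(e.intensity for e in hits):
            continue  # already fragmented and not much more intense now
        chosen.append((mz, intensity))
    # Register exclusion boxes only after selection for this scan is complete.
    for mz, intensity in chosen:
        exclusions.append(
            Exclusion(mz - mz_tol, mz + mz_tol, rt, rt + rt_window, intensity)
        )
    return chosen


# Toy data (invented for illustration, not measurements).
exclusions = []
first = select_precursors(
    [(100.0, 1000.0), (200.0, 500.0), (300.0, 100.0)], rt=0.0, exclusions=exclusions
)
second = select_precursors(
    [(100.0, 1200.0), (200.0, 2000.0), (300.0, 100.0)], rt=5.0, exclusions=exclusions
)
third = select_precursors([(100.0, 900.0)], rt=30.0, exclusions=exclusions)
```

In the toy run, the second scan skips the peak at m/z 100 (still excluded, intensity barely changed), revisits m/z 200 (intensity rose fourfold) and picks up the previously unselected m/z 300. By the third scan all exclusion boxes have expired, so m/z 100 becomes eligible again.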
Description | Closed Loop (Metabolomics) EPSRC EP/R018634/1 |
Organisation | University of Glasgow |
Department | Polyomics Facility |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Produced several research and software items relevant to the goals of the Closed Loop Metabolomics project - see publications for most notable examples. |
Collaborator Contribution | Produced shared pool of data usable both by work in the direct scope of Closed Loop and this studentship. Contributed research ideas/software expertise to this studentship. |
Impact | The studentship was undertaken in close collaboration with the Closed Loop EPSRC grant so essentially all of the outcomes can be linked to it in some way. The Closed Loop Metabolomics project itself is inherently multi-disciplinary, including computing scientists, (computational) biologists, statisticians and mass spectrometrists. |
Start Year | 2020 |
Title | ViMMS 2.0: A framework to develop, test and optimise fragmentation strategies in LC-MS metabolomics |
Description | A framework to develop, test and optimise fragmentation strategies in LC-MS metabolomics |
Type Of Technology | Software |
Year Produced | 2023 |
Open Source License? | Yes |
Impact | N/A |
URL | https://zenodo.org/record/7724728 |