Mass spectrometry based structural proteomics

Lead Research Organisation: University of Oxford
Department Name: Oxford Chemistry

Abstract

One of the major challenges in biological science is to capitalise on the wealth of genomic information arising from DNA sequencing and characterise the structure and dynamics of the encoded proteins. However, while genome sequencing has become very rapid, no methods currently exist which can experimentally determine the molecular details of the gene-products on a comparable timescale. As a result, the gulf between our knowledge of which proteins are present in an organism and our understanding of how they orchestrate the cellular processes necessary to life, and of how they malfunction in disease, is ever increasing.

Mass spectrometry (MS) is a traditional physical chemistry approach that has revolutionised the experimental identification of proteins and quantification of their cellular abundances. The success of such MS-based 'proteomics' is underpinned by the robust automation of both data acquisition and analysis. Far from being limited to identifying proteins, MS has recently emerged as an exciting technique for characterising the molecular details of the cellular machines they assemble into. This novel application of MS has allowed the experimental elucidation of the stoichiometry, architecture, and dynamics of protein assemblies. We propose to bridge the technological gap between these two fields of proteomics and structural biology by developing data analysis software to enable the high-throughput characterisation of protein assemblies by means of MS.

Technical Summary

We propose to develop data analysis algorithms and software for the new types of data emerging from 'native' mass spectrometry (MS), an experimental approach with rapidly growing impact in structural biology. The advantages of native MS in characterising the molecular details of protein assemblies are significant, and centre on the speed, sensitivity and generality of the technique. As such, picomole quantities of proteins that are associated with membranes, have regions of intrinsic disorder, or are complicated by polydispersity can all be successfully interrogated on the minute timescale. These native MS experiments can reveal the stoichiometry of the protein assemblies and associated ligands, their oligomeric architecture, and equilibrium fluctuations.

Despite the utility of native MS it is our contention that the technique has yet to fulfil its potential in structural biology. In comparison with MS-based proteomics, in which the full capabilities of the spectrometers are exploited in highly automated experimental and data analysis approaches, the translation of native MS data into structural biology results remains a significant bottleneck. The primary reason for this is the lack of automated means for the robust and reliable interpretation of this new type of data. We will build on our proof-of-principle studies to develop modular and integrated software to automatically extract the oligomeric distribution of proteins, the rate constants of their inter-conversion, and even their likely architecture.

As such we will enable native MS as a tool for high-throughput structural proteomics. Furthermore, our software will allow us to simultaneously obtain figures-of-merit for native MS, and thereby determine rigorous requirements for data integrity. In this way we will address a significant gap in the field of native MS and derive statistically appropriate data standards to inform the delineation of future data deposition protocols.
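To illustrate the kind of interpretation step that the software described in this summary would automate, the sketch below assigns a charge state and mass from two adjacent peaks of a single charge-state series, and then estimates the oligomeric state from a known monomer mass. It is a minimal sketch under stated assumptions, not the project's algorithm: the function names and the 20 kDa monomer are hypothetical, the peaks are assumed to be protonated ions from one well-resolved series, and real native MS spectra with overlapping series require proper deconvolution.

```python
PROTON_MASS = 1.007276  # Da; charge carrier assumed to be a proton (positive-ion mode)


def mz_for(mass, z):
    """Return the m/z of an ion of neutral mass `mass` (Da) carrying z protons."""
    return (mass + z * PROTON_MASS) / z


def assign_charge_and_mass(mz_low, mz_high):
    """Assign charge and mass from two adjacent peaks of one charge-state series.

    mz_low is the peak at lower m/z (charge z + 1) and mz_high the neighbouring
    peak at higher m/z (charge z). Returns (z, mass), where z belongs to mz_high.
    """
    if mz_high <= mz_low:
        raise ValueError("mz_high must be greater than mz_low")
    # (mz_low - p) = M / (z + 1) and (mz_high - mz_low) = M / (z * (z + 1)),
    # so their ratio is z.
    z = round((mz_low - PROTON_MASS) / (mz_high - mz_low))
    mass = z * (mz_high - PROTON_MASS)
    return z, mass


def estimate_stoichiometry(mass, monomer_mass):
    """Return the nearest whole number of monomers that accounts for `mass`."""
    return round(mass / monomer_mass)


if __name__ == "__main__":
    # Hypothetical example: a 12-mer of a 20 kDa monomer observed at 33+ and 32+.
    monomer = 20000.0
    peak_a = mz_for(12 * monomer, 33)
    peak_b = mz_for(12 * monomer, 32)
    z, mass = assign_charge_and_mass(peak_a, peak_b)
    print(z, round(mass, 1), estimate_stoichiometry(mass, monomer))
```

Rounding the charge to the nearest integer is what makes the assignment tolerant of small errors in peak position; with noisy data, using more than two peaks of the series would give a more reliable estimate.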

Planned Impact

The proposed work is not only directly relevant to the Tools and Resources Development Fund call for the development of computational approaches for the biosciences, but will also have practical importance within academia, industry, and ultimately on human health. The research is directly applicable to five BBSRC strategic priority areas: 'technology development for bioscience', 'data-driven biology', 'systems approach to biological research', 'ageing research: lifelong health and wellbeing', and 'increased international collaboration'.

The algorithms and software we propose to develop will have a profound impact on those parties interested in determining the structure and dynamics of proteins, and the influence of ligand binding on these properties. Specifically, the ability to robustly and rapidly screen the binding of candidate drugs, determining their stoichiometry of interaction, binding constants, and impact on structure will be of great interest to pharmaceutical companies. Similarly, there is significant potential in using MS approaches to screen biopharmaceuticals for quality control. For these industrial beneficiaries to embrace such MS strategies, the experimental technique must be complemented by reliable data analysis software of the kind we propose to develop here.

Our work is also of direct interest to manufacturers of MS instrumentation, in which the UK is a world leader. The market for MS in the biosciences is continually expanding, and relies on breakthroughs into novel application areas. The current interest in MS for protein structure determination is primarily within academia, due in large part to the absence of established data analysis approaches and associated standards. This represents a significant technological gap that our proposal will go some way towards bridging.

In general, modern bioscience is becoming increasingly data-driven. For example, in the fields of genomics and proteomics, perhaps the most significant challenge is mining the large volumes of data generated to extract information as to the workings of the cell. The high-throughput nature of the pipeline we propose allows us to envisage the annotation of genomic databases with structural and dynamical insights obtained by means of MS. Furthermore, our software promises to allow the quantitative determination of the biophysical parameters that govern native or aberrant protein assembly, and thereby elucidate molecular differences between healthy and diseased states.

Publications

 
Description Improved means of analysing mass spectrometry data: collision cross-sections, cross-linking MS data, and the raw native MS data.
Exploitation Route The technology is widely used, and this work gives its users a means to better analyse their data and interpret the results.
Sectors Manufacturing, including Industrial Biotechnology; Pharmaceuticals and Medical Biotechnology

 
Description Involvement with industry - software release to the community, and licensing from industry
Sector Manufacturing, including Industrial Biotechnology; Pharmaceuticals and Medical Biotechnology
Impact Types Economic

 
Description Confidence in concept
Amount £23,000 (GBP)
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 05/2014 
End 12/2014
 
Description Impact acceleration
Amount £22,000 (GBP)
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 01/2016 
End 06/2016
 
Description Impact acceleration
Amount £8,000 (GBP)
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 01/2016 
End 08/2016
 
Description Vierling lab 
Organisation University of Massachusetts
Country United States 
Sector Academic/University 
PI Contribution Long-term collaboration - exchange of expertise and reagents, and co-authorship
Collaborator Contribution Long-term collaboration - exchange of expertise and reagents, and co-authorship
Impact See publications
 
Title Degiacomi-Lab/biobox: Biobox - v1.1.1 - minor fixes 
Description Minor fixes to setup.py so that 'python setup.py install' actually installs biobox in site-packages; 'python setup.py build_ext --inplace' now does what 'python setup.py install' did previously. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact We present Biobox, a Python-based toolbox facilitating the implementation of biomolecular modelling methods. 
URL https://zenodo.org/record/6567197
 
Title DynamXL 
Description Enables chemical cross-linking modelling on protein structures 
Type Of Technology Software 
Year Produced 2017 
Open Source License? Yes  
Impact Academic users 
 
Title EMnIM 
Description Software to enable comparison between electron microscopy and ion mobility data 
Type Of Technology Software 
Year Produced 2016 
Open Source License? Yes  
Impact Users in academia and industry 
 
Title Impact 
Description Collision cross-section calculations optimised for structural biology 
Type Of Technology Software 
Year Produced 2015 
Open Source License? Yes  
Impact Users in academia and industry 
 
Title UniDec 
Description Deconvolution software for mass spectrometry 
Type Of Technology Software 
Year Produced 2015 
Open Source License? Yes  
Impact Users in both academia and industry 
 
Description Bratislava Children's University 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Schools
Results and Impact Outreach presentation to school children from across Slovakia
Year(s) Of Engagement Activity 2014
 
Description MPLS blog - paralog Science paper 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact Press release in blog format regarding a high-profile paper
Year(s) Of Engagement Activity 2018
URL https://www.mpls.ox.ac.uk/news/proteins-assemble-study-sheds-new-light-on-our-biochemical-workhorses
 
Description School visit (Montessori) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Schools
Results and Impact Outreach talk to school children aged 8-13
Year(s) Of Engagement Activity 2015
 
Description Twitter 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact Twitter account highlighting research and related areas of interest
Year(s) Of Engagement Activity 2012
URL https://twitter.com/beneschresearch