The dynamic ensemble: exploring accessible conformational space

Lead Research Organisation: University of Leicester
Department Name: Biochemistry

Abstract

The functioning of the cells in our bodies depends upon the proper assembly, in time and location, of (often complicated) molecular complexes that nearly always contain proteins. To understand the functioning of these so-called molecular machines, it is imperative that we know how they 'look', i.e. we need to know the arrangement in three-dimensional (3D) space of all the atoms that make up the protein molecule. Moreover, these arrangements are not static; protein molecules vibrate, deform and adapt their shape over time in response to their environment. It has become clear that these dynamic adaptations are key to the functional roles of proteins, which has generated a profound interest in ways to study such dynamical properties.

Structural biology is a field of biological science that studies 3D biomolecular structures, their dynamics and their interactions. Knowledge of these structures is essential in many areas of science and helps us understand the molecular basis of life and disease processes, design better drugs, improve the efficiency of enzymes used in the food, paper or agriculture industry, etc.
X-ray crystallography and Nuclear Magnetic Resonance (NMR) spectroscopy are the two techniques that produce the overwhelming majority of biomolecular structures. Of the more than 100,000 entries in the Protein Data Bank (PDB), a global repository for biomolecular structures, about 86% were determined using X-ray crystallography, while 13% come from NMR. X-ray structures are typically determined at a low temperature (-173 °C) in a crystalline state, and the technique does not naturally reveal information about dynamic properties. In contrast, NMR data are acquired at room temperature in solution and are inherently affected by dynamic averaging processes occurring over a wide range of time scales. This presents both a nuisance and an opportunity.

This proposal aims to develop technology to extract knowledge about protein dynamics from easily available NMR data. We postulate that this information is inherently sufficient to derive a dynamically-representative structural ensemble that better accounts for both the experimental data and the actual protein conformation. Specifically, we plan to:
1. Design and test a software pipeline that extracts this dynamical information about proteins from readily available NMR data and converts it into an easily visualized form.
2. Use the pipeline to conduct a large-scale re-computation of the NMR-derived protein structures contained in the PDB.
3. Test and explore the effect of small molecule docking using dynamically-representative structural ensembles.
4. Set up a generally accessible server for the NMR community to execute the pipeline on their own projects.

The structural information is deposited in the PDB by academic and industrial researchers from all over the world. We think that a better representation of dynamical information will increase the value of these data, yielding higher-quality and more relevant results for use by other scientists to advance our knowledge and understanding of human health, drug discovery, agriculture, etc.

The project will be conducted in Leicester under the supervision of Prof. Vuister, a well-known expert in the field. Leicester's integrated structural biology research environment provides excellent scientific infrastructure and support. In addition, numerous other experts are connected to and support the project, thus extending its impact.

Technical Summary

The ability to define structural conformational heterogeneity of biomolecules is paramount for understanding their functionality and also of crucial importance in structure-based drug design. Nuclear magnetic resonance (NMR) has emerged as the most appropriate technique yielding dynamic information on time scales ranging from nanosecond to millisecond.
We propose an approach for an NMR dynamical analysis that is based on chemical shifts, which are exquisite probes for structural mobility, and NOEs, which are intrinsically dynamically averaged. We postulate that this information is inherently sufficient to derive a dynamically-representative structural ensemble that better accounts for both the experimental data and the actual protein conformation.
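To illustrate why dynamically averaged NOEs carry information about conformational heterogeneity, the sketch below compares the arithmetic mean of an inter-proton distance with the effective distance an NOE would report. It assumes the commonly used r^-6 averaging over conformers, which is a standard approximation and not necessarily the scheme the proposed pipeline would adopt. The function name, the distances and the populations are hypothetical and chosen only for illustration.

```python
def noe_effective_distance(distances):
    """Effective NOE distance for a set of conformers, assuming r^-6 averaging:
    r_eff = (mean(r^-6))^(-1/6). Distances are in angstrom."""
    n = len(distances)
    mean_inverse_sixth = sum(r ** -6 for r in distances) / n
    return mean_inverse_sixth ** (-1.0 / 6.0)


# Hypothetical proton pair: close (2.5 A) in 2 of 10 conformers,
# far apart (6.0 A) in the remaining 8.
distances = [2.5] * 2 + [6.0] * 8

r_mean = sum(distances) / len(distances)
r_eff = noe_effective_distance(distances)

print(f"arithmetic mean distance: {r_mean:.2f} A")
print(f"NOE effective distance:   {r_eff:.2f} A")
```

Because short distances dominate the r^-6 average, the effective distance is much shorter than the arithmetic mean. A single static structure forced to satisfy such a restraint would misrepresent a pair of protons that is far apart most of the time, whereas an ensemble can satisfy it on average.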

The overall aim of this proposal is to identify, extract and describe the dynamical properties of proteins from readily available NMR data. Specifically, we aim to:
1. Design and test a pipeline that extracts the dynamical conformational heterogeneity of proteins from readily available chemical shift and NOE NMR data and encodes that information in dynamically-representative ensembles.
2. Use the pipeline to conduct a large-scale re-computation of the NMR-derived protein structures contained in the PDB.
3. Test and explore the effect of small molecule docking using dynamically-representative structural ensembles.
4. Set up a generally accessible server for the NMR community to execute the pipeline on their own projects.

The project will build upon our extensive experience in NMR protein structure computation and validation. We will build the pipeline upon the existing CING software framework, employing the recently proposed NMR exchange format for data interchange.
We expect the outcomes of the project to significantly enhance our understanding of biomolecular functioning and to yield valuable resources for the structural biology, drug-discovery and biological scientific communities.

Planned Impact

The project fits well with BBSRC's strategic objectives to support bioscience, particularly relating to the fields of biomedicine and drug development. The project outcomes will underpin our molecular understanding of the mechanisms by which proteins exert their function. Thus, the findings will likely contribute to the discovery of new therapeutics, diagnostic techniques or compounds that promote health and well-being. The project also fits well within BBSRC's desire to support fundamental discoveries, in particular in relation to the aspects of structural biology and technology development; and the advanced bioscience research proposed here serves as an example of the acknowledged imperative for the UK to be a leader in the global knowledge economy.

The impact extends to a number of areas:
1. Supporting academic research excellence. The development of proper computational protocols that capture the full breadth of the biomolecules' dynamical conformational space is crucial to the continuing development of the field and to our understanding of biomolecular interactions, and it drives the development of novel drugs and therapeutics. Thus, the project underpins multiple areas of crucial biological research (detailed in the academic beneficiaries section).

2. Support for industrial innovation and research capacity.
On-demand access to highly specialised technology and accompanying expertise is crucial for many forms of industrial innovation. The computational pipeline developed in this project will be made accessible through a server managed by a commercial company acting as intermediary for pay-per-usage access. This arrangement minimizes costs for industry and assures proper security and compliance with ISO standards.

3. Providing a scientifically well-trained professional workforce. The presence of a well-educated and technologically skilled workforce is crucial for societal development. This proposal provides and stimulates direct training in structural biology, biological computer sciences and biosciences research in general, for graduate students and post-docs who will soon contribute to the core of the UK technological workforce. The project involves advanced programming and software development technology, concomitant with a profound understanding of biomolecular structure and interactions. Through the network of the PI, this knowledge will effectively propagate (see pathways to impact), to benefit the wider scientific community.

4. Driving technological developments that benefit the UK population through improved health-care.
Structure-based drug discovery is becoming increasingly essential to the UK and world-wide pharmaceutical industry in its quest to find new drugs to combat a number of very serious health problems, such as cancer, neurodegenerative diseases and antibacterial resistance. The improved dynamical structural ensembles will likely enhance the success of in-silico docking procedures involving small molecules, thereby accelerating the drug-discovery process and ultimately resulting in improved health-care for the UK population.

5. Education of the general public.
Many movies on YouTube attest to the notion that biomolecular structures are fun to look at. Interestingly, these nearly always concern dynamic pictures of biomolecules, the data for which are generally inferred from static views. This proposal is aimed at providing a firmer basis for this dynamical picture. Together with the images generated by many other techniques, e.g. electron microscopy and optical microscopy, this forms the full picture of our cells and organs on different size scales. This understanding of the hierarchical organisation and interplay of molecules, cells and organs in an organism is essential for our modern understanding of biology and forms another direct way to engage the general public with the relevance of, and need for, modern-day approaches to biomedical research.

Description: The project has now yielded results that are currently being prepared for publication.
Exploitation Route: The findings will provide a basis for continuing research.
Sectors: Pharmaceuticals and Medical Biotechnology