Development of Molecular Docking Software utilising GPGPU's

Lead Research Organisation: University of Bristol

Department Name: Biochemistry

Abstract

The targets of many drugs are protein molecules. Such drugs bind to a particular protein (the drug target), and this interaction affects the function that the protein performs. The target protein may be one that is vital to the function of an infectious micro-organism; an example is the binding of AZT to the HIV reverse-transcriptase protein, which stops the virus hijacking the human cellular machinery to replicate its own genetic material. Alternatively, the drug target might be a human protein, an example being the binding of the antidepressant Prozac to the serotonin receptor, preventing this receptor from removing serotonin and so boosting serotonin levels and hence the patient's mood. In many cases an appropriate drug target protein is known for a particular disease, in which case scientists can develop assays (tests) to determine whether a small drug-like molecule can bind to the target protein and appropriately affect its function. Under these propitious circumstances, pharmaceutical companies may perform a "high throughput screen", whereby robotic equipment is used to test many hundreds of thousands of real small drug-like molecules using the assay. Where successful, this process can identify so-called "lead molecules", which are the starting point for further development and assessment to produce an effective drug. Unsurprisingly, high throughput screening is a costly, time-consuming process and, of course, a company can only screen those compounds that it actually possesses, whereas the number of possible chemical compounds that could exist is much larger. Hence, for several decades, scientists have tried to use cunning computer programs to perform something similar to a high throughput screen in a computer.
Good progress was made in this endeavour during the 1990s, leading to two basic approaches: virtual screening, which looks at chemical similarities between molecules, and molecular docking, whereby the small molecules are "docked" into known protein structures (like a ship docking at a specific place in a port) and the affinity (or stickiness) of the small molecule for the protein is estimated. However, progress has been modest in the last decade for two related reasons. Firstly, the number of docking poses (small-molecule positions) that can be tested depends on computing power. Secondly, assessing the binding affinity (stickiness) by computation is intrinsically difficult (no one has invented a very good, quick way to do this), and this too depends on computing power. The purpose of this grant is to develop software to exploit a new type of processing hardware that is normally used in modern computers for handling the display graphics: the Graphics Processing Unit (GPU). The advantage offered by a GPU is the large number (hundreds to thousands) of simple processing cores that it contains. This design of hardware is extremely efficient when the computing problem involves many very similar but independent calculations, such as calculating the pixel properties on a screen. The same situation pertains in molecular docking, where many independent calculations of the positions of atoms, and of the interactions between them, must be performed. Our preliminary experiments show that we can speed up molecular docking at least tenfold while using the same amount of power that a conventional Central Processing Unit (CPU) would use. This, in turn, allows us to exploit a new way, which we have invented, of calculating the binding energy (stickiness) of the small drug-like molecule to the protein target of interest. By the end of the grant we will have a fully functional molecular docking program that exploits GPU technology, which we will share with other researchers.
We will also have assessed whether our new method of calculating binding energies is good enough to warrant further development. The ability to screen a large number of small drug-like molecules accurately against a protein target would be a significant step forward for drug discovery.

Technical Summary

Understanding, and hence predicting, the interactions between molecules is both of fundamental importance in biology at the molecular scale and of great practical use in drug discovery. The direct calculation of the binding free energy of a ligand, such as a drug-like molecule, to a receptor, such as a protein, is still a challenging problem. This is due to the accuracy required in calculating the potential energy of states (typically an atomistic molecular mechanics forcefield is used, perhaps describing regions of the system with quantum mechanics) and the large number of states that must be sampled (particularly when atomistic water models are used) to generate both the enthalpic and entropic components of the free energy of binding. At the other end of the scale lie fast methods such as virtual screening (mostly based on pharmacophore matching) and molecular docking, where the ligand binding pose is predicted and the interaction between ligand and protein is assessed by a fast scoring function.
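For reference, the enthalpic and entropic components mentioned above combine in the standard thermodynamic relation below, and the standard binding free energy is tied to the experimentally measured dissociation constant. The symbols are introduced here for illustration and do not appear in the original summary: T is the absolute temperature, R the gas constant, K_d the dissociation constant and c° the standard concentration (1 mol/L).

```latex
\Delta G_{\mathrm{bind}} = \Delta H_{\mathrm{bind}} - T\,\Delta S_{\mathrm{bind}},
\qquad
\Delta G^{\circ}_{\mathrm{bind}} = RT \,\ln\!\left(\frac{K_d}{c^{\circ}}\right)
```

A tighter binder has a smaller K_d and therefore a more negative standard free energy of binding.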
The research proposed here aims to improve the performance of the molecular docking method of ligand binding prediction. We have invented an atom-atom based free energy forcefield which is designed to predict the free energy of binding, in an analogous fashion to the calculation of potential energies in a molecular mechanics forcefield. Crucially, the change in solvation energy on complex formation is captured via an atom pairwise energy function. We are employing GPGPU technology to accelerate the docking, since the many independent pose calculations and independent atom-atom calculations map well to GPGPU architecture. The work builds on previous research in our laboratory. By the end of this project a GPGPU-accelerated molecular docking program (BUDE), written in C++ and OpenCL, for flexible protein-ligand and protein-protein docking will be available to the academic community. Validation of the empirical free energy forcefield approach will have been performed.
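The structure described above, a pairwise atom-atom energy summed over a pose and repeated independently for many poses, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not BUDE's code or its forcefield: the atom types, the parameter values, the piecewise-linear form of `pair_energy`, and all function names are hypothetical. On a GPU each pose (or each atom pair) would typically be handled by its own work-item; here an ordinary list comprehension stands in for that parallel loop.

```python
import math
from itertools import product

# Hypothetical parameters (illustrative values only, not from any real
# forcefield). For each unordered pair of atom types: an energy at close
# contact (arbitrary units), the distance up to which that full energy
# applies, and the distance at which it has faded to zero (angstroms).
PAIR_PARAMS = {
    frozenset(["polar"]): (-1.0, 3.0, 5.0),
    frozenset(["apolar"]): (-0.5, 4.0, 6.0),
    frozenset(["polar", "apolar"]): (0.3, 3.5, 5.5),
}


def pair_energy(type_a, type_b, r):
    """Assumed piecewise-linear pair term: flat, then fading linearly to 0."""
    energy, r_full, r_zero = PAIR_PARAMS[frozenset([type_a, type_b])]
    if r <= r_full:
        return energy
    if r >= r_zero:
        return 0.0
    return energy * (r_zero - r) / (r_zero - r_full)


def score_pose(ligand, receptor):
    """Sum the pair term over every ligand-receptor atom pair."""
    return sum(
        pair_energy(l_type, r_type, math.dist(l_pos, r_pos))
        for (l_type, l_pos), (r_type, r_pos) in product(ligand, receptor)
    )


def translate(ligand, shift):
    """Return a copy of the ligand moved rigidly by `shift`."""
    return [(t, tuple(c + s for c, s in zip(pos, shift))) for t, pos in ligand]


# Toy system: two receptor atoms and a one-atom ligand.
receptor = [("polar", (0.0, 0.0, 0.0)), ("apolar", (4.0, 0.0, 0.0))]
ligand = [("polar", (0.0, 3.0, 0.0))]

# Each pose is scored without reference to any other pose, which is the
# property that lets the work be spread across many GPU cores.
shifts = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 10.0, 0.0)]
scores = [score_pose(translate(ligand, s), receptor) for s in shifts]
best = min(range(len(shifts)), key=scores.__getitem__)
print(f"best pose: shift={shifts[best]}, score={scores[best]:.3f}")
```

Because `score_pose` reads only its own pose and the fixed receptor, the poses can be evaluated in any order or all at once; a real docking code would also sample rotations and conformations, which this sketch omits.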

Planned Impact

There are at least 10 research groups spread over four Departments or Schools within Bristol University alone that have real-world problems that could benefit from better molecular docking software, and all are keen to use the product of this research. These are all experimental groups working to understand the structure and mechanisms of biochemical processes at the atomic and molecular level, or seeking to find small-molecule inhibitors of biochemical processes. Such experiments are used to validate drug targets and may provide the very first steps taken in the drug discovery process. Hence we believe the impact of success in this project to be profound. A number of commercial companies are already expressing interest in this research, ranging from large pharmaceutical enterprises (Novartis) to commercial molecular modelling software providers (Cresset) and large computer companies (Apple). Greater sampling and a more detailed approach to the estimation of binding affinities are both enabled by GPGPU technology. Our strategy for providing better molecular docking software, BUDE, is based on exploiting both of these factors.
Software that facilitates basic biochemical research by helping experimental scientists better understand and predict the interaction between molecules will have an incalculable benefit. Software that has the potential to kick-start the drug discovery process by reducing the time spent to find hit molecules and aid the development of these by medicinal chemistry to lead molecules and further to fully fledged drug candidates has clear benefits, both in reducing the cost of the development of new drugs and allowing a more rapid response to pandemic diseases. All of these outcomes have enormous societal and economic benefits.

Funded Value: £120,391

Funded Period: Sep 12 - Feb 14

Funder: BBSRC

Project Status: Closed

Project Category: Research Grant

Project Reference: BB/K004050/1

Principal Investigator: Richard Sessions

Research Subject: Bioengineering (20%); Biomolecules & biochemistry (60%); Tools, technologies & methods (20%)

Research Topic: Biophysics (20%); Catalysis & enzymology (20%); Protein engineering (20%); Protein folding / misfolding (20%); Theoretical biology (20%)

Organisations

People	ORCID iD
Richard Sessions (Principal Investigator)
Noah Linden (Co-Investigator)
Simon McIntosh-Smith (Co-Investigator)

Publications

Brogan AP (2014) Molecular dynamics simulations reveal a dielectric-responsive coronal structure in protein-polymer surfactant hybrid nanoconstructs. in Journal of the American Chemical Society

McIntosh-Smith S (2014) High performance in silico virtual drug screening on many-core processors in The International Journal of High Performance Computing Applications

Pérez B (2018) Insight into the molecular mechanism behind PEG-mediated stabilization of biofluid lipases. in Scientific reports

Shoemark DK (2018) Intraring allostery controls the function and assembly of a hetero-oligomeric class II chaperonin. in FASEB journal : official publication of the Federation of American Societies for Experimental Biology

Shoemark DK (2018) The dynamical interplay between a megadalton peptide nanocage and solutes probed by microsecond atomistic MD; implications for design. in Physical chemistry chemical physics : PCCP

Smith SA (2019) Antiproliferative and Antimigratory Effects of a Novel YAP-TEAD Interaction Inhibitor Identified Using in Silico Molecular Docking. in Journal of medicinal chemistry

Wood CW (2014) CCBuilder: an interactive web-based tool for building, designing and assessing coiled-coil protein assemblies. in Bioinformatics (Oxford, England)

Wood CW (2017) ISAMBARD: an open-source computational environment for biomolecular analysis, modelling and design. in Bioinformatics (Oxford, England)

Key Findings

Description	A fast method for more accurate virtual screening to facilitate drug discovery
Exploitation Route	Already used in drug discovery and protein design. Further methodological development is required but is so far unfunded.
Sectors	Digital/Communication/Information Technologies (including Software); Manufacturing, including Industrial Biotechnology; Pharmaceuticals and Medical Biotechnology

Impact Summary

Description	Several academic and commercial groups are evaluating the docking software
First Year Of Impact	2009
Sector	Digital/Communication/Information Technologies (including Software); Manufacturing, including Industrial Biotechnology; Pharmaceuticals and Medical Biotechnology
Impact Types	Societal, Economic

Collaboration

Description	PoPPI
Organisation	University of Leeds
Country	United Kingdom
Sector	Academic/University
PI Contribution	The groups of Sessions and Woolfson are providing and developing the software tools BUDE and ISAMBARD as the molecular modelling component of a collaborative EPSRC Programme Grant with the University of Leeds, which searches for small molecules and small-molecule scaffolds for the perturbation of protein-protein interfaces.
Collaborator Contribution	A. Wilson (PI): biochemistry and assays; A. Nelson: small-molecule synthesis; T. Edwards: crystallography
Impact	see website
Start Year	2016

Software and Technical Products

Title	BUDE
Description	Molecular docking program accelerated with GPUs
Type Of Technology	Software
Year Produced	2014
Impact	Used in protein prediction software; used in a pilot National Compound Database study funded by the RSC; used for prediction of binders to therapeutically interesting proteins for hit discovery.
URL	http://www.bristol.ac.uk/biochemistry/research/bude
