Adaptive Multi-Resolution Massively-Multicore Hybrid Dynamics

Lead Research Organisation: University of Bristol
Department Name: Chemistry

Abstract

We propose to develop highly scalable software that will exploit next-generation, heterogeneous, massively parallel processors (such as those found in widely available graphics processors - GPUs) to deliver orders-of-magnitude performance increases for conformational sampling in molecular simulations. The software will be generally applicable to simulations of any condensed phase molecular system. The initial application will be to accelerate the sampling of protein conformational change within the types of simulation used for rational drug design in the pharmaceutical industry.

Future applications of rational drug discovery will depend critically on the ability to model protein conformational change and protein flexibility. Previous successful applications of computational methods in rational drug design targeted proteins that had small, well-defined binding pockets, in proteins that were either relatively rigid, or changed little upon drug binding. Increasingly, medicinally interesting protein targets have large, open and flexible binding sites. To understand binding, computational models have to be able to predict how these sites will change shape upon drug binding. Coupled to this, a new generation of drugs is being developed that target the interactions between protein surfaces, or that require modelling of protein-protein association. In these cases, the binding site is extremely dynamic, as it is formed between two (or more) proteins that have come together. Existing molecular modelling algorithms and software are incapable of stepping up to the challenge of modelling highly flexible proteins. New software and new algorithms are needed urgently to ensure that computational science continues to play an important role in the pharmaceutical industry.

We have designed a new multi-resolution algorithm that will allow the simulation of molecular dynamics to be broken into two parts: a near-field, atomistic part, and a far-field, coarse-grain part.
The near-field part is used to model the interactions between neighbouring molecules, using traditional atomistic forcefields, and uses a standard Monte Carlo (MC) algorithm to model the dynamics of individual atoms. The far-field part models the remaining molecular interactions using a coarse-grain (beaded) forcefield, and uses rigid-body dynamics to model global dynamics (e.g. large-scale protein conformational change). This multi-resolution split of both the dynamics and the modelling of the molecular interactions makes the algorithm ideally suited to heterogeneous computing platforms such as supercomputers equipped with numerical accelerators (e.g. graphics processors). In addition, the software will also be energy-aware, as the energy cost of performing each part of the simulation will be factored into the decision as to which resource it is allocated to. For example, if the results of the simulation were not needed immediately, then the simulation could be diverted from the accelerator, and instead run using low-power processors (e.g. clusters of Intel Atoms, like those found in netbooks). This would give the simulator the choice of minimising the total simulation runtime or the total CO2 cost. While developed for the clusters of today, the software will readily scale to the peta- and exascale supercomputers of tomorrow, where concepts such as software adaptability, energy management and fault-tolerance will be key to achieving efficient scaling and efficient supercomputer utilisation. We hope that one of the lasting impacts of this project will be a promotion of greater understanding of energy-aware algorithms and CO2/energy-aware scheduling in the international HPC community. Our intention is to tackle head-on the issues facing the international HPC community in increasing yet variable energy cost and availability, and the need to significantly improve the energy efficiency and reduce the environmental cost of HPC.
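
The energy-aware allocation described in the abstract (choosing between minimising total runtime and minimising total CO2 cost) can be illustrated with a minimal sketch. This is not the project's scheduler: the `Resource` class, the `choose_resource` function, the deadline rule and all numbers are hypothetical, chosen only to show the trade-off between a fast accelerator and a slower low-power cluster.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Resource:
    """A hypothetical compute resource; every figure is illustrative."""
    name: str
    runtime_hours: float      # estimated wall-clock time for the job
    power_kw: float           # average power draw while running the job
    carbon_kg_per_kwh: float  # assumed carbon intensity of the site's supply

    @property
    def co2_kg(self) -> float:
        # Energy used (kWh) multiplied by the carbon intensity of that energy.
        return self.runtime_hours * self.power_kw * self.carbon_kg_per_kwh


def choose_resource(resources: Sequence[Resource],
                    objective: str = "runtime",
                    deadline_hours: Optional[float] = None) -> Resource:
    """Pick a resource that minimises either runtime or estimated CO2.

    If a deadline is given, only resources that meet it are considered;
    if none can meet it, the fastest resource is used as a fallback.
    """
    if not resources:
        raise ValueError("no resources available")
    if objective not in ("runtime", "co2"):
        raise ValueError("objective must be 'runtime' or 'co2'")

    candidates = list(resources)
    if deadline_hours is not None:
        in_time = [r for r in candidates if r.runtime_hours <= deadline_hours]
        fastest = min(candidates, key=lambda r: r.runtime_hours)
        candidates = in_time or [fastest]

    if objective == "runtime":
        return min(candidates, key=lambda r: r.runtime_hours)
    return min(candidates, key=lambda r: r.co2_kg)


if __name__ == "__main__":
    pool = [
        Resource("gpu-node", runtime_hours=2.0, power_kw=0.4, carbon_kg_per_kwh=0.5),
        Resource("atom-cluster", runtime_hours=20.0, power_kw=0.03, carbon_kg_per_kwh=0.5),
    ]
    print(choose_resource(pool, "runtime").name)
    print(choose_resource(pool, "co2").name)
    print(choose_resource(pool, "co2", deadline_hours=5.0).name)
```

With these made-up figures the low-power cluster is selected when CO2 is the objective and the result is not urgent, while a five-hour deadline pushes the job back onto the accelerator.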

Planned Impact

The software developed during this research will be unique. It will be the first HPC multi-resolution hybrid dynamics program, and it will enable simulations of conformational change that had hitherto been accessible only to a single specialised molecular dynamics program (Desmond) running on a unique, custom-designed special-purpose computer platform (Anton). The immediate impact of this software will therefore be to give scientists around the world the ability to run the equivalent of long-timescale molecular dynamics trajectories using widely-available graphics processors (GPUs), and/or using local or national HPC resources. The software building blocks developed during this project will be released as a separate library, enabling users in industry and academia to develop other novel hybrid dynamics programs. In addition, to broaden the impact to the wider community, we plan in stage 2 to port these building blocks into widely available molecular modelling frameworks (e.g. OpenMM) and will work with developers of existing dynamics software to see how these building blocks could be used to implement accelerated multi-resolution hybrid dynamics algorithms in existing codes. This will ensure that the impact and benefits of this software will be available to the widest possible international community of industrial and academic molecular modellers. The immediate beneficiaries of this software will be industrial and academic scientists interested in accounting accurately for protein conformational change within thermodynamic simulations, such as those used to predict protein-drug or protein-protein binding. This will be of great benefit to industrial scientists working in the pharmaceutical industry studying flexible protein-drug systems. The software and algorithms developed feed nicely into the International Exascale Software Project (IESP), of which SMS is an active member.
IESP's mission is to stimulate the creation of new algorithms and methods that will scale to the first exascale systems due to go online in 2018. In addition, the energy-aware, fault-tolerant queuing system developed as part of this project is highly novel, and would be of great use to international developers of HPC software. We will release this system as an open source library, with sufficient documentation to allow it to be used by other developers for a wide variety of HPC applications. The ability to create an energy/CO2-aware cloud computing platform provides a unique point of difference in an emerging and rapidly growing market. We would therefore seek to partner with cloud computing providers to allow commercialisation of the energy/CO2-aware cloud computing components, e.g. via licensing, or by forming a spin-off company that would handle support, provide access to cloud computing resources, and provide the enabled software (such as the hybrid dynamics software) via a software-as-a-service (SaaS) web-based portal. This project has the potential to put orders of magnitude more performance, performance per dollar and performance per watt into the hands of molecular modellers in industry and academia by significantly increasing the rate of adoption of GPU-based accelerators. The commercial and scientific benefits of this step-change in capability are potentially enormous for industry and academia. We hope that one of the lasting impacts of this project will be a promotion of greater understanding of energy-aware algorithms and CO2/energy-aware scheduling in the international HPC community. Our intention is to tackle head-on the issues facing the international HPC community in increasing yet variable energy cost and availability, and in tax regimes designed to encourage reduction in CO2 emissions. The long-term impacts will thus be on international HPC policy, and also in significantly improving the energy efficiency and reducing the environmental cost of HPC.
 
Description Recent advances in computational hardware, software and algorithms enable simulations of protein-ligand complexes to achieve timescales during which complete ligand binding and unbinding pathways can be observed. While observation of such events can promote understanding of binding and unbinding pathways, it does not alone provide information about the molecular drivers for protein-ligand association, nor guidance on how a ligand could be optimised to better bind to the protein. We have developed the WaterSwap (C. J. Woods et al., J. Chem. Phys., 2011, 134, 054114) absolute binding free energy method that calculates binding affinities by exchanging the ligand with an equivalent volume of water. A significant advantage of this method is that the binding free energy is calculated using a single reaction coordinate from a single simulation. This has enabled the development of new visualisations of binding affinities based on free energy decompositions to per-residue and per-water molecule components. These provide a clear picture of which protein-ligand interactions are strong, and which active site water molecules are stabilised or destabilised upon binding. Optimisation of the algorithms underlying the decomposition enables near-real-time visualisation, allowing these calculations to be used either to provide interactive feedback to a ligand designer, or to provide run-time analysis of protein-ligand molecular dynamics simulations.
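
The per-residue and per-water decomposition mentioned above can be sketched as follows. This record does not state which free energy estimator WaterSwap uses for the decomposition, so the sketch assumes thermodynamic integration over a single coordinate lambda, with the free energy gradient already split into additive components. The function names, residue labels and gradient values are hypothetical.

```python
from typing import Dict, Sequence


def trapezoid(x: Sequence[float], y: Sequence[float]) -> float:
    """Integrate y over x with the trapezoidal rule."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length, with at least 2 points")
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))


def decompose_free_energy(
    lambdas: Sequence[float],
    gradients: Dict[str, Sequence[float]],
) -> Dict[str, float]:
    """Integrate each component's mean dE/dlambda over lambda.

    `gradients` maps a component label (a residue or a water molecule)
    to its average gradient at each lambda value. Because integration is
    linear, the components sum to the integral of the total gradient.
    """
    return {label: trapezoid(lambdas, values)
            for label, values in gradients.items()}


if __name__ == "__main__":
    # Illustrative numbers only (think kcal/mol); not simulation output.
    lambdas = [0.0, 0.5, 1.0]
    gradients = {
        "ARG292": [-3.0, -2.0, -1.0],
        "GLU119": [0.5, 0.5, 0.5],
        "WAT17": [1.0, 0.0, -1.0],
    }
    components = decompose_free_energy(lambdas, gradients)
    for label, value in sorted(components.items(), key=lambda kv: kv[1]):
        print(f"{label:8s} {value:6.2f}")
    print(f"{'total':8s} {sum(components.values()):6.2f}")
```

Sorting the components, as in the usage above, is one simple way to see which interactions contribute most strongly; a visualisation would map the same numbers onto the structure.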

Completed Project Outputs

21/10/11 Invited talk, "Understanding the emergence of drug resistance in influenza using GPU-accelerated molecular dynamics simulations", TargetMeeting 1st World Drug Discovery Online Conference, http://targetmeeting.com/Modules/Meetings/MeetingDetails.aspx?Id=16

2-4/5/12 Poster, "Aspire: Advanced Software for Adaptive Dynamics", "NAIS: State-of-the-art Algorithms for Molecular Dynamics", Edinburgh, http://www.nais.org.uk/MD2012/Home.php

29/5/12 Paper, "Long time scale GPU dynamics reveal the mechanism of drug resistance of the dual mutant I223R/H275Y neuraminidase from H1N1-2009 influenza virus", Biochemistry, doi: 10.1021/bi300561n

24-27/6/12 Poster plus demo, "HPC in the Cloud made easy: Giving every scientist the keys to cloud computing"

Poster, "How H1N1 'Swine' Flu virus develops drug resistance: Simulations using GPUs reveal how mutations disrupt drug binding", "CMS 2012", Cirencester, http://www.chm.bris.ac.uk/cms/index.html

3/7/12 Invited poster plus demo, "How H1N1 'Swine' Flu virus develops drug resistance: Simulations using GPUs reveal how mutations disrupt drug binding", "Emerald Grand Opening", RAL

16-18/10/12 Invited talk, "Using time-dependent binding free energy calculations to study the effects of mutation on drug binding and unbinding", TargetMeeting 2nd World Drug Discovery Online Conference, http://targetmeeting.com/Modules/Meetings/MeetingDetails.aspx?Id=45

25/10/12 Invited industrial seminar, "Using state-of-the-art computational chemistry to study the effects of mutation on drug binding and unbinding", UCB, Slough

2/1/13 BBSRC follow-on grant awarded, "Inquire: Software for real-time analysis of binding", BB/K016601/1

21/2/13 Invited seminar, "Introduction to Sire", Edinburgh

13/3/13 Invited talk, "The role of water in drug binding and unbinding", "Emerald Industry Day", RAL

25/3/13 Poster, "MGMS: Molecular modelling using cloud computing", London, http://comp.chem.nottingham.ac.uk/mmucc/

25-27/3/13 Poster, "The role of water in drug binding and unbinding: Direct observation of drug unbinding and rebinding in simulation", "CCP-Biosim Annual Conference", Nottingham, http://ccpbiosim.ac.uk/?q=annualconfs/conf2013

15/4/13 Invited industrial seminar, "The role of water in drug binding and unbinding", Cresset, Welwyn Garden City

24/7/13 Software used as part of the CCP5 2013 Summer School. Software was received well, and will likely be used at future CCP5 Summer Schools, http://www.ccp5.ac.uk/events/summer_school_2013

Other outputs

August 13 "Analysis and assay of oseltamivir-resistant mutants of influenza neuraminidase via direct observation of drug unbinding and rebinding in simulation" (Biochemistry), and "Computational Assay of H7N9 Neuraminidase reveals a Potentially Drug Resistant Strain" (Scientific Reports)

August 13 Published the Acquire software with documentation and unit tests. Submitted for validation by the SSI.

3-5/9/13 Invited talk, "Detecting drug resistant mutants using waterswap absolute binding free energy calculations", "DrugDesign2013", Oxford, http://lpmhealthcare.com/Drugs2013/Agenda.htm

Sept. 13 Submit paper "Methods in Modern Molecular Monte Carlo" (collaboration with J. Michel at Edinburgh)

Autumn 13 Software used as part of undergraduate teaching at the University of Cardiff.

6/12/13 "Rapid Decomposition and Visualisation of Protein-Ligand Binding Free Energies by Residue and by Water", for Faraday Discussion #169 - Molecular Simulations and Visualization

Exploitation Route These tools are being applied and tested by academic and industrial partners through CCP-BioSim (see ccpbiosim.ac.uk) and HECBioSim (see HECBioSim.ac.uk) and in CCP5 (ccp5.ac.uk).
Sire: An advanced, multiscale, molecular simulation framework
Sire is written to allow computational modelers to quickly prototype and develop new algorithms for molecular simulation and molecular design.
Sire is written as a collection of libraries, each of which contains self-contained and robust C++/Python building blocks. These building blocks are vectorised and thread-aware and can be streamed (saved/loaded) to and from a version-controlled and tagged binary format, thereby allowing them to be combined together easily to build custom multi-processor molecular simulation applications.
Sectors Chemicals; Pharmaceuticals and Medical Biotechnology

URL http://www.siremol.org
 
Description Application of methods in the pharmaceutical industry. Sire: An advanced, multiscale, molecular simulation framework. Sire is written to allow computational modelers to quickly prototype and develop new algorithms for molecular simulation and molecular design. Sire is written as a collection of libraries, each of which contains self-contained and robust C++/Python building blocks. These building blocks are vectorised and thread-aware and can be streamed (saved/loaded) to and from a version-controlled and tagged binary format, thereby allowing them to be combined together easily to build custom multi-processor molecular simulation applications.
First Year Of Impact 2012
Sector Chemicals; Manufacturing, including Industrial Biotechnology; Pharmaceuticals and Medical Biotechnology
Impact Types Economic

 
Description BBSRC Tools and Techniques: Computational tools for enzyme engineering: bridging the gap between enzymologists and expert simulation
Amount £146,027 (GBP)
Funding ID BB/L018756/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 07/2014 
End 01/2016
 
Title Acquire job scheduler 
Description Demo release of the Acquire job scheduler. The prototype version of the software developed to ease submission and management of jobs to distributed HPC clusters. 
Type Of Technology Grid Application 
Year Produced 2012 
Impact Demo release of the Acquire job scheduler. The prototype version of the software developed to ease submission and management of jobs to distributed HPC clusters. 
URL http://ssi-amrmmhd.epcc.ed.ac.uk/conspire/The_HPC_Cloud.html
 
Title Acquire job scheduler 
Description Production release of the Acquire job scheduler. The production version of the software developed to ease submission and management of jobs to distributed HPC clusters. 
Type Of Technology Grid Application 
Year Produced 2013 
Impact Used as part of CCP5 training workshops in Manchester in 2013 and 2014. 
URL http://ssi-amrmmhd.epcc.ed.ac.uk/conspire/The_HPC_Cloud.html
 
Title FESetup 
Description FESetup is a tool to automate the setup of (relative) alchemical free energy simulations like thermodynamic integration (TI) and free energy perturbation (FEP) as well as post-processing methods like MM-PBSA and LIE. FESetup can also be used for general simulation setup ("equilibration") through an abstract MD engine. The latest releases are available from the project web page.
Type Of Technology Software 
Year Produced 2017 
Impact FESetup is a tool to automate the setup of (relative) alchemical free energy simulations like thermodynamic integration (TI) and free energy perturbation (FEP) as well as post-processing methods like MM-PBSA and LIE. FESetup can also be used for general simulation setup ("equilibration") through an abstract MD engine. The latest releases are available from the project web page.
 
Title Sire 2012.1 
Description 2012.1 release of the Sire molecular simulation framework. The main enhancement was the creation of code to simplify the running of waterswap absolute binding free energy calculations. This included new algorithms for automatically generating z-matrices of molecules, automatically loading the molecular structure and parameters from Amber format topology/coordinate files, and the development and inclusion of new algorithms for speeding up the calculation (reflection sphere, grid-based electrostatics etc.).
Type Of Technology Software 
Year Produced 2012 
Open Source License? Yes  
Impact This version of the code was used to run some of the simulations in "Computational Assay of H7N9 Influenza Neuraminidase Reveals R292K Mutation Reduces Drug Binding Affinity" Scientific Reports, doi:10.1038/srep03561 The emergence of a novel H7N9 avian influenza that infects humans is a serious cause for concern. Of the genome sequences of H7N9 neuraminidase available, one contains a substitution of arginine to lysine at position 292, suggesting a potential for reduced drug binding efficacy. We have performed molecular dynamics simulations of oseltamivir, zanamivir and peramivir bound to H7N9, H7N9-R292K, and a structurally related H11N9 neuraminidase. They show that H7N9 neuraminidase is structurally homologous to H11N9, binding the drugs in identical modes. The simulations reveal that the R292K mutation disrupts drug binding in H7N9 in a comparable manner to that observed experimentally for H11N9-R292K. Absolute binding free energy calculations with the WaterSwap method confirm a reduction in binding affinity. This indicates that the efficacy of antiviral drugs against H7N9-R292K will be reduced. Simulations can assist in predicting disruption of binding caused by mutations in neuraminidase, thereby providing a computational 'assay.' 
URL http://www.siremol.org/Sire/Home.html