Supporting the OpenMM Community-led Development of Next-Generation Condensed Matter Modelling Software

Lead Research Organisation: University of Edinburgh
Department Name: School of Chemistry

Abstract

Atomistic simulations are a major application of high-performance computing research, and increasingly underpin innovative R&D processes in the chemical and life sciences industries. OpenMM is the fastest-growing atomistic simulation engine in the current ecosystem of open-source academic software. Originally targeting a biomolecular simulation audience, the OpenMM user base is growing exponentially and has spread to diverse related domains, including materials modelling, quantum chemistry, structural bioinformatics, chemoinformatics, artificial intelligence and machine learning. OpenMM owes its success to a design that achieves an excellent tradeoff between extensibility (via a robust user interface) and performance on GPUs (via auto-generated CUDA kernels) for molecular dynamics (MD) simulations. OpenMM is used standalone or via plugins to other atomistic simulation engines, providing access to GPU-accelerated MD simulation capabilities for the whole atomistic simulation ecosystem.

We have surveyed the OpenMM user community to identify its most pressing needs.

OpenMM is currently maintained by a single core developer who can no longer meet the training and support needs of its rapidly growing user community. We will transition OpenMM to a more sustainable community-driven development model. We will develop training resources to upskill users, and engage the community to widen participation in developing and maintaining OpenMM functionality.

Machine learning (ML) potentials could revolutionise atomistic simulation methodologies. Our community survey has identified strong interest in ANI neural network and GAP Gaussian process regression methods. We will deliver a self-contained GPU-optimised GAP implementation in OpenMM and coordinate with project partners working on an OpenMM ANI implementation to offer the community a library of ML potentials that can be readily plugged into existing simulation engines.

OpenMM must adapt to scientific (ML potentials) and technological (increased hardware heterogeneity) drivers to continue offering its user base an optimised tradeoff between speed and ease of modification over the coming decade. We will integrate into OpenMM a Multi-Level Intermediate Representation (MLIR) compiler to auto-generate, from user-specified Python instructions, optimised low-level code targeting diverse hardware. By enabling users to specify custom atomic featurisation techniques as OpenMM operations, which can be finely interleaved with TensorFlow or PyTorch operations, we will position OpenMM as the simulation engine of choice to support deployment of next-generation ML potentials onto current GPUs and emerging AI hardware accelerators.

Our community has also requested support to facilitate the combined use of independently developed OpenMM software solutions with other software from the broader atomistic simulation ecosystem. This research will develop a standardised interface to integrate OpenMM community software with CCPBioSim's interoperable Python framework BioSimSpace. We will demonstrate integration of all the work packages of this research by producing GAP ML pipelines for two use cases that target grand challenges in soft-condensed matter modelling (organocatalysis, recently recognised by the 2021 Nobel Prize in Chemistry, and protein-ligand binding).

Altogether, this research will position the OpenMM user community at the forefront of next-generation hybrid machine learning/molecular mechanics potentials for soft-condensed matter modelling. Deeper integration with the AI and HPC communities will pave the way for atomistic simulations to harness emerging exascale opportunities. Transitioning from a single-developer to a community-driven governance model will improve the sustainability of the codebase and encourage greater adoption of OpenMM in associated academic communities and industry.
