Materials and Molecular Modelling Exascale Design and Development Working Group

Lead Research Organisation: University College London
Department Name: Chemistry

Abstract

High Performance Computers (HPC), or supercomputers, offer exciting opportunities in understanding, developing and increasingly predicting the properties of complex materials through atomistic and electronic structure modelling; and the scope and power of our computational techniques continue to expand as the capability of the hardware grows. The advent of exascale systems is the next dramatic step in this evolution. Such a system is costly both to purchase and to run, so it is imperative that appropriate software is developed before users gain access to exascale facilities. The investigators of this project are internationally leading experts in developing (enabling new science) and optimising (making simulations more efficient) state-of-the-art materials software for running simulations on HPC, based here and abroad. Software that we have developed is used both in academia and in industry. Currently, our community consumes over a third of the UK's HPC facility (ARCHER) of approximately 120,000 cores (with a peak performance of 2.5x10^15 Flop/s) and we do not anticipate any delays in immediately getting the most out of the successor to ARCHER, which will be composed of approximately 750,000 cores (estimated at ~28x10^15 Flop/s) when it becomes available later this year to the UK academic community. Exascale computers (by definition >10^18 Flop/s) will be composed of many more cores. However, it is anticipated that the next generations of national computers will not provide a smooth transition from the existing infrastructures, but instead will undergo a step change for the UK national facilities, with a shift from conventional CPU-based architectures to CPUs hosting (multiple) many-core accelerators. Many, if not all, of our software packages will require major changes before these architectures can be fully exploited.
Appropriate changes to the software will include reducing data exchange between cores, managing communications between CPUs and accelerators, and adapting our procedures for handling input and output data.
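
To put the performance figures quoted above in perspective, the short Python sketch below works through the arithmetic. It uses only the core counts and peak rates stated in the abstract; the assumption that performance scales linearly with core count is ours, made purely for illustration, and the function names are hypothetical.

```python
# Figures quoted in the abstract (peak performance in Flop/s).
ARCHER_CORES = 120_000
ARCHER_PEAK = 2.5e15
SUCCESSOR_CORES = 750_000
SUCCESSOR_PEAK = 28e15
EXASCALE = 1e18  # threshold used in the abstract's definition of exascale


def per_core(peak: float, cores: int) -> float:
    """Average peak rate per core, in Flop/s."""
    return peak / cores


def cores_needed(target: float, peak: float, cores: int) -> float:
    """Cores needed to reach `target` Flop/s at the same per-core rate.

    Assumption (illustrative only): performance scales linearly with cores.
    """
    return target / per_core(peak, cores)


print(f"ARCHER per core:     {per_core(ARCHER_PEAK, ARCHER_CORES) / 1e9:.1f} GFlop/s")
print(f"Successor per core:  {per_core(SUCCESSOR_PEAK, SUCCESSOR_CORES) / 1e9:.1f} GFlop/s")
print(f"Exascale vs ARCHER:  {EXASCALE / ARCHER_PEAK:.0f}x")
print(f"Exascale vs successor: {EXASCALE / SUCCESSOR_PEAK:.1f}x")
print(
    "Cores for 1 EFlop/s at successor per-core rate: "
    f"{cores_needed(EXASCALE, SUCCESSOR_PEAK, SUCCESSOR_CORES) / 1e6:.1f} million"
)
```

Under this naive scaling, reaching 10^18 Flop/s with conventional cores alone would take tens of millions of cores, which is consistent with the expectation stated above that exascale machines will rely on many-core accelerators.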
As requested in the Call, we will form a design and development working group (DDWG) by bringing together Research Software Engineers (RSEs) and experts from mathematics and computer science with a wide range of domain experts in Materials and Molecular Modelling (MMM). We represent a very large and important community which has long established mechanisms for development, resource management, and disseminating best practices that the DDWG will exploit. To maximise the impact of the available UK funds for exascale computing, we will identify solutions that will benefit most of our community.

Our DDWG aims to separate out the fundamental mathematics of the problem from the computer science of implementation. We will exploit best current practices and those under development in our domain and across other disciplines, in particular targeting libraries that can be called by many materials software packages and that offer a route to heterogeneous architectures. In a complementary development, we will tackle new workflows to manage and analyse vast volumes of simulation data. We will gain valuable experience from our regular meetings and meetings with other DDWGs, and from knowledge transfer with our national and international project partners. Undertaking the initial work and identifying what is required (work earmarked as part of the next funding stage) will enrich our expertise and facilitate international collaborations with developers of materials software and users of overseas exascale computers.

This work enables the UK MMM community to use exascale HPC resources efficiently to address many EPSRC Grand Challenges, including Emergence and Nanoscale Design of Functional Materials (Physics); Dial a Molecule and Directed Assembly of Extended Structures with Targeted Properties (Chemical Sciences); and Engineering From Atoms to Applications (Engineering).

Planned Impact

The dramatic increase in research capabilities brought about by the onset of exascale computing in the fields of Materials and Molecular Modelling will propagate to practically every area of the economy and the daily lives of ordinary people in the UK and across the world.

Materials performance underpins a large number of industrial processes, which are instrumental in maintaining global wealth and health, as well as playing a key role in developing processes that are both environmentally and economically sustainable. Our community has always led the early use of latest HEC hardware and our successful exploitation of Exascale Hardware will have an impact on: the industrial sector, including chemicals, pharmaceuticals, biomaterials, energy, and electronics industries; on society more generally; and on academic communities in chemistry, physics, materials and life and earth sciences, and computational science. Bringing together our community along with experts outside of our domain will ensure the continuing leadership of UK science in a strongly competitive field.

The specific areas of impact will be:

(i) Industry, where modelling and simulation are now integral tools in the design and optimisation of materials. Simulations made possible via the exploitation of exascale hardware will have direct relevance to industries and our members have active collaborations with several UK industries, including Johnson Matthey, AstraZeneca, Glaxo Smith Kline, Pfizer, Bristol-Myers Squibb, Process Systems Enterprise Limited, Britest Limited, Perceptive Engineering Ltd and BP. These industrial links enable the project to contribute to the long-term, continuing competitiveness of the UK economy.

(ii) The General Public and policy makers, to whom the work of our community will be communicated via the MMM Hub, MCC and UKCP websites, CCP9, and a variety of outreach events, through which we will promote the key role of materials developments and computational modelling in areas of general interest to the public, including energy technologies and policy.

(iii) Academic Groups - both experimental and computational - where the extensive network of our community will ensure the effective dissemination of its results, with much of the work feeding into other projects. The software developed will be of wide benefit to both academic and industrial users.

(iv) RSE Staff - This is a high-profile, community effort in which RSE staff play a pivotal role. Engagement with this project will raise the profile of the RSE staff, promote greater engagement with the UK and international materials modelling communities, and offer many opportunities to gain experience with cutting-edge high-performance computing techniques in a research context, significantly enhancing the RSEs' career opportunities. Beyond the immediate RSE staff, this project further develops the UK RSE skill base to support exascale computing, ensuring UK researchers have the support they need to exploit exascale systems worldwide. The leading-edge skills and techniques employed in this project serve as exemplars to the HPC community, and will also lead to a "trickle down" effect, bringing significant improvements to smaller scale Tier 2 and Tier 3 HPC projects. In addition to domain-specific conferences, the work will be highlighted to the RSE community via the RSE Slack Channel and presented at the UK RSE Conference, as well as HPC conferences (e.g. Supercomputing and PASC).

Description This project assembled a large community of experts in computational modelling of materials and molecules, to determine what our needs were for the next generation of supercomputers ("exascale computers") and what science such machines would enable. Seven key scientific approaches and applications were identified, from extremely accurate, computationally demanding quantum mechanical simulations, to more approximate, faster methods using parameterised interatomic forces or machine-learning. We also examined the challenges our various modelling programs will face when used on exascale computers, and how to enhance and re-engineer the software to overcome the challenges and take full advantage of exascale machines.
Exploitation Route The primary goals were to identify the scientific needs for exascale high-performance computing (HPC), and the software challenges faced in exploiting exascale HPC. The scientific case could be used to help build a scientific and business case for an exascale facility, e.g. to HM Govt. The identified challenges are directly useful for research software engineers and scientific software developers, as well as computer hardware vendors - particularly as we seek to co-design hardware/software solutions for the future.
Sectors Aerospace, Defence and Marine; Chemicals; Construction; Digital/Communication/Information Technologies (including Software); Education; Electronics; Energy; Environment; Healthcare; Manufacturing, including Industrial Biotechnology; Pharmaceuticals and Medical Biotechnology; Transport

URL http://mmmhub.ac.uk/mmm-themes
 
Description NVIDIA 
Organisation NVIDIA
Country Global 
Sector Private 
PI Contribution Developing a GPU port of the CASTEP code
Collaborator Contribution 2 GPU-based compute cards, compilers, access to NVIDIA's development cluster and expert engineering advice
Impact On-going work
Start Year 2018
 
Title Atomistic simulation Python API 
Description A generic API that provides an abstract interface to atomistic simulation programs, so that callers do not need to know the implementation details of the simulation programs themselves. This enables high-level software frameworks and workflows to switch easily between different simulation packages (for example, to compute forces); similarly, developers of simulation software can implement this API in their programs, making their software immediately usable from the high-level tools.
Type Of Technology Software 
Year Produced 2021 
Open Source License? Yes  
Impact An initial implementation of this API has been developed in CASTEP, which has already been used to develop advanced transition-state search methods. 
URL https://bitbucket.org/byornski/dft-python-api/
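
The idea of an abstract interface described above can be illustrated with a minimal Python sketch. This is not the API in the linked repository: the class `AtomisticCalculator`, the method `energy_and_forces`, the toy `HarmonicWell` backend and the `relax` driver are all hypothetical names introduced here for illustration. The point is only that a high-level workflow written against the interface can use any backend that implements it.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence, Tuple

Vector = Tuple[float, float, float]


class AtomisticCalculator(ABC):
    """Hypothetical abstract interface (not the API of the linked project)."""

    @abstractmethod
    def energy_and_forces(
        self, positions: Sequence[Vector]
    ) -> Tuple[float, List[Vector]]:
        """Return the total energy and the force on each atom."""


class HarmonicWell(AtomisticCalculator):
    """Toy backend: every atom sits in a harmonic well centred on the origin."""

    def __init__(self, k: float = 1.0) -> None:
        self.k = k

    def energy_and_forces(self, positions):
        energy = 0.5 * self.k * sum(x * x + y * y + z * z for x, y, z in positions)
        forces = [(-self.k * x, -self.k * y, -self.k * z) for x, y, z in positions]
        return energy, forces


def relax(calc: AtomisticCalculator, positions, step=0.1, fmax=1e-6, max_iter=1000):
    """Steepest-descent relaxation that only uses the abstract interface."""
    pos = [tuple(p) for p in positions]
    for _ in range(max_iter):
        _, forces = calc.energy_and_forces(pos)
        # Stop once the largest force component is below the tolerance.
        if max(abs(c) for f in forces for c in f) < fmax:
            break
        # Move each atom a small step along the force acting on it.
        pos = [
            tuple(p_i + step * f_i for p_i, f_i in zip(p, f))
            for p, f in zip(pos, forces)
        ]
    return pos, calc.energy_and_forces(pos)[0]


final_positions, final_energy = relax(
    HarmonicWell(k=1.0), [(1.0, 0.0, 0.0), (0.0, -2.0, 0.5)]
)
print(final_positions, final_energy)
```

Because `relax` never refers to `HarmonicWell` directly, swapping in a different backend (for example, a wrapper around a real simulation package) would require no change to the workflow code, which is the property the description above attributes to the API.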
 
Title CASTEP 22 
Description CASTEP is a software package for predictive, quantum-mechanical simulations of materials and chemicals. It is based on density functional theory, and can simulate a wide range of materials properties including energetics, the structure at the atomic level, vibrational properties and electronic response properties. In particular, it has a wide range of spectroscopic features that link directly to experiment, such as infra-red and Raman spectroscopies, NMR, and core level spectra. CASTEP version 22 included a top-level Python layer to enable CASTEP to be embedded within other computational workflows, for example transition-state searches or multiscale modelling.
Type Of Technology Software 
Year Produced 2021 
Impact CASTEP is used by around 1000 companies and research groups around the world. The key papers describing CASTEP are cited over 1000 times per year and CASTEP is cited in support of over 250 patents. CASTEP is available under a free-of-charge licence to academia worldwide, and marketed commercially worldwide by Dassault Systemes.
URL http://www.castep.org
 
Title Py-ChemShell third beta release (v21.0) 
Description Py-ChemShell is the Python-based version of the ChemShell multiscale computational chemistry environment, a leading package for combined quantum mechanical/molecular mechanical simulations.
Type Of Technology Software 
Year Produced 2021 
Open Source License? Yes  
Impact The third beta release of Py-ChemShell was the first release to support automated import of biomolecular forcefields (CHARMM and AMBER) for QM/MM calculations, and features a new integrated workflow for setup of biomolecular systems. This is a major milestone for users in the biomolecular modelling community to transition from the original Tcl-based version of the software. It also features periodic QM/MM embedding for surface-adsorbate systems developed under the "SAINT" project. 
URL https://www.chemshell.org
 
Description CCP5 Summer School 2021 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Introduction to Modern Fortran: a 2-day event with 6 lectures and 6 practicals, attended by 40 students.
Year(s) Of Engagement Activity 2021
URL http://summer2021.ccp5.ac.uk
 
Description DDWG meeting with Intel 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Brought the development activities of the UK's Materials and Molecular Modelling academic community and Intel closer together.

Discussions followed a talk by Intel on OneAPI: A Unified, Standards-Based Programming Model

Abstract: Modern workload diversity necessitates the need for architectural diversity; no single architecture is best for every workload. XPUs, including CPUs, GPUs, FPGAs, and other accelerators, are required to extract high performance. Intel® oneAPI products will deliver the tools needed to deploy applications and solutions across these architectures. Its set of complementary toolkits (a base kit and specialty add-ons) simplify programming and help developers improve efficiency and innovation. The core Intel® oneAPI DPC++/C++ Compiler and libraries implement the oneAPI industry specifications available at oneapi.com.
Year(s) Of Engagement Activity 2020
URL https://mmmhub.ac.uk/2021/01/11/intel-oneapi-a-unified-standards-based-programming-model/
 
Description Invited talk at 3rd EMMC (European Materials Modelling Council) International Workshop 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact The non-profit Association, EMMC ASBL, was created in 2019 to ensure continuity, growth and sustainability of EMMC activities for all stakeholders including modellers, materials data scientists, software owners, translators and manufacturers in Europe.
The EMMC considers the integration of materials modelling and digitalisation critical for more agile and sustainable product development.
Year(s) Of Engagement Activity 2021
URL https://emmc.eu/
 
Description Monthly meetings of the wider MMM DDWG Project Partners 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Monthly meetings to establish the future strategy for exascale requirements in the UK in the area of materials and molecular modelling research.
Year(s) Of Engagement Activity 2020
URL https://mmmhub.ac.uk/excalibur-project-partners/
 
Description Presentations on Exascale Challenges at the Materials and Molecular Modelling Exascale Design and Development Working Group Kick-off Workshop, May 2020 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact CoSeC representatives Ian Bush, Thomas Keal, and Alin Elena gave talks on Exascale Challenges at the MMM DDWG kick-off workshop: "Large Single Calculations - The Scaling Out Challenge" (IB), "Complex Workflows Challenge" (TK) and "I/O Exascale Challenge" (AE). This helped shape the discussions of the working group and scope out the work for the year ahead.
Year(s) Of Engagement Activity 2020
URL http://mmmhub.ac.uk/about-excalibur/
 
Description Webform Survey to UK's MMM community 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Outreach questionnaire activity to the UK Materials and Molecular Modelling community. The invitation was sent out to members of MCC, UKCP, CCP5, CCP9, TYC and the MMM (the estimated number engaging in this activity is based on the number of responses received). We wanted to know what type of simulation a member of this community would run if they had the current national high performance computer to themselves (or a resource ten times the size of the UK's tier-1 HPC facility), any technical barriers that currently prevent running such large-scale simulations, and what impact would be achieved if these simulations were successfully completed. The data collected helped steer the project (the focus of our effort with respect to ensuring the MMM community is ready to exploit exascale HPC if and when such facilities become available).
Year(s) Of Engagement Activity 2020
 
Description Weekly steering committee meetings 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Weekly hour-long meetings of the steering committee (Design & Development Working Group) to plan strategic activity of the Exascale grant
Year(s) Of Engagement Activity 2020
URL https://mmmhub.ac.uk/excalibur-ddwg/