Towards Transferable Machine Learning Interatomic Potentials for Reactive Organic Chemistry in Solution

Lead Research Organisation: University of Cambridge
Department Name: Chemistry

Abstract

Reactions in the liquid phase are central to research and industry involving synthetic organic chemistry. Solvents influence reaction mechanisms and rates in complicated ways involving non-covalent interactions and proton transfers, and these solvent effects are not yet completely understood. Computer simulation can aid the understanding of solvent effects through free energy calculations and treatment of the full system at a high level of theory. This is commonly done using ab initio molecular dynamics (AIMD), which has a limited scope of application due to its high computational cost. Machine learning interatomic potentials (MLIPs) are becoming a common tool for accurate condensed-phase simulation, extending the time and length scales accessible to simulation without significant loss of accuracy compared with AIMD. The key challenge in developing an MLIP is the efficient sampling of the relevant regions of the potential energy surface (PES). There have been attempts at building a general ML potential, as well as attempts at creating MLIPs for a specific condensed-phase reaction. The high flexibility of MLIPs opens the door to force fields (FFs) that accurately represent a large fraction of the chemistry spanned by a selection of chemical elements. However, there has not yet been an attempt to build a general reactive MLIP for studying organic reactions in solution. We will develop a strategy to efficiently sample configurations that represent the PES of a chosen solution-phase reactive system. We will then generalise this approach to build a general-purpose MLIP that achieves reasonable accuracy across different reactant-solvent systems and does not need to be re-trained for each new application. Finally, we hope to use our models to carry out calculations on reactive systems relevant to modern organic chemistry research.

Planned Impact

Who might benefit from this research? How might they benefit from this research?

Students
(a) The major beneficiaries of the CDT will, of course, be the students who train on the program. They will be equipped with a set of skills that will be highly desirable in the organic molecule making industries. Although the proposal is directed towards a need in the pharmaceutical industry, the training and research skills are fully transferable to the agrochemical sector (an almost seamless transition, as its needs are nearly identical to those of pharma), and also to the fine chemicals industries and the CROs that serve all of these industries. With some adaptation of the skills accrued, the students will also be able to apply their knowledge to problems in the materials industries, such as polymers and organic electronics, and in chemical biology.

(b) Synthesis will also be evolving in academia, and students equipped with skills in digital molecular technologies will be at a significant advantage in being able to implement the skills acquired while training on the CDT. These students could be the rising stars of academia in 10 years' time.

(c) The non-research-based training will benefit the students by providing a set of transferable skills that will see them thrive in any chosen career.

(d) The industry contacts that will be generated from the variety of interactions planned in the CDT will give students both experience of and insight into the workings of the industrial sector, helping them to gain a different training experience (from industry-taught courses) and hands-on experience in industrial laboratories.

(e) All students in UCAM will be able to benefit in some way from the CDT. Training courses will not be restricted to CDT students (only courses that require payment will be CDT only, and even then, we will endeavour to make additional places available for non-CDT students). The overall standard of training for all students will be raised by a CDT, meaning that the benefit will be realised across the students of the associated departments. In addition, non-CDT students can also be inspired by the research of the CDT and can bring new techniques into their own groups.

Academic researchers in related fields (PIs)
(a) new research knowledge that results from this program will benefit PIs in UCAM and across the academic community. All research will be pre-competitive, with any commercial interests managed by Cambridge Enterprise.

(b) a change in the mindset of how synthetic research is carried out

(c) new collaborations will be generated within UCAM, but also externally on a national and international level

(d) better, more closely aligned, interactions with industry as a result of knowledge transfer

(e) access to outstanding students

Broader public
(a) in principle, more potential medicines could be made available by the research of this CDT.

Economy
(a) a new highly skilled workforce literate in disciplines essential to industry needs will be available
(b) higher productivity in industry, faster access to new medicines
(c) spin-out opportunities will create jobs and will stimulate the economy
(d) automation will not remove the need for skilled people; it will free researchers to think about the problems we don't yet understand, so that solutions can be discovered faster

Publications

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S024220/1       -             -             01/06/2019  30/11/2027  -
2751535            Studentship   EP/S024220/1  01/10/2022  30/09/2026  Domantas Kuryla