Machine Learning and Molecular Modelling: A Synergistic Approach to Rapid Reactivity Prediction

Lead Research Organisation: University of Bath
Department Name: Chemistry

Abstract

The computational design of new chemical reactions is regarded as one of the "Holy Grails" of computational organic chemistry and biochemistry. Accurate and fast computational approaches to predicting chemical reactivity would provide cost-effective alternatives to time-consuming experimental approaches, and in some cases animal testing, in drug design, toxicology and chemical synthesis. Mechanism-based prediction models are of great importance because they are much more likely to reach general acceptance than computational "black-box" models, which offer no insight into how and why predictions are made. Providing such insight is especially important for models to gain regulatory acceptance in toxicology and drug design. However, no current computational approach to reactivity prediction, whether molecular modelling or machine learning (ML), offers the combination of fast, accurate predictions with clear mechanistic insight; one or more of these desirable characteristics must be sacrificed in pursuit of the others. This project will develop a novel, synergistic molecular modelling and ML approach to rapid, high-accuracy and mechanism-based reactivity prediction for use in toxicology, drug design and chemical synthesis, and thus help realise the "Holy Grail".

We will train and validate ML models on large datasets (~10,000 compounds). These models will correct energy barriers obtained from rapid molecular modelling techniques to match those derived from prohibitively slow, high-accuracy methods. Our synergistic approach to reaction modelling will thus be to derive mechanistic insight from the rapid molecular modelling techniques and to use our ML models to obtain fast and accurate reaction barriers. Models for C-N bond-forming reactions will be developed for use in covalent drug design (targeting lysine), toxicology (predicting mutagenicity and respiratory sensitisation) and pharmaceutical drug synthesis planning. To demonstrate the broad utility of our synergistic approach, we will use it to rationalise experimental reactivity data of biologically and synthetically relevant systems for which the use of current modelling approaches would be prohibitively slow. Rather than requiring a supercomputer, predictions will be possible even on a laptop, which will represent a paradigm shift in reaction modelling.
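The correction idea described above can be illustrated with a minimal sketch. This is not the project's actual model: it assumes a simple ridge regression, a hypothetical three-column feature matrix, and synthetic barrier data generated purely for illustration. It only shows the general pattern of learning the difference between a cheap barrier and an accurate one, then adding that learned difference back to new cheap barriers.

```python
import numpy as np


def fit_delta_model(features, cheap_barriers, accurate_barriers, ridge=1e-3):
    """Fit a ridge-regression correction so that accurate ~ cheap + f(features)."""
    X = np.column_stack([np.ones(len(features)), features])  # intercept + descriptors
    delta = accurate_barriers - cheap_barriers               # learning target
    penalty = ridge * np.eye(X.shape[1])
    penalty[0, 0] = 0.0                                      # leave the intercept unpenalised
    return np.linalg.solve(X.T @ X + penalty, X.T @ delta)


def predict_barrier(weights, features, cheap_barriers):
    """Add the learned correction to cheap barriers."""
    X = np.column_stack([np.ones(len(features)), features])
    return cheap_barriers + X @ weights


def run_synthetic_demo(seed=0):
    """Return (uncorrected MAE, corrected MAE) on held-out synthetic data."""
    rng = np.random.default_rng(seed)
    features = rng.normal(size=(200, 3))                     # hypothetical descriptors
    accurate = 20.0 + 2.0 * features[:, 0]                   # made-up "high-accuracy" barriers
    systematic_error = 3.0 + 1.5 * features[:, 1]            # made-up error of the cheap method
    cheap = accurate - systematic_error + rng.normal(scale=0.2, size=200)

    train, test = slice(0, 150), slice(150, 200)
    weights = fit_delta_model(features[train], cheap[train], accurate[train])
    corrected = predict_barrier(weights, features[test], cheap[test])

    raw_mae = np.mean(np.abs(cheap[test] - accurate[test]))
    corrected_mae = np.mean(np.abs(corrected - accurate[test]))
    return raw_mae, corrected_mae


if __name__ == "__main__":
    raw, corrected = run_synthetic_demo()
    print(f"MAE before correction: {raw:.2f} kcal/mol")
    print(f"MAE after correction:  {corrected:.2f} kcal/mol")
```

Because the model learns only the difference between the two levels of theory, the cheap calculation still supplies most of the physics, which is one reason such corrections can often be learned from comparatively little expensive data.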
 
Description: This award focused on the development of reaction modelling methodology that is fast, accurate, and provides mechanistic insight from transition state (TS) geometries. Such computational tools enable the rational design of new reactions, thus helping to reduce the need for trial-and-error experimentation in chemical synthesis and drug discovery.

We have reported a hybrid machine learning (ML)-modelling approach which meets all of the above criteria for reaction barrier prediction and thus serves as a high-throughput method for rapid reactivity design. In this work, we trained ML models that corrected rapid but approximate semi-empirical quantum mechanical (SQM) reaction barriers to DFT-quality barriers for reactions of importance to the pharmaceutical industry. This approach enabled barrier predictions to be made in seconds with errors below the chemical accuracy threshold of 1 kcal mol⁻¹. Rapid mechanistic insight was available from the SQM TSs, which were found to be good approximations of the DFT geometries. We also showed that the amount of expensive data needed to develop these reactivity prediction models can be reduced from thousands to just a few tens of data points, which was particularly important when working with large and complex systems of high synthetic importance.
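To give a sense of what a 1 kcal mol⁻¹ barrier error means in practice, the sketch below converts a barrier error into a rate-constant factor. It assumes the standard Eyring equation of transition state theory with a transmission coefficient of 1; the source does not state which rate expression, if any, was used, so this is only an illustration. Under that assumption, a 1 kcal mol⁻¹ error at room temperature corresponds to roughly a fivefold change in the predicted rate.

```python
import math

R_KCAL = 1.987204e-3      # gas constant, kcal mol^-1 K^-1
K_B = 1.380649e-23        # Boltzmann constant, J K^-1
H_PLANCK = 6.62607015e-34  # Planck constant, J s


def eyring_rate_constant(barrier_kcal, temperature_k=298.15):
    """First-order rate constant (s^-1) from a free-energy barrier, assuming kappa = 1."""
    prefactor = K_B * temperature_k / H_PLANCK
    return prefactor * math.exp(-barrier_kcal / (R_KCAL * temperature_k))


def rate_ratio(barrier_error_kcal, temperature_k=298.15):
    """Factor by which the rate constant changes for a given barrier error."""
    return math.exp(barrier_error_kcal / (R_KCAL * temperature_k))


if __name__ == "__main__":
    print(f"Rate factor for a 1 kcal/mol error: {rate_ratio(1.0):.1f}")
    print(f"Rate factor for a 3 kcal/mol error: {rate_ratio(3.0):.0f}")
```

The exponential dependence is why small barrier errors matter: errors of a few kcal mol⁻¹ change predicted rates by orders of magnitude.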
Exploitation Route: We have published our results open access to reach the broadest possible audience. It is hoped that our methods will be used widely within the academic and industrial chemical research communities to aid in reactivity design and thus reduce trial-and-error experimentation. It is likely that our low-data approaches will be of particular importance because they offer the cheapest route to the routine use of computational methods in the design of experimental work.
Sectors: Chemicals; Digital/Communication/Information Technologies (including Software); Pharmaceuticals and Medical Biotechnology

 
Description: The award focused on developing computational tools to reduce the need for trial-and-error experimentation in chemical synthesis and drug discovery. The work has begun to influence the UK pharmaceutical industry; this is best evidenced by the recruitment of a team member into a job in this industry as a result of the skills and knowledge they developed during the award.
First Year Of Impact: 2024
Sector: Pharmaceuticals and Medical Biotechnology
Impact Types: Economic

 
Description: Interdisciplinary Computer Aided Synthesis (ICAS) Symposium
Form Of Engagement Activity: Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach: National
Primary Audience: Industry/Business
Results and Impact: We organised a two-day symposium for interdisciplinary computer-aided synthesis (ICAS) in September 2023 in Bath, hosting 23 synthetic and computational chemists from academic and industrial institutions across the UK. The aim of this symposium was to bring together chemists from several different disciplines and backgrounds to facilitate discussion, generate ideas for academic-industrial collaboration, and understand different perspectives on computer-aided synthesis. During the symposium, the work associated with the grant was presented and discussed, as well as many other related pieces of work from other academic and industrial institutions. The event was an overall success and received overwhelmingly positive feedback from participants. The insights obtained from the event have allowed us to understand more about the impact of our work, as well as the best and most efficient way to achieve it. It was also very beneficial to understand where other institutions may be working towards common goals and to educate others on what we are trying to achieve and how that could be useful for them. Finally, the event allowed us to expand our scientific network, affording further opportunities to develop our research and collaborate with others in the future.
Year(s) Of Engagement Activity: 2023