The representation, and effect, of chemical context on the accuracy of machine learning chemical reaction prediction models

Lead Research Organisation: University of Cambridge
Department Name: Chemistry

Abstract

Predicting chemical reaction yields with machine learning models is an active research area. Current state-of-the-art yield prediction models perform well when trained on high-throughput experimentation (HTE) data but perform unsatisfactorily when trained on literature data. These models use reaction representations that convey information only about the reactants and products of a reaction. The effect of including additional reaction information on the accuracy of yield prediction is unknown. This project proposes that including this information in the form of an ontology could improve yield prediction accuracy. Thus far, extraction of reaction and physical property data from Reaxys and the DDB, natural language processing (NLP) of the associated Reaxys text, and the design of a reaction ontology have been completed. The extracted data are used to populate ontologies automatically, with each reaction populating its own ontology to serve as a new reaction representation. The ontology structure is generic enough to accommodate any reaction type and is designed to align with related ontologies describing reactor systems. The remaining tasks in this project are embedding the ontologies to create ML-ready inputs, and training and evaluating a transformer model that uses the ontology embeddings for yield prediction.

People

Michael Zhou (Student)

Publications

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S024220/1       -             -             31/05/2019  30/11/2027  -
2895024            Studentship   EP/S024220/1  30/09/2023  29/09/2027  Michael Zhou