CHEMIFY: A System to Produce Universal Digital Chemical Synthesis

Lead Research Organisation: University of Glasgow
Department Name: School of Chemistry


The aim of this proposal is to establish a standard digital code for the synthesis of molecules. Just as Spotify allows the distribution of music in an mp3 (or similar) digital format, a chemical code for synthesis will allow users to share the code produced by the 'Chemify' digitisation process. The code will be demonstrated both manually and on basic robotic systems available in our laboratory (GU) and with our international collaborators based in the USA (MB), Canada (AAG), Germany (PS), and Poland (BG), who are experts in modular organic scaffold synthesis (MB), computational chemistry and statistics for experimental design (AAG), robotic carbohydrate synthesis (PS), and networks and rules of chemical synthesis (BG). In the long term, the ability to automate the synthesis of molecules will lower the cost of manufacture by enabling the automatic and unbiased exploration of chemical space, with each synthesis captured as a digital code. Such codes are needed if chemists are to develop systems that ensure reproducibility and that support the exploration of new reactions and the statistics-driven design of experiments to target unknown molecules. Recently we took a key step towards encoding a multi-step synthesis into a digital blueprint [1], but the vision of going from code to molecules remains a gigantic problem. In this proposal, we aim to develop a chemical ontology for synthetic chemistry that will lead to the first version of a programming language for chemical synthesis. We will then demonstrate that the code can be used to synthesise important molecules, both those already robotically synthesised by us and examples from our collaborators in the USA, Germany, Canada and Poland, on the same universal 'chemputer' synthesis robot.

Planned Impact

The aim of this proposal is to standardise digital code for the synthesis of molecules. This centre-to-centre research will allow chemical synthesis to be carried out more quickly and in a distributed way, thanks to the invention of the universal chemical code. The developed code will be like Spotify for chemistry: users will be able to share the code produced by the 'Chemify' digitisation process. The code will then be used to synthesise important molecules, both those already robotically synthesised by us and examples from our collaborators in the USA, Germany, Canada and Poland, on the same universal synthesis robot.

Developing a chemical ontology is the first step towards the digitization of chemistry, which has the potential to transform our economy. To take one example, the UK pharmaceutical market is set to grow from $32B in 2016 to $43B by 2020, with around 5,000 UK companies in the area. In this proposal, we will interact with key companies (Deep Matter and GSK) to ensure that the fundamental development of the code-to-molecule approach is tied to the practical challenges faced by the chemical industries in our current economy.

There is a broad realisation that digitization must impact chemistry, but until now the avenue to achieve this has been poorly defined. This centre-to-centre research collaboration will change the way the discovery, development and manufacture of chemical products are performed in the future, and will fundamentally change the way chemistry is done in the laboratory. Today chemistry suffers from issues regarding reproducibility, safety, environmental impact, the cost of discovery, translation to manufacture, and the timescales for customisation. This project will define a new digital standard for chemical synthesis, discovery, and collaboration. It directly addresses several UK strategic priorities under the Government's 8 Great Technologies, including big data, robotics and autonomous systems, and advanced materials, and also aligns with all the EPSRC priorities.

A key impact of the grant will be the training of a new generation of scientists who work naturally within a heavily collaborative environment, developing skills that are essential for working in a strong team with a broad science base. The current programme builds this into the structure of the Management Plan: given the diverse nature of the research and the need for strong interaction across the work packages, each team member will receive training in, and exposure to, a range of research being carried out within the project. This will enhance training, flexibility and options for redeployment, and will also produce well-rounded researchers ready to make future impacts. Participants will develop skills across several disciplines (chemistry, computing science, physics, engineering and mathematics), combining to make this a strong project for enabling the digital chemistry vision. The PDRAs will be trained to think across these disciplinary boundaries and be exposed to KE activities, hence providing highly skilled scientists to work in the emerging digital chemistry sector. The science in this project will offer an excellent opportunity for engaging the general public across all age groups.
Description We have developed a standard digital code for the synthesis of molecules called XDL (Chi Dee El). This simple abstraction allows those with no coding experience to program the synthesis of molecules using steps such as Add Reagent, Stir, Heat and Filter. We have made the base code on which this is built open source on GitLab and have created a website where others can create XDL files for their synthesis procedures or translate literature procedures into XDL. This work was recently published in Science. In addition, a graph module links the hardware modules to the code, so that one can determine which modules a synthesis requires; the XDL script will flag if additional modules are needed to complete the synthesis.

We have demonstrated this XDL code can be utilised on our robotic systems. Utilisation on other synthesis platforms such as Chemspeed is currently underway with our collaborators but has been hampered by COVID.
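
The idea of abstract steps checked against the available hardware can be illustrated with a minimal Python sketch. This is not the XDL library or its file format: the `Step` class, the `REQUIRED_MODULE` mapping, the module names and the `missing_modules` function are all hypothetical, and a plain set of module names stands in for the hardware graph described above.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One abstract synthesis action with its parameters (hypothetical model)."""
    action: str                        # e.g. "Add", "Stir", "Heat", "Filter"
    params: dict = field(default_factory=dict)


# Assumed mapping from step type to the hardware module it needs.
# The module names are illustrative only.
REQUIRED_MODULE = {
    "Add": "pump",
    "Stir": "stirrer",
    "Heat": "heater",
    "Filter": "filter",
}


def missing_modules(procedure, available):
    """Return modules the procedure needs but the platform lacks, in first-use order."""
    missing = []
    for step in procedure:
        module = REQUIRED_MODULE[step.action]
        if module not in available and module not in missing:
            missing.append(module)
    return missing


# A made-up four-step procedure; reagent names and values are placeholders.
procedure = [
    Step("Add", {"reagent": "reagent A", "volume_ml": 10}),
    Step("Stir", {"time_min": 30}),
    Step("Heat", {"temp_c": 60, "time_min": 120}),
    Step("Filter"),
]

# A platform without a filtration module.
platform = {"pump", "stirrer", "heater"}

print(missing_modules(procedure, platform))  # ['filter']
```

Because every parameter lives in the step list rather than in platform-specific control code, the same procedure can in principle be checked against a different platform by passing a different set of available modules.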
Exploitation Route The outcomes will allow all researchers working with automated synthesisers to program their syntheses using a common code which, importantly, will not require coding expertise. These scripts can then be shared, which will allow full reproducibility of these procedures since all of the synthetic parameters are captured in the XDL.
Sectors Chemicals; Manufacturing, including Industrial Biotechnology; Pharmaceuticals and Medical Biotechnology

Description RSC Digital Futures 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact In September 2019 the Royal Society of Chemistry set out to gain a more in-depth understanding of the long-term promise of and concerns about the use of data and digital technologies for scientific discovery by inviting 14 experts from different scientific fields and sectors to its first Strategic Advisory Forum. This forum brought together leaders from the natural sciences and digital fields to set out a vision for how digital technologies - from computational chemistry and multiscale modelling to machine learning and robotics - will enable and accelerate scientific discovery and solutions to global challenges.

I provided advice and insight on the topic to this forum.
Year(s) Of Engagement Activity 2019