Targeted Design of Small Molecules using Advanced Machine Learning Approaches

Lead Research Organisation: University of Oxford
Department Name: Medical Sciences DTC

Abstract

De novo therapeutic design aims to generate new molecules, or enhance existing ones, with desirable properties. Traditionally, this process is carried out by medicinal chemists, who use their knowledge of a given target's structure to design a molecule with high binding affinity for the target, selectivity over off-targets and low synthetic cost, among many other requirements. However, these properties often compete directly, making the generation of novel drugs a costly and time-intensive process. A recent study of research and development processes placed the median cost of developing a drug at 985 million USD, and found the average time required before clinical trials could begin to be just over 8 years.
Machine learning (ML) algorithms, particularly deep neural networks, present a promising alternative to traditional molecule design techniques and could reduce the cost and time demands of drug development. To apply machine learning to molecule design, molecular data must first be encoded into a machine-readable format. A range of approaches have been used for this. Some rely on low-dimensional representations of molecules, which reduce computational demand and allow the use of natural language processing models. More recently, however, several studies have shown success with a structure-based approach, which incorporates 3D information about the target or about known actives (molecules known to bind to the target) to design candidate molecules whose structures complement the binding site in question.
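As an illustration of the low-dimensional, text-like encoding mentioned above, the following minimal sketch tokenises molecules written as SMILES strings and maps each token to an integer, which is the form a language model would consume. The choice of SMILES is an assumption (the abstract does not name a specific representation), the token pattern covers only common SMILES symbols, and the function names `tokenise` and `encode` are hypothetical.

```python
import re

# Simplified SMILES token pattern (an assumption, not a full grammar):
# bracket atoms, two-letter halogens, two-digit ring closures,
# single-letter atoms, ring-closure digits, then bond/branch symbols.
_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[=#\-+()/\\.@:~*$]")


def tokenise(smiles):
    """Split a SMILES string into a list of tokens."""
    tokens = _TOKEN.findall(smiles)
    # Reject strings containing characters the pattern does not cover.
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognised characters in {smiles!r}")
    return tokens


def encode(smiles_list):
    """Build a vocabulary and integer-encode each SMILES string."""
    vocab = {}
    encoded = []
    for smiles in smiles_list:
        # Each unseen token receives the next free integer id.
        ids = [vocab.setdefault(tok, len(vocab)) for tok in tokenise(smiles)]
        encoded.append(ids)
    return vocab, encoded


if __name__ == "__main__":
    vocab, encoded = encode(["CCO", "CC(=O)Oc1ccccc1C(=O)O"])
    print(vocab)
    print(encoded)
```

Treating two-letter atoms such as `Cl` and bracket atoms such as `[NH3+]` as single tokens keeps chemically meaningful units together, which is a common design choice for string-based molecular models.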
This project, which is partially funded by IBM Research, falls within the EPSRC artificial intelligence and robotics research area. Its main objective is to contribute to the growing body of research on computer-aided drug design. This contribution could take many forms, but the initial aim is to further develop an existing deep generative model that incorporates 3D structural information about the chosen target to produce candidate drug molecules. Structure-based virtual screening methods will then be used to assess the quality of the generated candidates, and hence of the model itself.
Practically speaking, I will initially use a model recently published by IBM Research. In this model, the target's structural information is encoded by first representing its 3D structure as voxels (units of graphic information that define a point in three-dimensional space) of secondary structure element (SSE) density. This approach ensures that the structural information of the protein is preserved in a scale-free manner.
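To make the voxel encoding concrete, here is a minimal sketch that bins residue positions into a coarse 3D grid, with a separate count for each secondary structure class. It is not the published IBM Research model. The three SSE labels, the function name `voxelise_sse`, the sparse dictionary output and the rescaling of coordinates to the structure's bounding box are all assumptions made for illustration. Rescaling is only one simple way to make the grid shape independent of protein size.

```python
from collections import Counter

# Hypothetical SSE classes; a real pipeline might use a finer scheme.
SSE_CHANNELS = ("helix", "sheet", "coil")


def voxelise_sse(residues, grid_size=4):
    """Count residues per voxel, separately for each SSE class.

    residues: iterable of (x, y, z, sse) tuples, with sse in SSE_CHANNELS.
    Returns a Counter mapping (sse, i, j, k) -> residue count (sparse grid).
    """
    residues = list(residues)
    grid = Counter()
    if not residues:
        return grid
    # Bounding box of the structure along each axis.
    mins = [min(r[axis] for r in residues) for axis in range(3)]
    maxs = [max(r[axis] for r in residues) for axis in range(3)]
    for x, y, z, sse in residues:
        if sse not in SSE_CHANNELS:
            raise ValueError(f"unknown SSE label: {sse!r}")
        index = []
        for value, lo, hi in zip((x, y, z), mins, maxs):
            span = hi - lo
            frac = (value - lo) / span if span > 0 else 0.0
            # Clamp so the maximum coordinate falls in the last voxel.
            index.append(min(int(frac * grid_size), grid_size - 1))
        grid[(sse, *index)] += 1
    return grid


if __name__ == "__main__":
    toy = [(0.0, 0.0, 0.0, "helix"), (1.0, 1.0, 1.0, "helix"),
           (10.0, 10.0, 10.0, "sheet")]
    for key, count in sorted(voxelise_sse(toy, grid_size=2).items()):
        print(key, count)
```

Dividing each count by the total number of residues would turn the counts into a density, and a dense array could replace the sparse dictionary if the grid is passed to a convolutional network.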

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/W522211/1                                   01/10/2021  30/09/2027
2599699            Studentship   EP/W522211/1  01/10/2021  30/09/2025  Lucy Vost