Geometric Deep Learning for Binding Affinity Prediction

Lead Research Organisation: University of Oxford
Department Name: Sustain Approach to Biomedical Sci CDT

Abstract

Drug discovery is an incredibly expensive and time-consuming process. Bringing a drug to market costs $1.3 billion on average, a figure that doubles every nine years, and takes ten to fifteen years. The development of machine and deep learning techniques has promised to improve the efficiency of the drug discovery process and arrest this decline in productivity. Structure-based drug discovery, one approach to drug discovery, uses computational methods and the 3D structure of a target protein to identify novel drugs that bind to it. Over the last decade, machine learning has been used to develop accurate scoring functions that predict the binding affinity between a protein and a small molecule drug. These machine learning (ML)-based scoring functions are more accurate than pre-existing methods. However, they have been primarily trained and validated using solved crystal structures of bound protein-ligand complexes. This does not accurately represent a real-world drug discovery scenario, where crystal structures for the bound protein-ligand complexes are not available. Sometimes there may be no crystal structure of the protein at all, and modelled structures must be used. In addition, ML-based scoring functions are typically trained and validated using data from a single source and often fail to generalise to novel data sets. To distinguish predictions that result from a bias in the training data from those that do not, the model's reasoning must be fully understood. Current scoring functions do not provide this information, which reduces the field's trust in them for use in drug discovery. Approaches such as attribution have shown promise in elucidating why a scoring function has made certain predictions. Attribution can be used to check whether predictions are based on the interactions between the protein and the drug rather than on the composition of the drug alone, a common pitfall for scoring functions.
This DPhil aims to develop a deep learning binding affinity model that has been explicitly designed to address the flaws in existing models in the field. Geometric deep learning architectures, such as equivariant graph neural networks, have been applied successfully to this problem and combined with attribution to show that the scoring function makes its predictions by learning biomolecular interactions rather than by relying on bias. These architectures will be built upon to extend their accuracy to modelled structures, not just experimentally determined crystal structures, increasing the reliability of scoring functions in realistic applications. A scoring function that is accurate, reliable, and easily interpretable will be valuable in any drug discovery project and will be made available to all as open-source software. The research will be completed in collaboration with the medical research charity LifeArc and their employees Dr Andy Merritt and Dr Kristian Birchall, who have valuable experience in accelerating healthcare innovation to make breakthroughs for patients. The project falls within the EPSRC AI and Data Science for Engineering, Health, and Government (ASG) research area.
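
As an illustration of the attribution check described above, the following is a minimal sketch and not the project's model. It assumes a hypothetical toy scoring function (toy_score, which counts close protein-ligand atom pairs), a hypothetical biased scorer (ligand_only_score, which looks only at the ligand), and a simple occlusion scheme in which an atom's attribution is the drop in score when that atom is removed. All names and coordinates are invented for the example.

```python
import math

# Hypothetical toy complex: each atom is (x, y, z, is_ligand).
ATOMS = [
    (0.0, 0.0, 0.0, True),      # ligand atom
    (1.5, 0.0, 0.0, True),      # ligand atom
    (0.0, 3.0, 0.0, False),     # protein atom close to the ligand
    (10.0, 10.0, 10.0, False),  # protein atom far from the ligand
]

def toy_score(atoms, cutoff=4.0):
    """Toy interaction-based score: number of protein-ligand atom pairs
    closer than `cutoff` (assumed to be in angstroms)."""
    ligand = [a for a in atoms if a[3]]
    protein = [a for a in atoms if not a[3]]
    return float(sum(
        1 for l in ligand for p in protein
        if math.dist(l[:3], p[:3]) < cutoff
    ))

def ligand_only_score(atoms):
    """Toy biased score: depends only on the number of ligand atoms."""
    return float(sum(1 for a in atoms if a[3]))

def occlusion_attribution(score_fn, atoms):
    """Attribution of atom i = score(all atoms) - score(all atoms except i)."""
    full = score_fn(atoms)
    return [full - score_fn(atoms[:i] + atoms[i + 1:])
            for i in range(len(atoms))]

def protein_share(attributions, atoms):
    """Fraction of the total absolute attribution that falls on protein atoms."""
    total = sum(abs(v) for v in attributions)
    if total == 0:
        return 0.0
    on_protein = sum(abs(v) for v, a in zip(attributions, atoms) if not a[3])
    return on_protein / total

interaction_attr = occlusion_attribution(toy_score, ATOMS)
biased_attr = occlusion_attribution(ligand_only_score, ATOMS)
print(protein_share(interaction_attr, ATOMS))  # share for the interaction-based scorer
print(protein_share(biased_attr, ATOMS))       # share for the ligand-only scorer
```

In this sketch, a scorer that ignores the protein assigns no attribution to protein atoms, which is the kind of signal such a check is meant to expose. Real scoring functions would need an attribution method suited to the model architecture.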

Planned Impact

The main impact of the SABS CDT will be the difference made by the scientists trained within it, both during their DPhils and throughout their future careers.

The impact of the students during their DPhil should be measured by the culture change that the centre engenders in graduate training, in working at the interface between the mathematical/physical sciences and the biomedical sciences, and in cross-sector industry/academia working practices.

Current SABS projects are already changing the mechanisms of industry-academic collaboration, as described by one of our industrial partners:

"UCB and Roche are currently supervising a joint DPhil project and have put in two more joint proposals, which would have not been possible without the connections and the operational freedom offered by SABS-IDC and its open innovation culture, a one-of-the-kind in UK's CDTs."

New collaborations are also being generated: over 25% of current research projects are entirely new partnerships brokered by the Centre. The renewal of SABS will allow it to continue to strengthen and broaden this effect, building new bridges, starting new collaborations, and changing the culture of academic-industrial partnerships. It will also continue to ensure that all of its research is made publicly available through its Open Innovation structure, and help to create other centres with similar aims.

For all of our partners, however, the students themselves are considered to be the ultimate output. As one of our partners describes it:

"I believe the current SABS-IDC has met our original goals of developing young research scientists in a multidisciplinary environment with direct industrial experience and application. As a result, the graduating students have training and research experience that is directly applicable to the needs of modern lifescience R&D, in areas such as pharmaceuticals and biotechnology."

However, it is not only within the industrial realm that students have impact. In the later years of their DPhils, over 40% of SABS students, facilitated by the Centre, have undertaken various forms of public engagement. These include visiting schools, working alongside Zooniverse to develop citizen science projects, and producing educational resources in the area of crystal images. In the new Centre, all students will be required to undertake outreach activities in order to increase engagement with the public.

The impact of the students after they have finished should be measured by how they carry on this novel approach to research, be it in the sector or outside it. As our industrial letters of support make clear, though no SABS students have yet completed their DPhils, there is a clear expectation that they will play a significant role in shaping the UK economy in the future. For example, one of our partners comments about our students:

"UCB has been in constant search for such talents, who would thrive in pharmaceutical research, but they are rare to find in conventional postgraduate programmes. Personally I am interested in recruiting SABS-IDC students to my group once they are ready for the job market."

To demonstrate the type of impact that SABS alumni will have, we consider the impact being made by the alumni of the i-DTC programmes from which this proposal has grown. Examples include two start-up companies, both of which already have investment in the millions. Several students also now hold senior positions in industry and in research facilities and institutes. They have also been named on 30 granted or pending patents, 15 of these arising directly from their DPhil work.

The examples of past success given above indicate the types of impact we expect the graduates from SABS to achieve, and offer clear evidence that SABS students will become future research leaders, driving innovation and changing research culture.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S024093/1                                   01/10/2019  31/03/2028
2597682            Studentship   EP/S024093/1  01/10/2021  30/09/2025  Guy Durant