Data-Driven Lead Optimisation for Drug Discovery

Lead Research Organisation: University of Sheffield
Department Name: Information School

Abstract

Research questions:
This research provides a lead optimisation tool that allows chemists to explore the chemical space that has already been covered for a given target. The tool should then be able to suggest molecules, or areas of chemical space, to explore next, together with the reasoning behind each suggestion. This should give the chemist a greater understanding of why a particular molecule has been selected for a given target. The main molecular representation used in the tool is the reduced graph, a graph representation of a molecule that has been reduced to its key interacting nodes. The visualisation tool should be an interactive interface that gives chemists an overall view of the chemical space whilst still allowing them to access the detailed information for each molecule.

Original methodology:
The work so far has reduced the chemical structures to reduced graphs. The reduced graphs are customisable: the user can set different parameters and definitions depending on what they are looking for and which features they consider key. The next step was to find the maximum common substructure (MCS), in both its connected and disconnected forms, using the Python library RDKit. The MCSs are needed for clustering, because the clustering techniques rely on similarity or dissimilarity scores, which are calculated by using the MCS in the Tanimoto coefficient equation. Several clustering techniques are then applied, and cluster validity techniques are used to establish the best clusters for the dataset. A core reduced graph of these clusters is found to aid the visualisation. A visualisation is then produced that summarises this information, that is, the chemical space that has been worked in. From here, a method needs to be established for converting a reduced graph back into a chemical graph. This chemical graph will be run through an activity model to predict how well the molecule should perform; if the predicted activity is adequate, the molecule will be put forward as a suggestion. The visualisation will then be used to support the suggestion.
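
The similarity step described above can be illustrated with a minimal sketch. It assumes the commonly used MCS-based form of the Tanimoto coefficient, |MCS| / (|A| + |B| - |MCS|), where the sizes are counts of nodes (or nodes plus edges) in each reduced graph; the project record does not state which form or which counts were used, so this is an assumption. The function names `mcs_tanimoto` and `distance_matrix` are hypothetical, and the MCS sizes are passed in as plain integers rather than computed (in practice they might come from an MCS search such as RDKit's `rdFMCS.FindMCS`).

```python
from itertools import combinations


def mcs_tanimoto(size_a: int, size_b: int, size_mcs: int) -> float:
    """Tanimoto similarity from two graph sizes and the size of their MCS.

    Assumed form: |MCS| / (|A| + |B| - |MCS|).
    """
    if size_mcs < 0 or size_mcs > min(size_a, size_b):
        raise ValueError("MCS size must lie between 0 and the smaller graph size")
    union = size_a + size_b - size_mcs
    # Two empty graphs are treated as identical to avoid division by zero.
    return size_mcs / union if union else 1.0


def distance_matrix(sizes, mcs_sizes):
    """Build a symmetric dissimilarity matrix (1 - similarity) for clustering.

    sizes:     list of graph sizes, one per reduced graph (illustrative values).
    mcs_sizes: dict mapping index pairs (i, j) with i < j to the MCS size.
    """
    n = len(sizes)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = 1.0 - mcs_tanimoto(sizes[i], sizes[j], mcs_sizes[(i, j)])
        dist[i][j] = dist[j][i] = d
    return dist
```

For example, two reduced graphs of sizes 5 and 4 that share an MCS of size 3 have a similarity of 3 / (5 + 4 - 3) = 0.5. The resulting dissimilarity matrix is the kind of input that distance-based clustering methods typically accept.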

Publications

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/N509735/1                                   01/10/2016  30/09/2021
1960258            Studentship   EP/N509735/1  01/10/2017  30/09/2021  Jessica Stacey
 
Description Lecture at the Spring 2021 American Chemical Society meeting
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Gave a lecture on 'Using Reduced Graphs to Cultivate Lead Optimisation Series' to an audience from industry and academia. The new visualisation created during the PhD was presented alongside two new scores, developed to measure the extent of exploration and exploitation of chemical space within a lead optimisation series and the impact a new molecule could have. Audience members asked about publication of the method and about possible access to the source code.
Year(s) Of Engagement Activity 2021
URL https://www.acscinf.org/meetingsevents/meeting-archive/acs-spring-meeting-2021
 
Description Lecture at the 16th German Conference on Chemoinformatics 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Gave a lecture on 'Using Reduced Graphs to Aid Lead Optimisation Projects' to an audience from industry and academia. A new visualisation was presented, which prompted questions and discussion about how the method could help chemists and about potential improvements. Because the audience could see the method's potential, there were enquiries about obtaining the source code. The hope is therefore to publish the method soon.
Year(s) Of Engagement Activity 2020
URL https://veranstaltungen.gdch.de/tms/frontend/index.cfm?l=10731&sp_id=1&selSiteID=sciprog_v2
 
Description Poster Presentation at Women in Chemistry Conference 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Presented a poster on 'Enhancing the Lead Optimisation Process Using Data Mining Approaches and Reduced Graphs' to an audience of chemists, not all of whom work in the same research area. This extended the reach of the work undertaken in this PhD to a wider audience.
Year(s) Of Engagement Activity 2021