Next generation Text Mining in Drug Discovery

Lead Research Organisation: Queen Mary University of London
Department Name: Digital Environment Research Institute

Abstract

Extracting interesting and non-trivial patterns from text documents is the next-generation wave of knowledge discovery in biochemical sciences. Free text resident in biomedical literature contains a wealth of information about small molecules and their targets that is not currently stored in biochemical knowledgebases. This information can be exploited to identify and build specific signatures for drug-gene associations, chemical and biological toxicity and even adverse drug effects.

Recent advances in embedding methods have shown promising results for several biomedical and clinical tasks. Text classification performed on biomedical records poses specific challenges, including dataset imbalance, misspellings, abbreviations, and semantic ambiguity. Current state-of-the-art approaches apply deep learning to the task, mainly convolutional neural networks (CNNs), recurrent neural networks (RNNs), bi-directional long short-term memory (Bi-LSTM) networks, and BERT (Devlin et al., 2019; Wolf et al., 20).

This project will contribute to Exscientia's existing text mining platform by optimising named entity recognition (NER) procedures and applying novel machine learning strategies to generate its own semantic lexicon. The project will have access to expertise across the Discovery and AI technology teams for advice and support.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
BB/X511833/1                                   01/12/2022  30/11/2026
2760490            Studentship   BB/X511833/1  01/12/2022  30/11/2026