AI assisted genomic profiling for the personalisation of treatment and control of infections

Lead Research Organisation: London School of Hygiene & Tropical Medicine
Department Name: Infectious and Tropical Diseases

Abstract

Cost-effective and rapid whole-genome sequencing (WGS) technologies are now being rolled out in clinical settings to prevent disease, and to diagnose and personalise the treatment of patients. WGS has become a routine, fast and affordable diagnostic tool in infectious disease settings, revolutionising clinical decision making, public health surveillance and infection control. This utility was demonstrated during the COVID-19 pandemic, where rapid WGS of SARS-CoV-2 genomes assisted the detection of clinically important variants (e.g., Omicron), informed the understanding of transmission dynamics, and aided vaccine development. More generally, analysis of WGS data can rapidly infer pathogen "virulent" strain-types, predict drug or antimicrobial resistance (AMR), and identify outbreaks. To assist this WGS-based analysis, molecular barcodes that profile pathogens for AMR, geographical source (e.g., for identification of importation events) and transmissibility can be derived and linked to fast informatics software tools. However, with increasing WGS use in clinical settings, there is a need for AI methods to mine the resulting big data to update barcodes and infer transmission dynamics in (near) real time. Our WGS profiling work in malaria and tuberculosis has established informative barcoding mutations and developed world-leading informatics platforms (e.g., TB-Profiler) that have been applied globally (>100k tuberculosis bacteria with WGS, profiled across >35 countries). We have also applied AI methods (e.g., neural networks) to detect known genes, and identify novel genes, linked to AMR, thereby improving knowledge of underlying resistance mechanisms and, in turn, the barcodes. Here, we will integrate AI-based AMR mutation and transmission discovery tools into our informatics profiling software, making it dynamic and potentially improving clinical and infection control decision making.
Working within established collaborations involving the UK Health Security Agency (UKHSA) and health ministries in Asia (Philippines, Thailand, Vietnam), which are routinely using WGS-based diagnostics, we will implement the resulting AI-informatics platforms in UK and infectious disease endemic settings, with the potential to extend them to other infections, leading to associated health and economic benefits. Further, all WGS data generated, and all AI and informatics software developed, will be put in the public domain, leading to positive impacts in other biomedical research and healthcare areas.

Technical Summary

Infections and infectious diseases, including malaria and tuberculosis (TB), impose a high public health burden. Whole-genome sequencing (WGS) can identify mutations in pathogen genomes that lead to antimicrobial resistance (AMR), pinpoint their geographical source, and define strain-types linked to virulence. We have developed informatics solutions that analyse large genomic variation databases and produce lists of mutations linked to AMR, strain-types, and location, enabling the profiling of Mycobacterium tuberculosis (TB-Profiler) and Plasmodium spp. (Malaria-Profiler) from WGS data. However, this approach uses a static database of mutations, which requires laborious manual curation to keep up with newly emerging mutations. To address this issue, we have developed several AI models to predict and reveal new mutations underlying AMR (e.g., Treesist-TB, DeepSweep), as well as to assign geographical origin (neural network-based). However, implementing these models requires significant technical knowledge, which presents a barrier to their use and evaluation in a clinical setting. Here, we propose to integrate the AI models into the informatics profiling system, including through the automatic re-training and evaluation of models using our large and growing WGS databases (>10k malaria, >50k TB) and newly generated data from collaborating sites across four countries (e.g., UK Health Security Agency malaria reference laboratory). All AI-enabled profiling tools will be made available through both command-line and web-based platforms, with reports customised for local use, and WGS data and software will be put in the public domain. The tools can be extended for use in other types of infections (e.g., nosocomial). Ultimately, the resulting AI-enabled informatics systems will assist with the personalisation of treatments within clinical management and aid surveillance activities by identifying outbreaks and transmission hotspots, thereby leading to both health and economic benefits across populations.
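
The limitation of a static mutation database described above can be illustrated with a minimal sketch. This is not TB-Profiler's code, database, or file format: the function name, the data layout, and the gene "hypotheticalGene" are assumptions made for illustration. The two database entries are well-documented M. tuberculosis resistance mutations.

```python
# Minimal sketch of profiling against a static mutation database.
# Assumption: variants arrive as (gene, amino_acid_change) pairs; the real
# tools work from WGS data and use their own curated databases.

# Illustrative entries only (two well-documented M. tuberculosis mutations).
RESISTANCE_DB = {
    ("rpoB", "S450L"): "rifampicin",
    ("katG", "S315T"): "isoniazid",
}


def profile_sample(variants):
    """Split a sample's variants into database hits and unmatched variants.

    Returns (resistance, unmatched), where resistance maps each drug to the
    list of matching mutations and unmatched lists variants absent from the
    static database.
    """
    resistance = {}
    unmatched = []
    for gene, change in variants:
        drug = RESISTANCE_DB.get((gene, change))
        if drug is None:
            # A static database cannot flag a mutation it has never seen.
            unmatched.append((gene, change))
        else:
            resistance.setdefault(drug, []).append(f"{gene} {change}")
    return resistance, unmatched


# "hypotheticalGene" / "A1V" is a made-up variant standing in for a newly
# emerged mutation that has not yet been curated into the database.
sample = [("rpoB", "S450L"), ("hypotheticalGene", "A1V")]
resistance, unmatched = profile_sample(sample)
print(resistance)
print(unmatched)
```

In this sketch the uncurated variant is simply reported as unmatched. That is the gap the proposal aims to close by re-training AI models on newly generated data, so that candidate mutations are identified without waiting for manual curation.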

 
Description: Infection-AID: AI assisted genomic profiling to inform the Diagnosis, personalised treatment and control of infections
Amount: £518,745 (GBP)
Funding ID: EP/Y018842/1
Organisation: Engineering and Physical Sciences Research Council (EPSRC)
Sector: Public
Country: United Kingdom
Start: 09/2023
End: 04/2025
 
Title: AI methods and analysis pipelines
Description: We have established AI containers to assist analysis, as well as automated systems for capturing sequence data.
Type Of Material: Improvements to research infrastructure
Year Produced: 2023
Provided To Others?: Yes
Impact: This pipeline and framework can be used to implement and compare AI models.
 
Title: Malaria-Profiler
Description: This informatics tool profiles malaria parasite sequence data to infer species, drug resistance and geographical source. It has been applied to Vietnamese data, and the software is being used in Vietnam.
Type Of Material: Improvements to research infrastructure
Year Produced: 2023
Provided To Others?: Yes
Impact: It strengthens capacity for the analysis of sequence data, and is useful for clinical and surveillance applications.
URL: https://bioinformatics.lshtm.ac.uk/malaria-profiler/
 
Description: With Thailand MOPH
Organisation: Ministry of Public Health
Country: Thailand
Sector: Public
PI Contribution: We contributed AI and bioinformatics pipelines and analysis of the TB sequence data generated, including for the profiling of infections for drug resistance (building on our TB-Profiler tool). We have automated reporting of the algorithms' output in the Thai language.
Collaborator Contribution: Sequence data and implementation of algorithms, with feedback on performance.
Impact: Sequence data, genomic profiling of TB drug resistance, and AI approaches to identify resistance mutations. This work will assist the personalisation of treatment in TB clinics run by the Thailand MOPH, which have invested in genomic platforms.
Start Year: 2022
 
Description: With UKHSA for genome sequencing
Organisation: Public Health England
Country: United Kingdom
Sector: Public
PI Contribution: Sequencing of parasite DNA, and analysis of genomics data
Collaborator Contribution: Contributing Plasmodium DNA from UK travellers
Impact: Plasmodium sequence data, and identification of mutations linked to drug resistance.
Start Year: 2022
 
Description: Genomics and AI training
Form Of Engagement Activity: Participation in an activity, workshop or similar
Part Of Official Scheme?: No
Geographic Reach: National
Primary Audience: Professional Practitioners
Results and Impact: This was an AI/genomics workshop run by LSHTM and hosted by RVC.
Year(s) Of Engagement Activity: 2023