AI assisted genomic profiling for the personalisation of treatment and control of infections

Lead Research Organisation: London School of Hygiene & Tropical Medicine
Department Name: Infectious and Tropical Diseases

Abstract

Cost-effective and rapid whole-genome sequencing (WGS) technologies are now being rolled out in clinical settings to prevent disease, and to diagnose patients and personalise their treatment. WGS has become a routine, fast and affordable diagnostic tool in infectious disease settings, revolutionising clinical decision making, public health surveillance and infection control. This utility was demonstrated during the COVID-19 pandemic, where rapid WGS of SARS-CoV-2 genomes assisted the detection of clinically important variants (e.g., Omicron), informed the understanding of transmission dynamics, and aided vaccine development. More generally, analysis of WGS data can rapidly infer "virulent" pathogen strain-types, predict drug or antimicrobial resistance (AMR), and identify outbreaks. To assist this WGS-based analysis, molecular barcodes that profile pathogens for AMR, geographical source (e.g., for identification of importation events) and transmissibility can be derived and linked to fast informatics software tools. However, with increasing WGS use in clinical settings, there is a need for AI methods to mine the resulting big data to update barcodes and infer transmission dynamics in (near) real time. Our WGS profiling work in malaria and tuberculosis has established informative barcoding mutations and developed world-leading informatics platforms (e.g., TB-Profiler) that have been applied globally (>100k tuberculosis bacteria with WGS, profiled across >35 countries). We have also applied AI methods (e.g., neural networks) to detect known genes and identify novel genes linked to AMR, improving knowledge of the underlying resistance mechanisms and, in turn, the barcodes. Here, we will integrate AI-based AMR mutation and transmission discovery tools into our informatics profiling software, making it dynamic and potentially improving clinical and infection control decision making.
Working within established collaborations involving the UK Health Security Agency (UKHSA) and health ministries in Asia (Philippines, Thailand, Vietnam), which routinely use WGS-based diagnostics, we will implement the resulting AI-informatics platforms in the UK and in settings where infectious diseases are endemic, with the potential to extend them to other infections, leading to associated health and economic benefits. Further, all WGS data generated, and all AI and informatics software developed, will be put in the public domain, leading to positive impacts in other biomedical research and healthcare areas.

Technical Summary

Infections and infectious diseases, including malaria and tuberculosis (TB), impose a high public health burden. Whole-genome sequencing (WGS) can identify mutations in pathogen genomes that lead to drug or antimicrobial resistance (AMR), pinpoint their geographical source, and define strain-types linked to virulence. We have developed informatics solutions that analyse large genomic variation databases and produce lists of mutations linked to AMR, strain-types, and location, enabling the profiling of Mycobacterium tuberculosis (TB-Profiler) and Plasmodium spp. (Malaria-Profiler) from WGS data. However, this approach relies on a static database of mutations, which requires laborious manual curation to keep up with newly emerging mutations. To address this issue, we have developed several AI models to predict and reveal new mutations underlying AMR (e.g., Treesist-TB, DeepSweep), as well as neural network-based models to assign geographical origin. However, implementing these models requires significant technical knowledge, which presents a barrier to their use and evaluation in a clinical setting. Here, we propose to integrate the AI models with the informatics profiling system, including through the automatic re-training and evaluation of models using our large and growing WGS databases (>10k malaria, >50k TB) and newly generated data from collaborating sites across four countries (e.g., the UK Health Security Agency malaria reference laboratory). All AI-enabled profiling tools will be made available through both command-line and web-based platforms, with reports customised for local use, and WGS data and software will be put in the public domain. The tools can be extended for use in other types of infections (e.g., nosocomial). Ultimately, the resulting AI-enabled informatics systems will assist with the personalisation of treatments within clinical management and aid surveillance activities by identifying outbreaks and transmission hotspots, thereby leading to both health and economic benefits across populations.

Publications
