AI assisted genomic profiling for the personalisation of treatment and control of infections
Lead Research Organisation:
London School of Hygiene & Tropical Medicine
Department Name: Infectious and Tropical Diseases
Abstract
Cost-effective and rapid whole-genome sequencing (WGS) technologies are now being rolled out in clinical settings to prevent disease, support diagnosis and personalise patient treatment. WGS has become a routine, fast and affordable diagnostic tool in infectious disease settings, revolutionising clinical decision making, public health surveillance and infection control. This utility was demonstrated during the COVID-19 pandemic, where rapid WGS of SARS-CoV-2 genomes assisted the detection of clinically important variants (e.g., Omicron), informed the understanding of transmission dynamics, and aided vaccine development. More generally, analysis of WGS data can rapidly infer "virulent" pathogen strain-types, predict drug or antimicrobial resistance (AMR), and identify outbreaks. To assist this WGS-based analysis, molecular barcodes that profile pathogens for AMR, geographical source (e.g., for identification of importation events) and transmissibility can be derived and linked to fast informatics software tools. However, with increasing WGS use in clinical settings, there is a need for AI methods to mine the resulting big data to update barcodes and infer transmission dynamics in (near) real time. Our WGS profiling work in malaria and tuberculosis has established informative barcoding mutations and developed world-leading informatics platforms (e.g., TB-Profiler) that have been applied globally (>100k tuberculosis bacteria with WGS, profiled across >35 countries). We have also applied AI methods (e.g., neural networks) to detect known genes and identify novel genes linked to AMR, thereby improving knowledge of the underlying resistance mechanisms and, in turn, the barcodes. Here, we will integrate AI-based AMR mutation and transmission discovery tools into our informatics profiling software, making the software dynamic and potentially improving clinical and infection control decision making.
Working within established collaborations involving the UK Health Security Agency (UKHSA) and health ministries in Asia (Philippines, Thailand, Vietnam), which are routinely using WGS-based diagnostics, we will implement the resulting AI-informatics platforms in the UK and in settings where infectious diseases are endemic, with the potential to extend them to other infections, leading to associated health and economic benefits. Further, all WGS data generated, and all AI and informatics software developed, will be put in the public domain, leading to positive impacts in other areas of biomedical research and healthcare.
Technical Summary
Infections and infectious diseases, including malaria and tuberculosis (TB), impose a high public health burden. Whole-genome sequencing (WGS) can identify mutations in pathogen genomes that lead to drug or antimicrobial resistance (AMR), pinpoint their geographical source, and define strain-types linked to virulence. We have developed informatics solutions that analyse large genomic variation databases and produce lists of mutations linked to AMR, strain-types, and location, enabling the profiling of Mycobacterium tuberculosis (TB-Profiler) and Plasmodium spp. (Malaria-Profiler) from WGS data. However, this approach uses a static database of mutations, which requires laborious manual curation to keep up with newly emerging mutations. To address this issue, we have developed several AI models to predict and reveal new mutations underlying AMR (e.g., Treesist-TB, DeepSweep), as well as a neural network-based model to assign geographical origin. However, implementing these models requires significant technical knowledge, which is a barrier to their use and evaluation in a clinical setting. Here, we propose to integrate the AI models with the informatics profiling system, including through the automatic re-training and evaluation of models using our large and growing WGS databases (>10k malaria, >50k TB) and newly generated data from collaborating sites across four countries (e.g., UK Health Security Agency malaria reference laboratory). All AI-enabled profiling tools will be made available through both command-line and web-based platforms, with reports customised for local use, and the WGS data and software will be put in the public domain. The tools can be extended for use in other types of infections (e.g., nosocomial). Ultimately, the resulting AI-enabled informatics systems will assist with the personalisation of treatment within clinical management and aid surveillance activities by identifying outbreaks and transmission hotspots, thereby leading to both health and economic benefits across populations.
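The static-database approach described in the summary can be sketched as a simple lookup. The following is a minimal illustration only, not the TB-Profiler implementation: the `Variant` class, the `profile` function and the `RESISTANCE_DB` table are hypothetical names introduced here, the two database entries are well-known textbook resistance mutations used as toy data, and `geneX` is a made-up placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A called amino-acid change in a named gene (hypothetical structure)."""
    gene: str
    change: str


# Toy static database mapping (gene, change) to a drug.
# Illustrative entries only; this is NOT the curated TB-Profiler database.
RESISTANCE_DB = {
    ("rpoB", "S450L"): "rifampicin",
    ("katG", "S315T"): "isoniazid",
}


def profile(variants):
    """Match a sample's variants against the static database.

    Returns the drugs with at least one matching mutation, plus the
    variants the database does not know about.
    """
    drugs = set()
    unmatched = []
    for v in variants:
        drug = RESISTANCE_DB.get((v.gene, v.change))
        if drug is None:
            unmatched.append(v)  # invisible to a static lookup
        else:
            drugs.add(drug)
    return {"resistant_to": sorted(drugs), "unmatched": unmatched}


sample = [Variant("rpoB", "S450L"), Variant("geneX", "A100V")]
result = profile(sample)
print(result["resistant_to"])
print(result["unmatched"])
```

Any variant that is absent from the table falls into `unmatched` and has no effect on the prediction until a curator adds it. That is the gap the proposed automatic re-training of AI models is intended to close.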
Publications

Acford-Palmer H
(2023)
Detection of insecticide resistance markers in Anopheles funestus from the Democratic Republic of the Congo using a targeted amplicon sequencing panel.
in Scientific reports

Billows N
(2023)
Feature weighted models to address lineage dependency in drug-resistance prediction from Mycobacterium tuberculosis genome sequences.
in Bioinformatics (Oxford, England)
Description | Infection-AID: AI assisted genomic profiling to inform the Diagnosis, personalised treatment and control of infections |
Amount | £518,745 (GBP) |
Funding ID | EP/Y018842/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 09/2023 |
End | 04/2025 |
Title | AI methods and analysis pipelines |
Description | We have established AI containers to assist analysis, as well as automated systems for capturing sequence data. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2023 |
Provided To Others? | Yes |
Impact | This pipeline and framework can be used to implement and compare AI models. |
Title | Malaria-Profiler |
Description | This informatics tool profiles malaria parasite sequence data to infer species, drug resistance and geographical source. It has been implemented on Vietnamese data, and the software is being used in Vietnam. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2023 |
Provided To Others? | Yes |
Impact | It strengthens capacity for the analysis of sequence data, and is useful for clinical and surveillance applications. |
URL | https://bioinformatics.lshtm.ac.uk/malaria-profiler/ |
Description | With Thailand MOPH |
Organisation | Ministry of Public Health |
Country | Thailand |
Sector | Public |
PI Contribution | We contributed AI and bioinformatics pipelines and analysis to the TB sequence data generated, including for the profiling of infections for drug resistance (building on our TB-Profiler tool). We have automated the reporting from the algorithms in the Thai language. |
Collaborator Contribution | Sequence data and implementation of algorithms, with feedback on performance. |
Impact | Sequence data, genomic profiling of TB drug resistance, and AI approaches to identify mutations linked to resistance. This work will assist the personalisation of treatment in TB clinics run by the Thailand MOPH, which have invested in genomic platforms. |
Start Year | 2022 |
Description | With UKHSA for genome sequencing |
Organisation | Public Health England |
Country | United Kingdom |
Sector | Public |
PI Contribution | Sequencing of parasite DNA, and analysis of genomics data |
Collaborator Contribution | Contributing Plasmodium DNA from UK travellers |
Impact | Plasmodium sequence data, and identification of mutations linked to drug resistance. |
Start Year | 2022 |
Description | Genomics and AI training |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | This was an AI/genomics workshop run by LSHTM and hosted by RVC. |
Year(s) Of Engagement Activity | 2023 |