Using whole genome sequencing to characterise drug resistant Mycobacterium tuberculosis in Thailand

Lead Research Organisation: London Sch of Hygiene and Trop Medicine
Department Name: Infectious and Tropical Diseases


Tuberculosis is an infectious disease that causes a high public health burden. The World Health Organisation estimates that there are ten million new cases and nearly two million deaths each year. Thailand is classified by WHO as one of the 22 countries in the world with the highest TB burden, with 93,000 new cases each year and an overall estimated TB prevalence of nearly 130,000 cases, 16 percent of them in people who are also HIV positive. Establishing who transmits to whom and where is fundamental to disease control. By comparing the genetic profiles of tuberculosis-causing bacteria within large population-based studies we can identify likely transmissions based on the similarity of the strains. Our work will focus on the Kanchanaburi province of Thailand, where there have been outbreaks of multi-drug resistant TB bacterial strains. This is a province where tuberculosis disease prevalence has been high compared to other regions in Thailand. Tuberculosis samples from patients (n=2,000) attending the largest hospital (Makarak Hospital), spanning a period from 2003 to 2019, will be genetically characterized, and important metadata (e.g. HIV status) curated. The inferred genetic profiles will use a higher proportion of the genome than used previously, allowing more accurate reconstructions of transmission chains. Factors influencing transmissibility will then be assessed directly by looking at the patterns of data within each transmission chain. These factors may be genetic variations within the bacteria themselves, and we will seek to identify those associated with drug resistance and transmissibility. Non-genetic factors affecting transmission, such as differences in the human hosts, including age, sex, diabetes, HIV infection and treatment, and contact patterns, will be investigated. We will attempt to confirm any insights into these factors using data collected and generated from other populations in Thailand and globally.
Ultimately, an improved understanding of the genetic and other processes underlying transmissibility could lead to the development of improved control measures. These measures could include novel drug or vaccine targets, or identification of geographical or socially determined hotspots of transmission; thereby assisting the Thailand Ministry of Public Health with control strategies, as well as benefiting the Kanchanaburi province and wider Thailand.

Technical Summary

Tuberculosis, caused by Mycobacterium tuberculosis (Mtb), is an important global public health issue, with increasing drug resistance. Understanding the factors underlying Mtb transmission is essential for disease control, but surprisingly little is known about the role of pathogen genomic variation, where most infections occur, the importance of host factors, or who transmits to whom. Mtb can be grouped into seven lineages or phylogenetic clades, and further into sub-lineages, which may vary in propensity to transmit and cause disease. By sequencing Mtb in a population-based setting it is possible to construct transmission chains using phylogenetic-based algorithms: strains with near-identical genomes are most likely to be linked by a transmission event, and will appear on the same branch of the phylogenetic tree. However, there are few large population-based studies in high TB prevalence areas with known drug resistance outbreaks that can apply long-term large-scale whole genome sequencing. Our work will focus on the Kanchanaburi province of Thailand, where there have been outbreaks of multi-drug resistant TB bacterial strains. Mtb strains from 2,000 patients attending the largest hospital (Makarak Hospital), during the period 2003 to 2019, will be whole genome sequenced and important metadata (e.g. HIV status, diabetes) curated. By including all genomic variants, it will be possible to build a probabilistic model of transmissions, and hence assess effects of pathogen variation and host factors (age, sex, HIV status, diabetes, proximity) on transmissibility using regression-based methods. Further, it will be possible to adopt a genome-wide approach to identify loci in Mtb that are associated with or under evolutionary selective pressure from drug resistance and transmissibility phenotypes. Loci identified as being associated with drug resistance and transmissibility will be validated, including by considering sequence data from other Thai or global populations.
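
The idea that near-identical genomes point to likely transmission can be illustrated with a minimal sketch. It is not the project's pipeline: it swaps the phylogenetic and probabilistic methods described above for simple single-linkage clustering on pairwise SNP distance. The function names, the toy sequences and the threshold of 1 SNP are all assumptions chosen for illustration; a real analysis would use a cutoff justified for Mtb and full-genome alignments.

```python
from itertools import combinations


def snp_distance(a: str, b: str) -> int:
    """Count differing positions between two aligned sequences.

    Positions where either sequence has a missing call ('N' or '-')
    are ignored rather than counted as differences.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    missing = {"N", "-"}
    return sum(
        1
        for x, y in zip(a, b)
        if x != y and x not in missing and y not in missing
    )


def cluster_isolates(seqs: dict[str, str], threshold: int) -> list[set[str]]:
    """Group isolates into putative transmission clusters.

    Two isolates are linked if their SNP distance is <= threshold;
    clusters are the connected components of those links
    (single-linkage), found with a small union-find structure.
    """
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Follow parent pointers to the root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if snp_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters: dict[str, set[str]] = {}
    for name in seqs:
        clusters.setdefault(find(name), set()).add(name)
    # Sort for a stable, readable order.
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Hypothetical toy data: four short aligned sequences, not real isolates.
isolates = {
    "iso_A": "ACGTACGTAC",
    "iso_B": "ACGTACGTAT",
    "iso_C": "TTGTACCTGG",
    "iso_D": "ACGNACGTAC",
}

# Illustrative threshold only; not a recommended cutoff for Mtb.
for cluster in cluster_isolates(isolates, threshold=1):
    print(sorted(cluster))
```

With this toy input, iso_A, iso_B and iso_D fall into one cluster and iso_C stands alone. Single-linkage grouping only flags candidate clusters; it says nothing about who infected whom, which is why the proposal relies on phylogenetic and probabilistic models for reconstructing transmission chains.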

Planned Impact

The economy
Advances in sequencing technology now allow the genomic characterization of M. tuberculosis (Mtb, the cause of tuberculosis (TB)) on an unprecedented scale, and have the potential to greatly accelerate research aimed at understanding the biology of the bacterium, its phylogeny and the epidemiology of the disease. The knowledge generated in the project and application of the research could ultimately benefit the pharmaceutical industry and those developing TB diagnostics and vaccines, as well as communities in the UK, Thailand, and other countries exposed to the disease. Ultimately, through reduced occurrence of TB, the knowledge gained in this study could improve health and wealth in the UK, Thailand and globally. The methods used in this project could have application beyond TB, and so help more widely in the control and prevention of infectious diseases in both humans and animals, with associated economic benefits.

The general public
Mtb is a major cause of disease, killing ~2 million people globally each year, and drug resistant forms of TB (such as those in the Kanchanaburi province) and HIV co-infection are making control difficult. Genomic insights into transmission could ultimately lead to improved control measures adopted globally. The project therefore specifically addresses the MRC strategic aim to impact positively on global health, and to assist with bringing the health impacts of fundamental research to people more quickly.

Academic and industrial organisations
New sequencing technologies have the ability to generate vast amounts of data, but there is a need to translate this information into knowledge useable by other research scientists and industry. Our work will provide tools useful for genomic data analysis and modeling, which can be utilized across infectious diseases and in different settings. An understanding of genomic variation underlying transmission could lead to laboratory experiments on Mtb pathogenesis and host interaction, improved tests for detecting transmissible Mtb, and insights for academics involved in policy formulation. Scientific developments arising from the work would benefit commercial producers of diagnostics, vaccines and other control measures. We have links with some of these companies (e.g. GSK) and where required will use licensing agreements through the Mahidol University and LSHTM technology transfer offices to ensure that routes to the production and exploitation of vaccines or other translational tools are in place. Developing a basic understanding of the genomic pathways in this study will not only be important for understanding virulence and transmission mechanisms in Mtb, but also has practical applications for other mycobacteria including M. bovis, a cause of TB in cattle and humans. Any technology developed may have enormous implications for policy makers dealing with future disease outbreaks, and may impact on exports.

Training opportunities
The proposal will employ, train and develop a scientist with diverse experience and an 'omic mentality that can be applied in academia, the public sector and industry. The multidisciplinary project teams involved will add to the UK and Thailand science base in an important and economically vital research area. The researchers working on the project will develop team working and project management skills, which they can apply in all employment sectors. Importantly, the scope for multidisciplinary interactions in this proposal should not be underestimated. The researchers employed to carry out the planned activities will have unique opportunities for engagement with experts in TB biology, biotechnology, clinical care, genomic epidemiology, and public health, including within the LSHTM TB Centre and the Thai Ministry of Public Health. Thus, our proposal will help to build human resources across the UK and Thailand that could subsequently be employed in challenging interdisciplinary projects in industry, academia and government.


Higgins M (2019) PrimedRPA: primer design for recombinase polymerase amplification assays. in Bioinformatics (Oxford, England)

Description We now understand the genetic mutations underlying the transmission of multi-drug resistant TB strains in Central Thailand, which will assist the development of diagnostics. Using TB data from across Thailand, we have placed the Central Thailand strains into a wider context, and shown the spread of TB across the country. Such insights are useful for infection control and policy development.
Exploitation Route The raw sequence data can be used by other research groups. The biological insights can assist the development of new diagnostics or vaccines. There will be further funding opportunities looking at the genomic epidemiology of TB across Thailand.
Sectors Healthcare, Government, Democracy and Justice

Description The findings are assisting policy makers, including in the application of genomics for identifying hotspots of disease transmission and drug resistance, and therefore informing disease control. Specifically, the Thailand Ministry of Public Health is now sequencing all TB strains as a way of tracking circulating drug resistance mutations and TB transmission, thereby informing its control programs.
Sector Healthcare, Government, Democracy and Justice
Impact Types Cultural, Societal, Policy & public services

Title Bioinformatic pipelines and tools for processing genomic data 
Description We have developed and migrated algorithms for analysing genomic sequence data and the profiling of TB bacteria for drug resistance and outbreaks. These analyses link to a database of >20,000 TB isolates with whole genome sequencing data, including >2,000 from Thailand. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? Yes  
Impact We have profiled in silico the more than 1,000 study isolates for which we have generated sequence data, and now understand the drug resistance patterns and underlying mutations. Further, we understand the transmission patterns of multi-drug resistant TB in central Thailand, which will assist stakeholders trying to control the circulating disease. 
Title Established and migrated an analytical pipeline 
Description We have migrated the bioinformatic and genomic data pipelines to PSAU. These pipelines process raw sequencing data into transmission chains, which are used for epidemiological analysis. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact Provided the PSAU with the ability to analyse genomic data. 
Title Whole genome sequencing and drug resistance database 
Description This includes whole genome sequencing and drug assay data from >18,000 TB samples 
Type Of Material Database/Collection of data 
Year Produced 2018 
Provided To Others? Yes  
Impact The identification of drug resistance markers will personalise the care of TB patients. 
Title Whole genome sequencing data 
Description We have generated whole genome sequencing data for more than 1,000 study isolates. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? No  
Impact We have generated raw sequences that will be made available to the research community. 
Description Genome sequencing and epidemiology with Khon Kaen University 
Organisation Khon Kaen University
Country Thailand 
Sector Academic/University 
PI Contribution Capacity building in TB genomic data analysis. The delivery of phylogenetic, transmission and genome-wide association study analyses.
Collaborator Contribution Samples for genome sequencing. Insights into TB control programs and policy in Thailand.
Impact Genome sequencing data, and several joint scientific publications.
Start Year 2019
Description Thailand Ministry of Public Health 
Organisation Government of Thailand
Country Thailand 
Sector Public 
PI Contribution We have been providing support to their study design and control programme, as well as analytical support to their genomic data analysis.
Collaborator Contribution They are providing genomic data and metadata, and disease control and operational insights.
Impact Building of capacity in genomic investigations, and plans for the development of a TB genomic network.
Start Year 2018
Description Asian TB Network meeting in Manila 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Study participants or study members
Results and Impact This was a meeting of Southeast Asian TB researchers and policy makers with an interest in applying genomic tools for diagnosis, and in developing a joint genomic database across the region to assist with identifying important mutations for drug resistance and transmission. The meeting was hosted by the RITM in Manila (March 2019).
Year(s) Of Engagement Activity 2019
Description Networking meeting in Southeast Asia 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Study participants or study members
Results and Impact We had a meeting in the Philippines that involved the Thai partners, and we agreed to share protocols, and establish a TB genomics network.
Year(s) Of Engagement Activity 2018