Antibody repertoire analytics: the development of a visualization toolkit and its application to high-throughput Ig-seq data

Lead Research Organisation: Birkbeck, University of London
Department Name: Biological Sciences

Abstract

Next-generation sequencing (NGS) of expressed antibodies (Abs), known as Ig-seq, is increasingly used to address fundamental immunological questions, such as: How do immune responses to pathogens, vaccines and therapeutic Abs differ? And what are the differences between healthy Ab repertoires and those associated with immune dysfunction? A typical Ig-seq dataset consists of millions of partial mRNA Ab sequences derived from multiple samples, each from a single individual and timepoint. There are methods for analysing aspects of these data, including tools for annotating individual Abs and the relationships between them (e.g. www.imgt.org). However, tools that support interactive engagement with Ig-seq datasets do not exist.

The aim of this project is to develop a toolkit for visually analysing Ig-seq, enabling biologists to rapidly explore their data, generate novel hypotheses and make new discoveries. The priority will be to support whole Ab repertoire analytics and to characterize fundamental properties (repertoire diversity, Ab convergence within and between repertoires, and the dynamics of repertoire evolution) relevant to vaccine design and therapeutic Ab discovery. For example, two strategies are used to generate Ab diversity prior to screening for therapeutic Abs: animal immunization, and phage display. In both cases, the right ways to measure diversity, and to elicit it, are poorly understood. Here the toolkit can play a vital role, engendering a better understanding of animal repertoire evolution in response to challenge (ultimately leading to better immunization strategies), and helping to identify (and ultimately fill) gaps in existing Ab phage libraries.

The student on this project will engage in two complementary activities: tool development and immunology research. Toolkit development poses several challenges. To facilitate rich analyses, the toolkit will incorporate different modes of visual representation (e.g. networks, density plots) with multiple similarity metrics (e.g. genetic distance, CDR3 sequence identity) and provide essential operations for comparing repertoires. Comparing repertoires is particularly challenging for two reasons: each sample captures only a small proportion of repertoire diversity, so the number of shared sequences is a poor estimate of the true, in vivo overlap; and sample quality varies, yet the optimal normalization strategy for addressing this is unclear. The toolkit will also need to handle large repertoire datasets of different types (e.g. paired vs. unpaired heavy/light chains) and in different formats (e.g. IMGT vs. Kabat numbering schemes).
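As a rough illustration of the repertoire-comparison operations mentioned above, the following Python sketch computes a position-wise CDR3 identity and a depth-normalized overlap between two samples. It is not the toolkit's implementation: the function names are hypothetical, and downsampling both samples to a common read depth before computing a Jaccard index is only one possible normalization, chosen here for simplicity. The abstract itself notes that the optimal strategy is unclear.

```python
import random

def cdr3_identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length CDR3 sequences.

    Assumption for this sketch: sequences of different length score 0.0
    rather than being aligned.
    """
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)

def downsample(reads: list, depth: int, seed: int = 0) -> list:
    """Draw `depth` reads without replacement, reproducibly."""
    rng = random.Random(seed)
    return rng.sample(reads, depth)

def jaccard_overlap(reads_a: list, reads_b: list, seed: int = 0) -> float:
    """Jaccard index of distinct sequences after equalizing read depth.

    Both samples are downsampled to the smaller sample's depth so that a
    deeper sample does not dominate the comparison.
    """
    depth = min(len(reads_a), len(reads_b))
    set_a = set(downsample(reads_a, depth, seed))
    set_b = set(downsample(reads_b, depth, seed))
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0
```

Because each sample captures only a small fraction of the repertoire, a value computed this way describes the overlap between samples and should not be read as an estimate of the true in vivo overlap.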

The student will undertake immunology research with UCB from an early stage of the PhD, helping to design experiments that improve phage display library construction and contributing to the interpretation of experimental data from immunized animals. At Birkbeck the student will analyse Ab repertoires arising from an existing vaccine design collaboration, and undertake meta-analyses of public Ig-seq datasets, including cross-species analyses focusing on models of human immunity.

This project falls squarely within the BBSRC strategic priority of "data driven biology". The proposed analyses of animal Ab repertoires are designed to help scientists make informed choices about the animals they use in immunology research (relevant to the "3Rs" priority); and meta-analyses of samples from healthy individuals of various ages will be undertaken (relevant to "healthy ageing across the lifecourse").

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
BB/M009513/1                                   01/10/2015  31/03/2024
1917717            Studentship   BB/M009513/1  01/10/2017  30/03/2022  Pejvak Moghimi
 
Description An immunotherapy approach for developing a vaccine to target multiple cancer types 
Organisation King's College London
Country United Kingdom 
Sector Academic/University 
PI Contribution In our bioinformatics analysis of the T cell receptor repertoire sequence data collected from the 8 patients in this study, we demonstrated sequence-level convergence amongst the T cell receptors of the responder patients. This was shown by calculating the pairwise Hamming distances between sequences within each patient's repertoire and accumulating the results over all patients.
Collaborator Contribution Our partners designed an immunotherapeutic vaccine, which directs the adaptive immune system towards developing T cells whose receptors can bind the 9-mer peptide fragments that comprise the main component of the vaccine. They also carried out clinical investigations of the efficacy of the vaccine.
Impact The results of our work indicate that the vaccine has been effective in treating multiple types of cancer, and we are soon to submit our findings for publication. This multi-disciplinary research combined clinical, genomic and bioinformatic analyses.
Start Year 2019
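The pairwise Hamming distance analysis described in the PI Contribution of this record can be sketched in Python as follows. This is a minimal illustration and not the code used in the study: the function names and the example sequences are hypothetical, and restricting comparisons to equal-length sequences is an assumption made here because Hamming distance is only defined for strings of the same length.

```python
from collections import Counter
from itertools import combinations

def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length sequences")
    return sum(x != y for x, y in zip(a, b))

def pairwise_hamming_counts(repertoire: list) -> Counter:
    """Histogram of Hamming distances over all pairs within one repertoire.

    Assumption for this sketch: pairs of unequal length are skipped.
    """
    counts = Counter()
    for a, b in combinations(repertoire, 2):
        if len(a) == len(b):
            counts[hamming(a, b)] += 1
    return counts

def accumulate_over_patients(repertoires: dict) -> Counter:
    """Sum the per-patient distance histograms into a single histogram."""
    total = Counter()
    for sequences in repertoires.values():
        total += pairwise_hamming_counts(sequences)
    return total

# Made-up sequences for two hypothetical patients.
example = {
    "patient_1": ["CASSL", "CASSF", "CASTL"],
    "patient_2": ["CASSLG", "CASSLA"],
}
histogram = accumulate_over_patients(example)
```

An excess of small distances in the accumulated histogram, relative to a suitable baseline, is one way convergence of this kind might be visualized.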
 
Description sumrep: A Summary Statistic Framework for Immune Receptor Repertoire Comparison and Model Validation 
Organisation Fred Hutchinson Cancer Research Center (FHCRC)
Country United States 
Sector Academic/University 
PI Contribution This collaboration developed an R package for the statistical characterisation and simulation of antibody and T cell receptor repertoires. I tested the tool on real repertoires, using the multiple annotation tools that form its backend, and contributed to the source code.
Collaborator Contribution Our partners wrote the majority of the source code for this tool.
Impact A publication describing sumrep (see the URL in the software entry below).
Start Year 2018
 
Title sumrep: A Summary Statistic Framework for Immune Receptor Repertoire Comparison and Model Validation 
Description A bioinformatics R package for statistical characterisation of antibody and T cell receptor repertoires. 
Type Of Technology Software 
Year Produced 2019 
Impact This tool is the first of its kind in the field and brings together a variety of techniques, which could make it a standard in the field. 
URL https://www.frontiersin.org/articles/10.3389/fimmu.2019.02533/full