Antibody repertoire analytics: the development of a visualization toolkit and its application to high-throughput Ig-seq data
Lead Research Organisation:
Birkbeck, University of London
Department Name: Biological Sciences
Abstract
Next-generation sequencing (NGS) of expressed antibodies (Abs), known as Ig-seq, is increasingly used to address fundamental immunological questions, such as: How do immune responses to pathogens, vaccines and therapeutic Abs differ? And what are the differences between healthy Ab repertoires and those associated with immune dysfunction? A typical Ig-seq dataset consists of millions of partial mRNA Ab sequences derived from multiple samples, each from a single individual and timepoint. There are methods for analysing aspects of these data, including tools for annotating individual Abs and the relationships between them (e.g. www.imgt.org). However, tools that support interactive engagement with Ig-seq datasets do not exist.
The aim of this project is to develop a toolkit for visually analysing Ig-seq, enabling biologists to rapidly explore their data, generate novel hypotheses and make new discoveries. The priority will be to support whole Ab repertoire analytics and to characterize fundamental properties (repertoire diversity, Ab convergence within and between repertoires, and the dynamics of repertoire evolution) relevant to vaccine design and therapeutic Ab discovery. For example, two strategies are used to generate Ab diversity prior to screening for therapeutic Abs: animal immunization, and phage display. In both cases, the right ways to measure diversity, and to elicit it, are poorly understood. Here the toolkit can play a vital role, engendering a better understanding of animal repertoire evolution in response to challenge (ultimately leading to better immunization strategies), and helping to identify (and ultimately fill) gaps in existing Ab phage libraries.
The student on this project will engage in two complementary activities: tool development and immunology research. Toolkit development poses several challenges. To facilitate rich analyses, the toolkit will incorporate different modes of visual representations (e.g. networks, density plots) with multiple similarity metrics (e.g. genetic distance, CDR3 sequence identity) and provide essential operations for comparing repertoires. The latter is particularly challenging because each sample captures a small proportion of repertoire diversity, hence the number of shared sequences is a poor estimate of the true, in vivo overlap; and sample quality varies, but the optimal normalization strategy for addressing this is unclear. The toolkit will also need to handle large repertoire datasets of different types (e.g. paired vs. unpaired heavy/light chains) and in different formats (e.g. IMGT vs. Kabat numbering schemes).
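The comparison problem described above can be illustrated with a minimal Python sketch. It is not part of the proposed toolkit: the function names, the toy repertoire and the choice of Jaccard index as the overlap measure are all assumptions made for illustration. It shows one simple form of CDR3 sequence identity, and why the number of sequences shared between two small samples understates the true overlap even when both samples come from the same repertoire.

```python
import random
from itertools import islice, product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def cdr3_identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length CDR3 sequences.

    Assumption: only equal-length sequences are compared; a real tool
    would need an alignment step or a length-aware metric.
    """
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length sequences")
    if not a:
        return 1.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def jaccard_overlap(rep_a, rep_b) -> float:
    """Shared distinct sequences divided by all distinct sequences."""
    set_a, set_b = set(rep_a), set(rep_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Hypothetical toy repertoire of 1000 distinct CDR3-like strings.
repertoire = [
    "CAR" + "".join(p) + "W"
    for p in islice(product(AMINO_ACIDS, repeat=3), 1000)
]

# Two small samples drawn from the SAME repertoire (true overlap is complete).
sample_1 = random.Random(1).sample(repertoire, 100)
sample_2 = random.Random(2).sample(repertoire, 100)

observed = jaccard_overlap(sample_1, sample_2)
print(f"observed overlap between two samples of one repertoire: {observed:.3f}")
```

Because each sample covers only a tenth of the toy repertoire, the observed overlap falls well short of the true value, which is the sampling effect that any normalization strategy would need to account for.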
The student will undertake immunology research with UCB from an early stage of the PhD, helping to design experiments that improve phage display library construction and contributing to the interpretation of experimental data from immunized animals. At Birkbeck the student will analyse Ab repertoires arising from an existing vaccine design collaboration, and undertake meta-analyses of public Ig-seq datasets, including cross-species analyses focusing on models of human immunity.
This project falls squarely within the BBSRC strategic priority of "data driven biology". The proposed analyses of animal Ab repertoires are designed to help scientists make informed choices about the animals they use in immunology research (relevant to the "3Rs" priority); and meta-analyses of samples from healthy individuals of various ages will be undertaken (relevant to "healthy ageing across the lifecourse").
People | ORCID iD |
---|---|
Adrian Shepherd (Primary Supervisor) | |
Pejvak Moghimi (Student) | |
Publications
Olson BJ (2019) sumrep: A Summary Statistic Framework for Immune Receptor Repertoire Comparison and Model Validation. Frontiers in Immunology.
Studentship Projects
Project Reference | Relationship | Related To | Start | End | Student Name |
---|---|---|---|---|---|
BB/M009513/1 | | | 30/09/2015 | 31/03/2024 | |
1917717 | Studentship | BB/M009513/1 | 30/09/2017 | 29/03/2022 | Pejvak Moghimi |
Description | An immunotherapy approach for developing a vaccine to target multiple cancer types |
Organisation | King's College London |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | In our bioinformatics analysis of the T cell receptor repertoire sequence data collected from the 8 patients in this study, we demonstrated convergence, at the sequence level, among the T cell receptors of the responder patients. This was shown by calculating the pairwise Hamming distances between sequences within each patient's repertoire and accumulating them over all patients. |
Collaborator Contribution | Our partners designed an immunotherapeutic vaccine that directs the adaptive immune system towards developing T cells whose receptors can bind the 9-mer peptide fragments that comprise the main component of the vaccine. They also carried out clinical investigations of the vaccine's efficacy. |
Impact | The results of our work indicate that the vaccine has been effective in treating multiple types of cancer, and we will soon submit our findings for publication. This multidisciplinary research combined clinical, genomic and bioinformatics analyses. |
Start Year | 2019 |
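The Hamming-distance analysis described in this record can be sketched as follows. This is a minimal illustration, not the code used in the study: the patient labels and sequences are invented, the function names are hypothetical, and skipping pairs of unequal length is an assumption, since the record does not say how such pairs were handled.

```python
from collections import Counter
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def within_repertoire_distances(sequences) -> Counter:
    """Histogram of pairwise Hamming distances within one repertoire.

    Assumption: pairs of unequal length are skipped rather than aligned.
    """
    counts = Counter()
    for a, b in combinations(sequences, 2):
        if len(a) == len(b):
            counts[hamming(a, b)] += 1
    return counts


def accumulate_over_patients(repertoires) -> Counter:
    """Sum the within-repertoire histograms over all patients."""
    total = Counter()
    for sequences in repertoires.values():
        total.update(within_repertoire_distances(sequences))
    return total


# Invented toy data: two hypothetical patients with a few receptor sequences.
toy_repertoires = {
    "P1": ["CASSL", "CASSF", "CATSL"],
    "P2": ["CASSLG", "CASSLA"],
}

distance_histogram = accumulate_over_patients(toy_repertoires)
print(sorted(distance_histogram.items()))
```

Distances are computed only within each patient's repertoire, so sequences from different patients are never compared directly; only the resulting histograms are pooled. An excess of small distances in the pooled histogram is the kind of signal that would suggest sequence-level convergence.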
Description | sumrep: A Summary Statistic Framework for Immune Receptor Repertoire Comparison and Model Validation |
Organisation | Fred Hutchinson Cancer Research Center (FHCRC) |
Country | United States |
Sector | Academic/University |
PI Contribution | This collaboration developed an R package for the statistical characterisation of antibody and T cell receptor repertoires, and for simulating repertoires. I tested the tool on real repertoires using the multiple annotation tools that form its backend, and contributed to the software's source code. |
Collaborator Contribution | Our partners wrote the majority of the source code for this tool. |
Impact | The collaboration resulted in a publication: Olson BJ (2019), Frontiers in Immunology (listed under Publications). |
Start Year | 2018 |
Title | sumrep: A Summary Statistic Framework for Immune Receptor Repertoire Comparison and Model Validation |
Description | A bioinformatics R package for statistical characterisation of antibody and T cell receptor repertoires. |
Type Of Technology | Software |
Year Produced | 2019 |
Impact | This tool is the first of its kind in the field and brings together a variety of techniques, which could make it a standard in the field. |
URL | https://www.frontiersin.org/articles/10.3389/fimmu.2019.02533/full |