Machine Learning to improve the outputs of an antibody synthetic library

Lead Research Organisation: University of Oxford
Department Name: Sustain Approach to Biomedical Sci CDT

Abstract

Antibodies are essential proteins of the adaptive immune system that bind to their target proteins, called antigens, with great specificity and affinity. Antibodies are one of the most important classes of pharmaceuticals, with over 100 antibody therapeutics approved. However, the majority of therapeutic antibody candidates fail before regulatory approval, and those that do eventually progress cost more than $2bn to develop and typically take around 10 years to bring to market. This time-consuming and costly development process would benefit from computational and machine learning-driven methods for predicting antibody properties. Such methods hold immense promise for the successful development of next-generation biologics.

This project will focus on machine learning techniques to select antibodies with the best biophysical properties. The project will be performed alongside experts in applying machine learning to antibody development at the Oxford Protein Informatics Group of Professor Charlotte Deane and in collaboration with Fusion Antibodies. Fusion Antibodies has been an expert in the antibody space for more than 20 years. Besides contributing this expertise, Fusion Antibodies will provide a wealth of valuable data linking antibody sequences to expression and various biophysical properties. These data will be used to design and create machine learning protocols, and are particularly valuable because datasets of this type are rarely available to the public. Furthermore, experimental validation of algorithms could be performed by the company.

This project will initially focus on the biophysical properties immunogenicity and expression. Therapeutic antibodies are often derived from animal models, which can lead to an immune response in humans. Humanisation is the process of making an antibody more human-like while maintaining its expression level and its ability to bind its target epitope with high affinity. To humanise an antibody, the complementarity-determining regions (the parts of the antibody variable domain involved in binding the target) of the animal-derived antibody are grafted onto a human framework. Fusion Antibodies has performed this task, and applied structure- and expertise-driven back mutations, for a set of targets. Before a predictive algorithm for the antibody property of interest can be built on this dataset, the value of the dataset should be evaluated. This evaluation should provide insights into the volume, diversity, and type of data needed to generalise predictions.

This project will contribute to maximising efficacy in a shorter timeframe and at a reduced cost by guiding antibody therapeutic development towards antibodies with better biophysical properties. Machine learning and artificial intelligence technologies will be used to create predictive algorithms for biophysical properties. The project is interdisciplinary and involves immunoinformatics, machine learning, (structure-based) antibody design, and experimental validation. The project therefore falls within the EPSRC research areas Chemical biology and biological chemistry, Synthetic biology, Biological informatics, and Artificial intelligence technologies.

Planned Impact

The UK's world-leading position in biomedical research is critically dependent upon training scientists with the cutting-edge research skills and technological know-how needed to drive future scientific advances. Since 2009, the EPSRC and MRC CDT in Systems Approaches to Biomedical Science (SABS) has been working with its consortium of 22 industrial and institutional partners to meet this training need.

Over this period, our partners have identified a growing training need caused by the increasing reliance on computational approaches and research software. The new EPSRC CDT in Sustainable Approaches to Biomedical Science: Responsible and Reproducible Research - SABS:R^3 will address this need. By embedding a sustainable approach to software and computational model development into all aspects of the existing SABS training programme, we aim to foster a culture change in how the computational tools and research software that now underpin much of biomedical research are developed, and hence how quantitative and predictive translational biomedical research is undertaken.

As with all CDT Programmes, the future impact of SABS:R^3 will be through its alumni, and through the culture change that its training engenders. By these measures, our existing SABS CDT is already proving remarkably successful. Our alumni have gone on to a wide range of successful careers: 21 in academic research, 19 in industry (including 5 in SABS partner companies), and the other 10 in organisations ranging from the Office for National Statistics to the EPSRC. SABS' unique Open Innovation framework has facilitated new company connections and a high level of operational freedom, enabling 14 multi-company, pre-competitive, collaborative doctoral research projects between 11 companies, each focused on a SABS student.

The impact of sustainable and open computational approaches on biomedical research is clear from existing SABS student projects. One example is SAbDab, which resulted from the first-ever co-sponsored doctorate in SABS, by UCB and Roche. It was released as open source software, is embedded in the pipelines of several pharmaceutical companies (including UCB, MedImmune, GSK, and Lonza), and has resulted in 13 papers. The SABS student who developed SAbDab was initially seconded to MedImmune, sponsored by EPSRC IAA funding; he went on to work at Roche, and is now at BenevolentAI. Similarly, PanDDA, multi-dataset X-ray crystallographic software to detect ligand-bound states in protein complexes, is in CCP4 and is an integral part of Diamond Light Source's XChem Pipeline. The SABS student who developed PanDDA was awarded an EMBO Fellowship.

Future SABS:R^3 students will undertake research supported by both our industrial partners and academic supervisors. These supervisors have a strong track record of high impact research through the release of open source software, computational tools, and databases, and through commercialisation and licensing of their research. All of this research has been undertaken in collaboration with industrial partners, with many examples of these tools now in routine use within partner companies.

The newly focused SABS:R^3 will permit new industrial collaborations. Six new partners have joined the consortium to support this new bid, ranging from major multinationals (e.g. Unilever) to SMEs (e.g. Lhasa). SABS:R^3 will continue to make all of its research and teaching resources publicly available and will continue to help create other centres with similar aims. To promote a wider cultural change, SABS:R^3 will also engage with the academic publishing industry (Elsevier, OUP, and Taylor & Francis). We will explore novel ways of disseminating the outputs of computational biomedical research, to engender trust in the released tools and software and to facilitate greater uptake and re-use.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S024093/1                                   01/10/2019  31/03/2028
2736498            Studentship   EP/S024093/1  01/10/2022  30/09/2026  Henriette Capel