GREET: Generative Recombinant Enzyme Engineering for Therapeutics
Lead Research Organisation:
University of Edinburgh
Department Name: Sch of Biological Sciences
Abstract
Enzymes are proteins that catalyse almost all reactions required for cellular life; when defective, they can cause severe pathologies. For example, in humans, alpha-galactosidase (a-GAL) deficiency, known as Fabry's disease (FD), affects up to 1 in 3000 newborns and causes life-threatening damage to the heart and kidneys. Since these diseases are usually caused by inherited genomic mutations, they cannot be cured, but they can be treated using Enzyme Replacement Therapies (ERTs), which consist of injecting a recombinant version of the affected enzyme into patients.
Unfortunately, ERTs have limitations; recombinant enzymes have lower enzymatic activity compared to the human wild-type versions, are unstable in blood, are poorly absorbed by human cells, and often trigger an immune response. Moreover, manufacturing therapeutic enzymes is extremely expensive because standard mammalian cell-based expression systems have low yield.
Developing effective therapeutic enzymes requires design methods able to discover new amino acid sequences that can encode the same catalytic function, while optimising the therapeutic properties of the molecule. Then, these enzymes must be converted into highly optimised DNA triplets, called codons, to maximise expression and yield in host organisms that can grow in inexpensive media. With the increasing incidence of enzymatic deficiencies and current treatments costing up to £400K per year per patient, it is crucial to establish effective methods to perform these tasks and implement a platform for effective and sustainable production of therapeutic enzymes.
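To make the codon optimisation step concrete, the following is a minimal sketch of the simplest possible baseline: back-translating a protein by choosing, for each amino acid, the synonymous codon with the highest weight in a host-specific usage table. It is not the deep generative approach proposed in this project. The function names and the example weights are hypothetical and chosen for illustration only; they are not measured P. pastoris codon usage values.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def back_translate(protein, codon_weights):
    """Encode a protein using the highest-weight synonymous codon per residue.

    codon_weights maps codons to a host preference score; codons missing
    from the mapping are treated as having weight 0.
    """
    best = {}  # amino acid -> (codon, weight)
    for codon, aa in CODON_TABLE.items():
        weight = codon_weights.get(codon, 0.0)
        if aa not in best or weight > best[aa][1]:
            best[aa] = (codon, weight)
    return "".join(best[aa][0] for aa in protein)


def translate(dna):
    """Translate a coding sequence back to protein, one codon at a time."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical weights for illustration only (not real host data).
example_weights = {"AAG": 0.9, "AAA": 0.1, "GCT": 0.6}
optimised = back_translate("MKA", example_weights)
```

Translating the optimised DNA back with `translate` recovers the original protein, which is the invariant any codon optimiser must preserve: only synonymous codons are swapped, so the amino acid sequence is unchanged.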
Through the EPSRC fellowship, I will develop the computational and experimental methods required for engineering and manufacturing designer enzymes. I will use deep generative machine learning (ML) to design and codon optimise new enzymes, which will then be rapidly built and tested at scale using the lab automation platform available at the University of Edinburgh (UoE). As a proof of concept, I will build a library of designer human a-GAL enzymes using P. pastoris, a high-yield expression system used in the pharmaceutical industry.
To deliver this ambitious project, I have set four objectives over the 4 years of my fellowship:
1. Developing deep generative learning models for enzyme design.
2. Developing deep generative learning models for codon optimisation.
3. Building a library of designer human a-GAL enzymes in P. pastoris.
4. Developing computer-aided design (CAD) software for enzyme engineering.
Each objective addresses current limitations in enzyme engineering and manufacturing. ML avoids the need for accurate biophysical models by learning design rules directly from existing enzymes. Thus, by reverse engineering Nature's design principles, it will be possible to engineer functional designer enzymes at unprecedented scale. Coupling in-silico design with a robotic platform will allow building and testing thousands of different variants, thus minimising the time required for identifying a functional enzyme. Here I will test this new approach by engineering the human a-GAL enzyme, which is currently difficult to manufacture and optimise for therapeutic treatment; this effort will not only provide experimental evidence for the effectiveness of my platform but could also identify new potential treatments for FD.
The project is supported by a strong network of experts in synthetic biology and machine learning, in the UK and the US, industrial biopharmaceutical and biotechnology partners, such as Fujifilm Diosynth Biotechnologies UK (FDBK) and the Industrial Biotechnology Innovation Centre (IBioIC), and unique research facilities available at UoE, such as the Edinburgh Genome Foundry.
With this fellowship, I will lay the foundation for data-driven biological engineering and deliver enabling computational and experimental technologies to rapidly design, build and test new therapeutic molecules.
Publications

Lauer S (2023) Context-dependent neocentromere activity in synthetic yeast chromosome VIII. in Cell Genomics
Stracquadanio G (2022) Polymer physics of structural evolution in synthetic yeast chromosomes
Vegh P (2024) Biofoundry-Scale DNA Assembly Validation Using Cost-Effective High-Throughput Long-Read Sequencing in ACS Synthetic Biology

Title | GREET: Generative Recombinant Enzyme Engineering For Therapeutics |
Description | The animation provides an accessible introduction to lysosomal storage diseases (LSDs) and Fabry disease, and shows how we are using AI to engineer better therapies. Since LSDs mostly affect children and young adults, we thought that publishing a video on YouTube would be the best way to share our work. |
Type Of Art | Film/Video/Animation |
Year Produced | 2023 |
Impact | The video has been shared with the Edinburgh Kidney Research group, which includes Fabry patients, and has been featured in the newsletters and websites of the School of Biological Sciences, the Centre for Engineering Biology and the College of Science and Engineering at UoE. |
URL | https://youtu.be/IgsoMU1aN-c |
Description | After 2 of the 4 years of funding, the GREET project has already led to three significant achievements: 1. the development of a new protein engineering generative model that is an order of magnitude more efficient than the state of the art, while matching or improving on state-of-the-art performance on a well-established benchmark dataset. 2. the development of highly efficient manufacturing and screening techniques for the production of enzyme replacement therapies in P. pastoris, which might lead in the future to more economically sustainable therapies. 3. the development of high-quality software workflows for protein engineering and characterisation, which make it possible to design and prioritise enzymes for experimental testing. |
Exploitation Route | The outcome of this project could be exploited by the biotech industry to design and manufacture recombinant proteins beyond enzyme replacement therapies. |
Sectors | Healthcare; Pharmaceuticals and Medical Biotechnology |
URL | http://www.greet-project.org |
Description | Research work carried out as part of the GREET project is leading to the creation of a drug discovery company specialised in enzyme replacement therapies. Computational and experimental methods developed during the project represent the foundational IP of the company, which is currently supported in its operation by the Scottish Enterprise High Growth Opportunity Qualification grant (2023). The company is currently finalising its market analysis and business model, and we expect to spin out in 2025. Research work carried out as part of the GREET project has also led to new collaborations with the Inherited Metabolic Disorders unit for NHS Scotland, specifically with Dr Eve Miller-Hodges, and the Edinburgh Kidney Research network. This new collaboration has led to the joint IDERT project, which was funded in 2023. |
First Year Of Impact | 2023 |
Sector | Pharmaceuticals and Medical Biotechnology |
Impact Types | Economic |
Description | 21EBTA Engineering Biology for Cell and Gene Therapy Applications |
Amount | £1,518,259 (GBP) |
Funding ID | BB/W014610/1 |
Organisation | Biotechnology and Biological Sciences Research Council (BBSRC) |
Sector | Public |
Country | United Kingdom |
Start | 01/2022 |
End | 01/2024 |
Description | IDERT: Intelligent Deimmunization for Enzyme Replacement Therapies |
Amount | £616,358 (GBP) |
Funding ID | EP/Y01913X/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 09/2023 |
End | 04/2025 |
Title | Transcriptome-wide meta-analysis of codon usage in Escherichia coli |
Description | Data generated by the CUBseq pipeline on Escherichia coli RNA-seq data. |
Type Of Material | Database/Collection of data |
Year Produced | 2023 |
Provided To Others? | Yes |
URL | https://zenodo.org/record/8305119 |
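As an illustration of the kind of quantity a codon usage analysis works with, the sketch below counts codons in a single coding sequence and converts the counts to relative frequencies. It is a generic example and does not reproduce the CUBseq pipeline, whose implementation is not described in this record; the function names and the toy sequence are hypothetical.

```python
from collections import Counter


def codon_counts(cds):
    """Count in-frame codons in a coding sequence (trailing partial codon ignored)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def codon_frequencies(cds):
    """Relative frequency of each codon observed in the coding sequence."""
    counts = codon_counts(cds)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


# Toy sequence for illustration only: ATG AAA AAG AAA TAA
toy_counts = codon_counts("ATGAAAAAGAAATAA")
toy_freqs = codon_frequencies("ATGAAAAAGAAATAA")
```

A transcriptome-wide analysis would aggregate such counts over many genes and samples, typically weighting them by expression level; that aggregation is outside the scope of this sketch.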
Description | Edinburgh Kidney research initiative |
Organisation | University of Edinburgh |
Department | Renal Medicine Edinburgh |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | I have been invited to join a network of researchers and clinicians at the University of Edinburgh who work on renal diseases, including Fabry disease. My role is to promote the use of AI and engineering biology in drug discovery and to engage with patients to make them aware of the progress enabled by these technologies. The collaboration is very productive and we are working on a joint UKRI proposal. |
Collaborator Contribution | The partner provides expertise on the clinical implications of my fellowship work, and has allowed me to connect my research with patients in the clinic. |
Impact | The collaboration is interdisciplinary since it involves work with clinicians. |
Start Year | 2023 |
Title | PROTON: PROtein engineering by TempOral convolutional Networks |
Description | PROtein engineering by TempOral convolutional Networks (PROTON) is deep learning software for designing protein libraries from the sequence information of protein families. PROTON implements a generative model, called Temporal Dirichlet Variational Auto Encoder (TDVAE), which maps a protein family design space into a discrete mathematical space and uses temporal convolution to output new, unseen protein sequences. The software offers two design options: prior sampling design, which generates sequences using information learned from the entire protein family, and posterior sampling design, which generates variants of a user-defined protein. PROTON can perform biochemical characterisation of the designed sequences, and can rank and prioritise sequences for downstream experimental testing using two new analyses, namely coverage and confidence analysis: the former estimates the amount of data supporting the predicted amino acid, the latter estimates how confident the model is about its prediction. PROTON can also optimise the training process by performing sequence clustering, and can similarly create highly diverse protein libraries using sequence clustering methods such as MMseqs2, as shown in our preprint; this step is optional, and MMseqs2 can be replaced by any other clustering software. PROTON is designed to work in high-performance computing environments and exploits parallelism to minimise the computational burden. PROTON is licensed through the TTO at the University of Edinburgh under the new technology disclosure "TEC1104509 - PROTON: PROtein engineering by TempOral convolutional Networks". |
Type Of Technology | New/Improved Technique/Technology |
Year Produced | 2023 |
Impact | PROTON is enabling the |
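The idea behind a coverage-style analysis (how much family data supports each designed residue) can be sketched with a simple per-position support score. This record does not define how PROTON computes coverage, so the definition below is an assumption made for illustration: for each position, the fraction of aligned family sequences that carry the same residue as the designed sequence. The function name and the toy alignment are hypothetical.

```python
def column_support(alignment, design):
    """Per-position fraction of aligned sequences matching the designed residue.

    alignment: list of equal-length aligned sequences ('-' marks a gap).
    design: designed sequence of the same aligned length.
    Gaps never count as support. This is an assumed, simplified score,
    not PROTON's actual coverage analysis.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    if any(len(seq) != len(design) for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    n = len(alignment)
    return [
        sum(1 for seq in alignment if res != "-" and seq[i] == res) / n
        for i, res in enumerate(design)
    ]


# Toy alignment for illustration only.
toy_alignment = ["MKA", "MRA", "M-A", "MKG"]
toy_support = column_support(toy_alignment, "MKA")
```

Positions with low support are those where the design departs from what the family data shows, which is one plausible reason to deprioritise a sequence for experimental testing.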
Description | Patients' Engagement, QMRI, University of Edinburgh. |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Patients, carers and/or patient groups |
Results and Impact | 40 people attended the Edinburgh Kidney PPI event at QMRI, which focused on showing the use of AI in drug discovery for rare diseases. The activity helped shift the widespread negative opinion that patients had about AI, by explaining that AI is an assistive tool that helps scientists rapidly identify new potential treatments. |
Year(s) Of Engagement Activity | 2023 |
URL | https://edinburghkidney.co.uk |
Description | Patients' Engagement, School of Biological Sciences, University of Edinburgh. |
Form Of Engagement Activity | Participation in an open day or visit at my research institution |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Public/other audiences |
Results and Impact | I organised a lab visit for a family of Fabry patients, who got in touch with me to learn more about my research work. The visit was organised as follows: 1. A talk describing the work of my research group, our long-term goals, and how we use AI and engineering biology to achieve them. 2. A Q&A session to gather patients' feedback and views. 3. A visit to my laboratory and the Edinburgh Genome Foundry (EGF). |
Year(s) Of Engagement Activity | 2024 |