codABLE: Data-driven optimised protein production using Pichia pastoris

Lead Participant: INGENZA LIMITED

Abstract

Our project addresses the critical need for viable, sustainable biomanufacturing processes and accelerated development of protein therapeutics. It emphasises the importance of predictability when designing DNA sequences that efficiently produce valuable heterologous proteins in recombinant organisms. Leveraging insights from natural codon usage profiles, we developed codABLE, a machine learning-based platform that customises the codon composition of a gene to be most compatible with the production host. CodABLE outperforms the codon optimisation algorithms operated by commercial DNA synthesis providers and offers a unique advantage: it controls protein expression without altering regulatory regions such as promoters or ribosome binding sequences.

Initially applied to _Bacillus subtilis_, codABLE demonstrated repeated success in predicting and enhancing protein expression. Now, we aim to extend this algorithm to _Pichia pastoris_, a more complex eukaryotic organism with particular advantages for use in protein biomanufacturing.

Expression data from a large and diverse library of gene variants will be collected by fluorescence-activated cell sorting (FACS) and next-generation sequencing to establish a genotype-phenotype relationship. These data will be fed into our machine learning platform, which incorporates algorithms such as support vector machines and random forests, to discern key relationships between codon usage and protein expression.
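As a rough illustration of how such a genotype-phenotype dataset might be assembled, the sketch below turns a coding sequence into codon-frequency features and estimates an expression score for a variant from its sequencing read counts across FACS bins. It is a simplified example under stated assumptions, not the codABLE pipeline: the function names, the example sequence, the three-bin layout, and the bin fluorescence values are all hypothetical, and the read-weighted mean is only one common way to summarise sort-and-sequence data.

```python
from collections import Counter


def codon_frequencies(cds):
    """Return the relative frequency of each codon in an in-frame coding sequence."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    counts = Counter(codons)
    total = len(codons)
    return {codon: n / total for codon, n in counts.items()}


def sort_seq_score(bin_reads, bin_fluorescence):
    """Estimate expression as the read-weighted mean fluorescence across FACS bins.

    bin_reads: reads of one variant recovered from each sorted bin.
    bin_fluorescence: representative fluorescence of each bin (assumed known).
    """
    if len(bin_reads) != len(bin_fluorescence):
        raise ValueError("one fluorescence value is needed per bin")
    total = sum(bin_reads)
    if total == 0:
        raise ValueError("variant has no reads in any bin")
    return sum(r * f for r, f in zip(bin_reads, bin_fluorescence)) / total


# Hypothetical variant: a short in-frame sequence and its reads in three bins.
variant = "ATGGCTGCTAAATAA"
features = codon_frequencies(variant)   # model inputs (genotype)
score = sort_seq_score([10, 30, 60], [100.0, 1000.0, 10000.0])  # model target (phenotype)
```

Pairs of `features` and `score` collected over the whole variant library would form the training table on which classifiers or regressors such as support vector machines and random forests could then be compared.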

The best-performing algorithm will be used to design DNA sequences to express protein targets of commercial value, serving as both model validation and a compelling solution for those seeking innovative protein expression strategies. Our approach combines cutting-edge computational technology with Ingenza's expertise in diverse microbial hosts and ultra-high-throughput FACS screening to increase our business competitiveness and contribute to Scotland's bio-based manufacturing innovation.

Lead Participant: INGENZA LIMITED

Project Cost: £126,595

Grant Offer: £99,883
