Improving the penicillin fermentation by modelling and optimising its metabolic network and transporterome

Lead Research Organisation: University of Liverpool
Department Name: Institute of Integrative Biology

Abstract

Penicillin was famously (re)discovered by Alexander Fleming in 1928, and developed into an initial fermentation process by Florey, Chain and Heatley during the early 1940s. Despite increasing antimicrobial resistance, it remains one of the UK fermentation industry's top products (often combined with the penicillinase inhibitor clavulanic acid in products such as Augmentin). However, in terms of the efficiency of conversion of carbon in the sugar feedstock to carbon in the penicillin product it is a lousy fermentation as this efficiency is no more than 10%. Many other fermentations, such as that producing monosodium glutamate, have a carbon conversion efficiency approaching 100%. Consequently, there is much room for improvement, and for making the penicillin process more competitive economically. As with the design of engineering artefacts such as the Boeing 777, what is needed is a mathematical model of the metabolic network of the penicillin producer, P. chrysogenum. The extent to which each gene is expressed tells us what is going on, and the full set of these expression levels is known as the transcriptome. To this end, transcriptome data FROM THE PRODUCTION STRAIN ITSELF will be made available by GSK. From the (known) genome sequence we can produce a computer version of the metabolic network for analysis. Importantly, the media used for the penicillin process are fully defined, which makes it feasible to do this modelling.

Having produced the model, we can predict, initially qualitatively, what molecules it will produce, and these will be measured experimentally on extracts of cells and medium provided by GSK. The combination of the model and the transcriptome allows us to calculate all the fluxes, both to product and to non-profitable places. This will help determine which changes in the genetic make-up of the Penicillium fungi are most likely to lead to a higher carbon conversion efficiency.

These changes will be made (by GSK) and tested on the new production strains. The ability to do these analyses on cell extracts and inside a computer means that while we have access to the data we neither have nor need access to the proprietary production strains themselves.

Importantly, we shall curate all of the data in a suitable database.

Technical Summary

Penicillin was famously (re)discovered by Alexander Fleming in 1928, and developed into an initial fermentation process by Florey, Chain and Heatley during the early 1940s. It remains one of the UK fermentation industry's top products. However, in terms of the efficiency of conversion of carbon in the sugar feedstock to carbon in the penicillin product (<10%) it is a lousy fermentation. Many others, such as that producing monosodium glutamate, have a carbon conversion efficiency approaching 100%. Consequently, there is much room for improvement, and for making the penicillin process more competitive economically. As with the design of engineering artefacts such as the Boeing 777, what is needed is a mathematical model of the metabolic network of the penicillin producer, P. chrysogenum. The genome sequence is available, and the genes in the production strain are the same; they just differ in expression. To this end, transcriptome data FROM THE PRODUCTION STRAIN ITSELF will be made available by GSK. From this we shall produce a computer version of the metabolic network for analysis. Importantly, the media used for the penicillin process are fully defined, which makes it feasible to do this modelling.
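The carbon conversion efficiency referred to here is the fraction of carbon atoms in the consumed sugar that ends up in the product. A minimal sketch of that arithmetic follows. It assumes, purely for illustration, that the product is penicillin G (C16H18N2O4S, 16 carbons) and the sugar is glucose (C6H12O6, 6 carbons); the molar amounts are hypothetical, not process data, and carbon supplied by any other feed component is ignored.

```python
def carbon_conversion_efficiency(product_mol, product_carbons,
                                 substrate_mol, substrate_carbons):
    """Return the fraction of substrate carbon atoms recovered in the product."""
    carbon_in = substrate_mol * substrate_carbons
    if carbon_in <= 0:
        raise ValueError("substrate carbon must be positive")
    carbon_out = product_mol * product_carbons
    return carbon_out / carbon_in


# Assumed carbon counts (illustrative choice of product and sugar).
PENICILLIN_G_CARBONS = 16  # C16H18N2O4S
GLUCOSE_CARBONS = 6        # C6H12O6

if __name__ == "__main__":
    # Hypothetical batch: 100 mol glucose consumed, 3 mol penicillin G formed.
    efficiency = carbon_conversion_efficiency(
        product_mol=3.0,
        product_carbons=PENICILLIN_G_CARBONS,
        substrate_mol=100.0,
        substrate_carbons=GLUCOSE_CARBONS,
    )
    print(f"carbon conversion efficiency: {efficiency:.1%}")
```

Working in moles of carbon rather than grams of product keeps the figure comparable across fermentations with very different products, which is what the comparison with monosodium glutamate relies on.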

From the model, we can predict, initially qualitatively, what molecules it will produce, and these will be measured experimentally on extracts of cells and medium provided by GSK. The combination of the model and the transcriptome allows us to calculate all the fluxes, both to product and to non-profitable places. This will determine which changes in the genetic make-up of the Penicillium fungi are most likely to lead to a higher carbon conversion efficiency.

These changes will be made (by GSK) and tested on the new production strains. The use of cell extracts and modelling means that while we have access to the data we do not need access to the proprietary production strains themselves.

Importantly, we shall curate all of the data in a suitable database.

Planned Impact

WHO WILL BENEFIT: The collaborating company will benefit in a number of ways, by (i) gaining access to a full metabolic network model of a producer organism and the predicted fluxes through every node, (ii) understanding which transporters are involved in the main fluxes of substrates and products, (iii) eventually gaining significant economic leverage via improved product titres and carbon conversion efficiencies.

So far as industrial biotechnology more generally is concerned, companies will benefit from knowing what our approaches to predictive metabolic engineering can deliver.

HOW WILL THEY BENEFIT: As is our practice, all pertinent data are made available via the Web, and OA publishing has long been our norm. We also hold frequent workshops in Manchester to assist dissemination of research results. We have been pioneers in the use of altmetrics for digital dissemination: indeed, in a recent Nature article (Altmetrics make their mark. Nature 2013; 500:491-492) Kwok highlighted the fact that the PI's paper Hull D, Pettifer SR, Kell DB: Defrosting the digital library: bibliographic tools for the next generation web. PLoS Comput Biol 2008; 4:e1000204, was the most accessed ever in any PLoS journal, with over 53,000 accesses (it is well past 100,000 now). We shall work closely with University KT staff and industrial IP offices (UMIP in Manchester) to agree a mutually beneficial contract as part of this project. Finally, having secured IP, we shall, of course, seek actively to communicate our scientific findings to the wider research community through scientific meetings, scholarly publications and press releases.

THE WIDER COMMUNITY: DBK is also a well known blogger and tweeter, and social media will provide a novel and useful means of disseminating our findings.

COMMUNICATIONS: We will communicate with relevant industrial partners both directly and via the meetings of relevant learned societies (we are members of several). In year three of the Project, we will organise a half-day meeting to explain our research to interested industrial scientists. We will also provide a video link to facilitate the participation of those who are unable to travel to Manchester.

Publications

Description A code-free approach for modelling whole-genome metabolic networks was developed. Ten candidate genes for improving production of the target compound were identified and sent to the GSK team for experimental evaluation.
Flux balance analysis (FBA) and related techniques are among the most widely used modelling techniques for whole-genome metabolic networks. They make it possible to identify bottlenecks in the production of target compounds by microorganisms in biotechnology. One of the key obstacles to the wider use of modelling in biotechnology is that it requires coding skills in environments such as Python or Matlab. We have developed a set of nodes for the scientific workflow execution platform KNIME that allow code-free use of FBA and the combination of its results with experimental data. The utility of the approach was demonstrated by combining metabolomics and transcriptomics data with FBA to identify a set of candidate genes for improving production of the target compound. The ten best candidates are under experimental evaluation by the GSK team.
Exploitation Route Code-free metabolic network analysis and simulation of target compound production in KNIME make it possible to introduce in silico experiments as a routine step in biotechnology, as an element of experimental design, data analysis and strain/media/process optimisation.
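For readers unfamiliar with what an FBA calculation involves, the following is a minimal sketch of the kind of script that a code-free workflow is meant to replace. It is not the KNIME implementation and not the P. chrysogenum model: the four-reaction network, its bounds and all names are hypothetical, and the linear programme is solved with SciPy's `linprog`. FBA maximises one flux subject to the steady-state condition S·v = 0 and lower and upper bounds on each flux.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (illustration only):
#   R0: feed -> A        uptake, capped at 10 units
#   R1: A -> B           enzyme step, capped at 4 units
#   R2: B -> product     export of the target compound
#   R3: A -> byproduct   overflow export
# Rows are the internal metabolites A and B; columns are R0..R3.
STOICHIOMETRY = np.array([
    [1.0, -1.0, 0.0, -1.0],  # A
    [0.0, 1.0, -1.0, 0.0],   # B
])
BOUNDS = [(0.0, 10.0), (0.0, 4.0), (0.0, 1000.0), (0.0, 1000.0)]
PRODUCT = 2  # column index of the product export reaction


def run_fba(stoichiometry, bounds, objective_index):
    """Maximise one reaction flux subject to S.v = 0 and the flux bounds."""
    n_metabolites, n_reactions = stoichiometry.shape
    cost = np.zeros(n_reactions)
    cost[objective_index] = -1.0  # linprog minimises, so negate to maximise
    result = linprog(
        cost,
        A_eq=stoichiometry,
        b_eq=np.zeros(n_metabolites),
        bounds=bounds,
    )
    if not result.success:
        raise RuntimeError(f"FBA failed: {result.message}")
    return result.x


if __name__ == "__main__":
    fluxes = run_fba(STOICHIOMETRY, BOUNDS, PRODUCT)
    print("product flux with R1 capped:", fluxes[PRODUCT])

    # Relax the cap on R1 to see whether that step limits production.
    relaxed = list(BOUNDS)
    relaxed[1] = (0.0, 1000.0)
    fluxes = run_fba(STOICHIOMETRY, relaxed, PRODUCT)
    print("product flux with R1 relaxed:", fluxes[PRODUCT])
```

Comparing the optimum before and after relaxing a single bound is one simple way to flag a bottleneck: in this toy case the capped step R1 limits the product flux, whereas the uptake step does not until R1 is relaxed.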
Sectors Pharmaceuticals and Medical Biotechnology

URL https://github.com/lptolik/KNIME_FBA
 
Description The KNIME environment and FBA components for simulating the behaviour of the whole-genome metabolic network were installed at the site of the industrial partner, GSK. The set of ten best candidates for target compound production is under experimental evaluation there as well. Recent progress in the speed and availability of sequencing has caused exponential growth in the number of species and strains for which the whole genome sequence is known. This information allows scientists to identify the physiological and metabolic abilities of a particular organism, which could help in the development of new biotechnological super-producing strains, or in understanding the ways in which those organisms interact with humans or animals. The most widely used technique for modelling microorganism physiology from a whole-genome sequence is flux balance analysis (FBA). We have developed a code-free approach to using FBA in in silico experiments within the scientific workflow execution platform KNIME. This allows scientists with no programming experience to combine multiple -omics data, such as genomics (in the form of a whole-genome metabolic network), transcriptomics (activity of individual genes/enzymes) and metabolomics (concentration of particular compounds in the media or within the cell), with advanced modelling techniques such as flux variability analysis (FVA). This would make in silico experiments routine in microbiology and biotechnology.
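To make the reference to flux variability analysis concrete, here is a minimal sketch of the calculation on a hypothetical four-reaction network. It is not the KNIME nodes and not the production-strain model; all names and bounds are invented for illustration, and SciPy's `linprog` is used as the solver. FVA first finds the best achievable objective flux, then minimises and maximises every reaction in turn while holding the objective at a chosen fraction of that optimum.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (illustration only):
#   R0: feed -> A, R1: A -> B, R2: B -> product, R3: A -> byproduct
TOY_S = np.array([
    [1.0, -1.0, 0.0, -1.0],  # metabolite A
    [0.0, 1.0, -1.0, 0.0],   # metabolite B
])
TOY_BOUNDS = [(0.0, 10.0), (0.0, 4.0), (0.0, 1000.0), (0.0, 1000.0)]
TOY_OBJECTIVE = 2  # column index of the product export reaction


def flux_variability(stoichiometry, bounds, objective_index, fraction=1.0):
    """Return (min, max) flux per reaction with the objective held near its optimum."""
    n_metabolites, n_reactions = stoichiometry.shape
    b_eq = np.zeros(n_metabolites)

    # Step 1: find the best achievable objective flux.
    cost = np.zeros(n_reactions)
    cost[objective_index] = -1.0  # linprog minimises, so negate to maximise
    best = linprog(cost, A_eq=stoichiometry, b_eq=b_eq, bounds=bounds)
    if not best.success:
        raise RuntimeError(f"objective optimisation failed: {best.message}")
    optimum = -best.fun

    # Step 2: require v_objective >= fraction * optimum,
    # written in linprog's form as -v_objective <= -fraction * optimum.
    a_ub = np.zeros((1, n_reactions))
    a_ub[0, objective_index] = -1.0
    b_ub = np.array([-fraction * optimum])

    ranges = []
    for reaction in range(n_reactions):
        limits = []
        for sign in (1.0, -1.0):  # +1 minimises the flux, -1 maximises it
            cost = np.zeros(n_reactions)
            cost[reaction] = sign
            result = linprog(cost, A_ub=a_ub, b_ub=b_ub,
                             A_eq=stoichiometry, b_eq=b_eq, bounds=bounds)
            if not result.success:
                raise RuntimeError(f"FVA failed: {result.message}")
            limits.append(sign * result.fun)
        ranges.append((limits[0], limits[1]))
    return ranges


if __name__ == "__main__":
    for index, (low, high) in enumerate(
            flux_variability(TOY_S, TOY_BOUNDS, TOY_OBJECTIVE)):
        print(f"R{index}: min={low:.3f} max={high:.3f}")
```

Reactions whose range collapses to a single value are pinned by the objective, while wide ranges show where the network has slack. One common use of expression or metabolite data is to tighten the bounds before such ranges are computed.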
First Year Of Impact 2021
Sector Pharmaceuticals and Medical Biotechnology
Impact Types Societal