Combined /omics approaches to understand and control library enriched microbial cell factories

Lead Research Organisation: University of Sheffield

Department Name: Chemical & Biological Engineering

Abstract

The understanding of dynamic biological behaviour needed for bioprocess development cannot be obtained solely from individual-level /omic studies, since each such study tells only part of the story. Therefore, we will implement an analytical technique, based on several different plasmid-based genomic libraries (from two bacteria, Escherichia coli and Campylobacter jejuni) expressed in E. coli, coupled with measurements at the microarray (messenger RNA level) and proteome (protein complement of the cell) scales, to understand and improve the secreted glycosylated protein production bioprocess. Production of these types of proteins is very important to the pharmaceuticals industry, since nearly three quarters of proteins with human therapeutic importance are glycosylated (either released or in clinical and preclinical development).
From the simplest to the most complex organisms, the process of transferring information from the genome to make proteins is universal and central to life. Understanding and quantifying this process is essential for scientific advancement. With such information it will become possible to manipulate organisms to achieve a desired biotechnological goal, such as production of proteins for medicinal purposes, and the replacement of synthetic chemicals.
A genome sequence is the code for programming the way an organism functions. Sequencing the genome provides a database of information for identifying genes and assigning the potential function of these genes, and allows for comparison of similar genes across species. When genes switch on to start a biological function, a message is generated that eventually makes a protein. Experimental technologies that exploit this message information across thousands of genes have been developed, such as SCALEs (multi-Scale Analysis of Library Enrichment). This field of information is known as transcriptomics.
Transcriptomics alone, however, cannot be used to predict the dynamic biological behaviour needed for future biotechnology development, since this approach tells only part of the story. The information missing from SCALEs is how these gene messages are used. What is needed is an integrated study of the message from the genome with the production of proteins. In order to achieve this, we will also implement an analytical technique, similar to SCALEs, that will concentrate on the proteins rather than the genome. It is important to examine the protein complement of the organism (known as the proteome), because the observed physical health and behaviour of an organism is determined by the interaction of its genome with the environment, and this interaction is directly due to the proteins, rather than to the genome and its subsequent message (the transcriptome). Using our technique (called MLPPTM), which studies proteins, and experimental techniques such as SCALEs, which study the message from the genome, we will be able to provide an integrated study which generates a deeper knowledge of which proteins help give a cell certain properties. In this case, we seek to understand which proteins will give a cell an enhanced ability to generate glycosylated proteins (those with a linked oligosaccharide). This is important because the majority of proteins applied towards human health applications are glycoproteins. Bacteria have not generally been thought of as being able to produce these proteins, and so more complicated organisms (e.g. from mammals) have been used instead. Bacteria are simpler to understand and grow faster and more cheaply, and so would be very attractive if they could be designed to produce glycosylated proteins properly.
The integrated transcriptomic and proteomic techniques to be implemented here, which examine E. coli containing overexpression libraries, will allow us, when successful, to improve glycosylated protein production in a bacterium, and set the scene for future efficient bioprocesses for making therapeutic proteins.

Technical Summary

This project aims to apply a genome-wide, multiscale approach for functional genomics to improve the production of recombinant proteins in Escherichia coli, and to take this approach further to begin to understand how to improve the production of glycosylated proteins. We will integrate data obtained from DNA microarray inverse metabolic engineering tools such as SCALEs (multi-Scale Analysis of Library Enrichment) with data obtained from high-throughput quantitative shotgun proteomics methods (building on 8-plex isobaric mass tag technology, iTRAQ), since proteomics is a level closer to the functional understanding of a phenotype. We will analyse the data using a multivariate approach. We will then seek to move beyond a simple statement of whether the transcriptomic and proteomic data are concordant or discordant, and instead ask how they can be interpreted in the context of biological pathways, in particular those related to recombinant protein synthesis of the model glycoprotein. Implementing /omic-based tools and using the resulting data are necessary to provide a systems-level understanding of an organism, so that a deeper functional understanding allows bioprocess engineers to take advantage of findings in the biosciences and translate these into valuable processes and products for UK bioprocessing businesses. As an exemplar, we ultimately seek to improve the production in E. coli of glycosylated recombinant proteins such as the N-glycoprotein AcrA. This protein has been shown to be producible in E. coli following the transfer of the N-glycosylation system from Campylobacter jejuni into E. coli cells.
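As an illustration only, the "concordant or discordant" comparison mentioned above could be sketched as follows. This is a minimal sketch, not the project's analysis method: it assumes each gene has a log2 fold change at the transcript level and at the protein level, and the function name `classify_concordance`, the threshold of 1.0 and the gene labels are all hypothetical.

```python
from typing import Dict


def classify_concordance(
    mrna_log2fc: Dict[str, float],
    protein_log2fc: Dict[str, float],
    threshold: float = 1.0,  # hypothetical cut-off, not taken from the project
) -> Dict[str, str]:
    """Label each gene measured at both levels as concordant, discordant or unchanged."""

    def direction(value: float) -> int:
        # +1 = up-regulated, -1 = down-regulated, 0 = below the cut-off
        if value >= threshold:
            return 1
        if value <= -threshold:
            return -1
        return 0

    labels: Dict[str, str] = {}
    # Only genes quantified in both data sets can be compared.
    for gene in sorted(mrna_log2fc.keys() & protein_log2fc.keys()):
        m = direction(mrna_log2fc[gene])
        p = direction(protein_log2fc[gene])
        if m == 0 and p == 0:
            labels[gene] = "unchanged"
        elif m == p:
            labels[gene] = "concordant"
        else:
            labels[gene] = "discordant"
    return labels


# Hypothetical values, for illustration only (not project data).
example = classify_concordance(
    {"geneA": 2.0, "geneB": 1.5, "geneC": 0.1},
    {"geneA": 1.8, "geneB": -1.2, "geneC": 0.2, "geneD": 3.0},
)
print(example)
```

A labelling of this kind is only the "simple statement" the summary refers to; the genes in each class would then still need to be mapped onto biological pathways and examined with a multivariate method.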

Funded Value:

£297,941

Funded Period:

Mar 08 - Mar 11

Funder:

BBSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

BB/F004842/1

Principal Investigator:

Phillip Craig Wright

Research Subject:

Bioengineering (51%)

Omic sciences & technologies (32%)

Research Topic:

Biochemical engineering (17%)

Metabolic engineering (17%)

Protein engineering (17%)

Proteomics (16%)

Transcriptomics (16%)

Organisations

People	ORCID iD
Phillip Craig Wright (Principal Investigator)

Publications

Jaffé SR (2014) Escherichia coli as a glycoprotein production host: recent developments and challenges. in Current opinion in biotechnology

Jaffé SR (2015) Inverse Metabolic Engineering for Enhanced Glycoprotein Production in Escherichia coli. in Methods in molecular biology (Clifton, N.J.)

Pandhal J (2010) N-Linked glycoengineering for human therapeutic proteins in bacteria. in Biotechnology letters

Pandhal J (2011) Improving N-glycosylation efficiency in Escherichia coli using shotgun proteomics, metabolic network analysis, and selective reaction monitoring. in Biotechnology and bioengineering

Pandhal J (2012) Systematic metabolic engineering for improvement of glycosylation efficiency in Escherichia coli. in Biochemical and biophysical research communications

Pandhal J (2013) Inverse metabolic engineering to improve Escherichia coli as an N-glycosylation host. in Biotechnology and bioengineering

Parker JL (2014) Maf-dependent bacterial flagellin glycosylation occurs before chaperone binding and flagellar T3SS export. in Molecular microbiology

Phansopa C (2014) Structural and functional characterization of NanU, a novel high-affinity sialic acid-inducible binding protein of oral and gut-dwelling Bacteroidetes species. in The Biochemical journal

Strutton B (2017) Generation of Recombinant N-Linked Glycoproteins in E. coli. in Methods in molecular biology (Clifton, N.J.)

Key Findings

Description	We have determined factors that are involved in making specialist therapeutic proteins (drugs) in E. coli. We have used a range of different analytical tools to determine important aspects of cell metabolism that can be harnessed to improve the amount of a target protein and the form of this protein. This will be useful for the biopharmaceuticals sector.
Exploitation Route	In general, the tools can be used to optimise cellular metabolism to make protein products.
Sectors	Chemicals; Healthcare; Manufacturing, including Industrial Biotechnology

Further Funding

Description	Advanced Life Science Research Technology Initiative
Amount	£406,531 (GBP)
Funding ID	BB/M012166/1
Organisation	Biotechnology and Biological Sciences Research Council (BBSRC)
Sector	Public
Country	United Kingdom
Start


Description	BBSRC / EPSRC / Innovate UK IB Catalyst Round 1 Early Stage Translation
Amount	£301,286 (GBP)
Funding ID	BB/M018288/1
Organisation	Biotechnology and Biological Sciences Research Council (BBSRC)
Sector	Public
Country	United Kingdom
Start	01/2015
End	12/2018


Description	BBSRC BRIC2
Amount	£296,457 (GBP)
Funding ID	BB/K011200/1
Organisation	Biotechnology and Biological Sciences Research Council (BBSRC)
Sector	Public
Country	United Kingdom
Start	11/2013
End	10/2016


Description	Synthetic Biology IKC
Amount	£4,990,071 (GBP)
Funding ID	EP/L011573/1
Organisation	Engineering and Physical Sciences Research Council (EPSRC)
Sector	Public
Country	United Kingdom
Start
