De novo sequencing of the Chinese Hamster Ovary (CHO) cell genome

Lead Research Organisation: University of Sheffield
Department Name: Chemical & Biological Engineering

Abstract

The engineering paradigm of measure, model, manipulate and manufacture underpins the design of products, processes and structures with reliable, predictable performance. The design process requires a detailed knowledge of what the interacting components are, how they interact and the forces (rules) that govern those interactions. This is why it was possible to send a man to the moon in 1969 (i.e. to predict functional performance based on known physical interactions) but not to cure cancer (unpredictability deriving from complex, unknown components and interactions). Accordingly, as we enter a new age of biological engineering, the extent to which it will be possible to engineer complex biological systems for human benefit will ultimately depend upon the extent of our knowledge of those systems - the rules that govern how the complex biological system functions - or malfunctions in the case of disease. To engineer any biological system effectively we need a basic blueprint - knowledge (or design principles) that helps us to understand specifically how that organism is functionally equipped. For biological engineers this primary information is an organism's complete DNA sequence (its genome). For simple organisms such as bacteria the genome is relatively simple - only about 6000 genes (functional genetic units) in Escherichia coli for example. In human cells there are over 30,000 genes and a large amount of 'non-coding' DNA involved in regulation of these genes. Using microbial genome sequence information, bioengineers can for the first time truly engage in the engineering design process. New ways of measuring and modelling the complexity of simple bacterial systems have emerged (this is 'systems biology'), which enable us to (genetically) manipulate cells and manufacture novel products and processes using new tools (this is 'synthetic biology').
Importantly, bioengineers can now predict the functional capability of simple bacteria growing in vitro using computer models. Similar approaches are now being developed for inherently more complex mammalian cells. This project is designed to provide a much needed genomic resource for academic and industrial bioscientists and bioengineers in the UK concerned with the production of a new generation of recombinant DNA derived medicines made by genetically engineered cells in culture - biopharmaceuticals. Biopharmaceuticals are proving to be revolutionary treatments for many serious diseases such as rheumatoid arthritis and a range of cancers. We want to determine the genome sequence of an extremely important type of 'cell factory' that is used to make these bio-medicines: the Chinese hamster ovary (CHO) cell. Most (60-70%) current biopharmaceuticals, as well as the vast majority of those in development, are made by genetically engineered CHO cells in culture. However, despite the huge industrial and scientific importance of this cell type, we still do not have the CHO cell's genome sequence, the fundamental informatic resource necessary to utilise new systems and synthetic biology tools to understand and engineer the function of this cell factory. To address this problem we have formed a consortium of the UK's leading academic groups involved in research into CHO cell based manufacturing systems, based at the Universities of Kent, Manchester and Sheffield, and four key industrial partners involved in biopharmaceutical manufacturing in the UK. In this project we will utilise the most advanced DNA sequencing technology available to rapidly sequence, assemble and annotate the CHO cell genome. We will establish a network to disseminate this information and to determine how we might most effectively harness this resource for future engineering strategies to improve CHO-cell based production processes.
This project is necessary for, and will lead to, cutting-edge applied research underpinning new biopharmaceutical manufacturing technology.

Technical Summary

We have formed a BRIC-based consortium of three leading academic groups and four bio-industrial companies concerned with biopharmaceutical production by mammalian cells in culture to sequence an organism of immense industrial importance, the Chinese hamster. In this project we will (i) utilise the most advanced technology available to sequence, assemble and annotate the Chinese hamster genome and (ii) create a BRIC-enabled network to discuss, disseminate and design new research that will utilise this important resource. This project will enable UK based scientists to compete globally with other groups (notably in the USA, Singapore and Europe) who are developing informatic resources for CHO cell based bioprocess development. We will outsource CHO genomic analysis to an established commercial service provider, Source Bioscience, who have considerable experience of large scale DNA sequencing contracts. For this project, Source Bioscience will provide access to the most rapid DNA sequencing platform available: the Illumina HiSeq 2000. The genome of a single Chinese hamster (approximately 3 x 10^9 bp) will be sequenced to a depth of 50x, sequencing with the paired-end approach at 100 bp read length. Bioinformatic assembly of the CHO genome will utilise a SIMD-accelerated assembly algorithm to assemble contiguous sequences, followed by in silico annotation based on BLAST searches. Assembled data will be returned in the GenBank format. We intend to employ a postdoctoral bioinformatician for six months to (i) liaise with the contract service provider, (ii) prepare CHO genomic information for dissemination within BRIC and (iii) organise networking events involving BRIC partners. This project is essential to maintain the long-term competitiveness of UK research in CHO cell based bioprocess development.
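
As a rough illustration of what the sequencing plan above implies, the following sketch estimates the raw data volume from the three figures quoted in the summary (genome size of about 3 x 10^9 bp, 50x depth, 100 bp paired-end reads). It uses the standard relation depth = (number of reads x read length) / genome size. The function and variable names are hypothetical, and the estimate assumes every base sequenced is usable; in practice quality filtering, duplicate reads and uneven coverage mean more raw data would be needed.

```python
def reads_required(genome_bp: int, depth: int, read_len_bp: int) -> int:
    """Estimate the number of reads needed for a target average depth.

    Rearranges depth = reads * read_len / genome_size, rounding up.
    Assumes all sequenced bases are usable (an idealisation).
    """
    total_bases = genome_bp * depth
    # Integer ceiling division avoids floating-point rounding on large values.
    return -(-total_bases // read_len_bp)


# Figures taken from the technical summary; the genome size is approximate.
GENOME_BP = 3 * 10**9   # ~3 x 10^9 bp Chinese hamster genome
DEPTH = 50              # target average coverage (50x)
READ_LEN_BP = 100       # read length for each end of a pair

total_bases = GENOME_BP * DEPTH
total_reads = reads_required(GENOME_BP, DEPTH, READ_LEN_BP)
# In paired-end sequencing each fragment yields two reads.
read_pairs = -(-total_reads // 2)

print(f"Total bases to sequence: {total_bases:,}")
print(f"Individual reads:        {total_reads:,}")
print(f"Read pairs (fragments):  {read_pairs:,}")
```

Under these assumptions the plan corresponds to 150,000,000,000 sequenced bases, that is 1,500,000,000 individual 100 bp reads or 750,000,000 read pairs.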

Planned Impact

The major impacts of this research nationally and internationally, at both the academic and industrial levels, will be on the following: (1) those in the BRIC bioprocessing/scientific community with an interest in the use of Chinese hamster ovary cell expression systems and products derived using this expression system (across the whole process); (2) researchers generally in the field of bioprocessing; (3) those using Chinese hamster cell lines as model systems; (4) bioinformaticians and genome analysis researchers. The research will also impact on the wider research agenda, which will ultimately benefit academics, industrialists, the patient and the UK economy via the development of new methodology to provide recombinant protein based 'bio-drugs'. To ensure that the research findings that may impact on manufacturing, human health and public knowledge are distributed effectively we will work closely with our industrial colleagues in the BRIC network. Generation of this data set will also place members of BRIC and the UK bioprocessing sector in a much stronger position to maintain their current powerful position in relation to the biopharmaceuticals industry.

 
Description This project provided a much needed resource for academic and industrial bioscientists and bioengineers in the UK concerned with the production of a new generation of recombinant DNA derived medicines made by genetically engineered cells in bioreactors - biopharmaceuticals. Biopharmaceuticals are proving to be revolutionary treatments for many serious diseases such as rheumatoid arthritis and a range of cancers. Most new biopharmaceuticals are made using a particular type of mammalian cell "factory", the Chinese Hamster Ovary (CHO) cell. This cell works well because it can, unlike simple bacteria, make complex human proteins.
However, despite the huge industrial and scientific importance of the CHO cell, until last year we still did not have its basic genetic blueprint - the sequence of its "genome" (all its DNA code). This matters because we need the sequence to understand and engineer the function of this cell factory.
To address this problem we formed a consortium of the UK's leading academic groups involved in research into CHO cell based manufacturing systems based at the Universities of Kent, Manchester and Sheffield, and four key industrial partners involved in biopharmaceutical manufacturing in the UK. We utilised the most advanced DNA sequencing technology available to rapidly sequence, assemble and annotate the CHO cell genome. We also established a network to disseminate this information and to determine how we might most effectively harness this resource for future engineering strategies to improve CHO-cell based production processes.
We have succeeded in creating an internet-based resource that now allows University researchers and companies to access CHO cell DNA sequence data for all kinds of routine applications and we are in the process of showing people how to use it effectively through training days and information dissemination events. We believe that this resource will accelerate our ability to perform cutting-edge applied research underpinning new biopharmaceutical manufacturing technology. We already have examples of new collaborative projects between Universities and industry in the UK that will utilise this information.
Finally, given the importance of CHO cells, we are of course not the only group of researchers interested in sequencing the CHO cell genome. Throughout the world there are other groups rapidly developing similar resources. We are now able to collaborate with these groups, and by making our data publicly available we will join an international consortium at CHOgenome.org to maintain and extend the best possible bioinformatic tools to support biopharmaceutical manufacture by CHO cells.
Exploitation Route Results Exploitation and Knowledge Transfer
Data/sequences lodged in public access databases
Raw sequences for both the genomic and RNA-Seq data were submitted to the Sequence Read Archive (SRA) repository at the European Nucleotide Archive (www.ebi.ac.uk/ena/about/sra_submissions). For assembled sequence data an FTP directory was made available initially from TGAC servers, followed by a final submission by TGAC to the NCBI repository for genomes (www.ncbi.nlm.nih.gov/sites/entrez?db=genome).

Additional activities, completed during July 2012, included:
i. The addition of further CHO genome browser functions (such as a gene ontology search engine) to the TGAC web-resource.
ii. Transcriptome analysis (a quantitative analysis of annotated RNAs) by RNA-sequencing using RNA prepared from the same CHO cell line.
iii. Provision of a training workshop for BRIC company and academic members (July 24th, 2012) to cover sequence assembly and annotation, expression analysis by RNA-Seq and a hands-on session for users' own data analysis. All 24 available places were booked.
iv. Attendance at the BRIC dissemination event in October 2012 by senior TGAC personnel to demonstrate CHO genome resources.

4. Industrial collaborations
During the enabling grant period CHO genome networking meetings have contributed to both the design of new industrial CASE studentship projects and new project grant applications submitted to BRIC. For example,

DCJ Laboratory (Sheffield)
i. Three current industrial CASE projects started in 2011.
ii. Two new industrial CASE projects to start in 2012.
iii. Two BBSRC BRIC 2/2 project grant applications.
N.B. This resource has substantially contributed to an ongoing BBSRC BRIC grant in DCJ's lab (BB/K011197/1; Linking recombinant gene sequence to protein product manufacturability using CHO cell genomic resources), as well as numerous ongoing RCUK and direct industry funded studentships and research projects.

AJD Laboratory (Manchester)
i. A BBSRC BRIC 2/2 project grant application.
ii. A BBSRC project grant application currently under review.
iii. Three new industrial CASE projects to start in 2012.
iv. Two industrial CASE applications in progress.

CMS Laboratory (Kent)
i. Two new BBSRC industrial CASE studentships started September 2011.
ii. A BBSRC BRIC studentship to start September 2012.
iii. Underpins a BBSRC sLoLa application currently awaiting final decision.
iv. An EU application in collaboration with academics and industrialists across Europe currently under consideration.
Sectors Manufacturing, including Industrial Biotechnology, Pharmaceuticals and Medical Biotechnology

 
Description 1. Provision of a new bioinformatic resource to the BBSRC-BRIC community and mechanisms (web browser, training opportunities) to access and exploit it. 2. New industry-academic partnerships potentiated by the availability of CHO genomic resources, e.g. new BBSRC project grant applications and industrial CASE studentships. 3. Ability to partner leading groups overseas involved in the development of CHO genomic tools and resources for the benefit of UK industry.
First Year Of Impact 2012
Sector Manufacturing, including Industrial Biotechnology, Pharmaceuticals and Medical Biotechnology
 
Description Direct funding from Biogen Idec
Amount £250,000 (GBP)
Organisation Biogen Idec 
Sector Private
Country United States
Start 07/2015 
End 07/2017
 
Description Direct funding from Lonza
Amount £217,000 (GBP)
Organisation Lonza Group 
Department Lonza Biologics
Sector Private
Country United States
Start 03/2017 
End 03/2019
 
Description Direct funding from MedImmune
Amount £1,100,000 (GBP)
Organisation AstraZeneca 
Department MedImmune
Sector Private
Country United Kingdom
Start 06/2015 
End 06/2020
 
Title CHO Genome Browser 
Description This BBSRC BRIC Enabling Grant was designed to create a specific informatic resource for the BRIC community, a CHO cell genomic sequence. This work was sub-contracted out to The Genome Analysis Centre (TGAC) in Norwich (www.tgac.ac.uk) who were responsible for genome sequencing and bioinformatic assembly and annotation operations. This service provider was specified by the BBSRC. CHO cell (ECACC CHO-K1, Cat. No. 85051005) DNA samples were provided to TGAC on 19/01/11 and 14/06/11. DNA was prepared from cells cultured at MedImmune (Cambridge) from in-house stocks of this cell line. The cell line was made available to BRIC industrial and academic members on demand. 
Type Of Material Database/Collection of data 
Year Produced 2011 
Provided To Others? Yes  
Impact Industrial collaborations: During the enabling grant period CHO genome networking meetings have contributed to both the design of new industrial CASE studentship projects and new project grant applications submitted to BRIC. For example,

DCJ Laboratory (Sheffield)
i. Three current industrial CASE projects started in 2011.
ii. Two new industrial CASE projects to start in 2012.
iii. Two BBSRC BRIC 2/2 project grant applications.

AJD Laboratory (Manchester)
i. A BBSRC BRIC 2/2 project grant application.
ii. A BBSRC project grant application currently under review.
iii. Three new industrial CASE projects to start in 2012.
iv. Two industrial CASE applications in progress.

CMS Laboratory (Kent)
i. Two new BBSRC industrial CASE studentships started September 2011.
ii. A BBSRC BRIC studentship to start September 2012.
iii. Underpins a BBSRC sLoLa application currently awaiting final decision.
iv. An EU application in collaboration with academics and industrialists across Europe currently under consideration.
URL http://projects.tgac.bbsrc.ac.uk/CHO/
 
Description Strategic partnership with Biogen 
Organisation Biogen Idec
Country United States 
Sector Private 
PI Contribution CHO cell engineering technology
Collaborator Contribution Project management, research materials, datasets
Impact Johari Y, Estes S, Alves C, James DC. (2015) Integrated cell and process engineering strategies for improved production of a difficult-to-express fusion protein by CHO cells. Biotechnology and Bioengineering. In press.
Start Year 2010
 
Description Strategic partnership with Lonza Biologics 
Organisation Lonza Group
Department Lonza Biologics
Country United States 
Sector Private 
PI Contribution Genetic vector and cell engineering technology development
Collaborator Contribution Project management, laboratory facilities, research materials.
Impact
Grainger RG, James DC (2013). Cell line specific control and prediction of recombinant monoclonal antibody glycosylation. Biotechnology and Bioengineering. 110: 2970-2983.
Davies SL, Lovelady CS, Grainger RK, Racher AJ, Young RJ, James DC. (2013) Functional heterogeneity and heritability in CHO cell populations. Biotechnology and Bioengineering 110: 260-274. Highlighted "Spotlight" paper.
McLeod J, O'Callaghan PM, Pybus LP, Wilkinson SJ, Root T, Racher AJ, James DC (2011) An empirical modeling platform to evaluate the relative control discrete CHO cell synthetic processes exert over recombinant monoclonal antibody production process titer. Biotechnology and Bioengineering. 108: 2193-2204.
Davies SL, McLeod J, O'Callaghan PM, Pybus LP, Sung YH, Wilkinson SJ, Rance J, Racher AJ, Young RJ, James DC. (2011) Impact of gene vector design on the control of recombinant monoclonal antibody production by CHO cells. Biotechnology Progress 27: 1689-1699.
O'Callaghan PM, MacLeod J, Pybus L, Lovelady CS, Wilkinson S, Racher AJ, Porter A, James DC. (2010) Cell line specific control of recombinant monoclonal antibody production by CHO cells. Biotechnology and Bioengineering. 106: 937-951.
Start Year 2006
 
Description Strategic partnership with MedImmune 
Organisation AstraZeneca
Department MedImmune
Country United Kingdom 
Sector Private 
PI Contribution Development of novel cell engineering technology
Collaborator Contribution Project management, laboratory facilities, research reagents and model systems
Impact
Pybus LP, Dean G, Slidel T, Hardman C, Smith A, Daramola O, Field R, James DC (2014) Predicting the expression of recombinant monoclonal antibodies in Chinese hamster ovary cells based on sequence features of the CDR3 domain. Biotechnology Progress 30: 188-197.
Pybus LP, Dean G, West NR, Smith A, Daramola O, Field R, Wilkinson SJ, James DC (2014) Model-directed engineering of "difficult-to-express" monoclonal antibody production by Chinese hamster ovary cells. Biotechnology and Bioengineering 111: 372-385. Highlighted "Spotlight" paper.
Thompson BC, Segarra CRJ, Mozley O, Daramola O, Field R, Levison PL, James DC. (2012) Cell line specific control of PEI-mediated transient transfection optimised with 'Design of Experiments' methodology. Biotechnology Progress. 28: 179-187.
Start Year 2007