The iBAC genomic DNA expression library

Lead Research Organisation: University of Oxford
Department Name: Wellcome Trust Centre for Human Genetics

Abstract

The recent completion of projects to obtain the complete DNA sequence of all human genes, known as the human genome, has opened up a new era of biology. We are now able to propose a whole series of experiments that would not previously have been possible. The DNA sequence of our genes confirmed that the vast majority of our DNA sequence does not provide information to make proteins, but contains regulatory information to turn genes on and off, or to turn protein production up and down, in the right cells at the right time in development. This DNA sequence, called non-coding DNA, has previously been difficult to define and hard to study. For many years it has been possible to take a gene sequence out of a chromosome and use it to make, or 'express', a protein in cultured cells grown in the laboratory. However, until now it has not been easy to take a gene from a chromosome together with all of its regulatory elements. Inside cells the control of gene expression is complex, and if genes are used without their natural regulatory elements it is difficult to control when they make protein and how much they make. This project will make a collection of all genes, including for the first time their regulatory sequences, in a form that can be delivered to cells to make proteins under correct regulation. This has not been possible before because the genetic sequence of the regulatory regions is much larger than the gene itself, sometimes ten times as large, and working with large pieces of DNA for gene expression is difficult. One way scientists deliver genes to cells is by using viruses, which have evolved as powerful gene-delivery machines. However, most viruses are too small to carry the large pieces of DNA required when the regulatory elements are included. New advances in viral delivery methods using herpes simplex virus, a large DNA virus, now allow us to deliver genes with all their regulatory elements.
The virus has been engineered into a gene delivery vector, which means all the disease-causing viral DNA has been removed, leaving only the bare minimum required for gene delivery. Now, the viral vector can carry human genes with all their large regulatory elements and deliver them efficiently to cultured cells in the laboratory. This breakthrough was originally made by Dr Wade-Martins in 2001 and the system is being continuously improved in his laboratory at Oxford University. In this project Dr Wade-Martins is teaming up with scientists from the Sanger Institute, at Hinxton just outside Cambridge, one of the largest centres for genetic studies in the world. Together they will be making a complete collection of all mouse genes, including their regulatory elements, in a format which can be used to look for new genes based on their function. These tests can take place in either mouse or human cells because the DNA sequences of humans and mice are extremely similar. The collection of genes is called a library, and will contain all of the approximately 30,000 genes present in a mouse. The collection will be made available to the scientific community and will be an extremely valuable new resource. We will test the library by looking for genes involved in a disease called Fanconi's anaemia, a severe disease which causes cancer and bone-marrow failure. Previous attempts to find some of the genes for this disease have failed, and we believe our new approach has a high chance of success. We will also use the library to look for genes involved in the control of stem cells. Stem cells are cells which can become any cell type, and understanding how they are regulated by genes to become, for example, a brain cell, muscle cell or liver cell is one of the key questions in biology at the moment. Overall, this is an exciting new project bringing together two laboratories with similar interests in an area of research in which the UK is a world leader.

Technical Summary

The sequences of the human and mouse genomes offer an opportunity to understand and exploit genomic DNA. In particular, an efficient gene delivery system using genomic DNA loci, with expression driven by the native promoter, flanked by regulatory regions and including introns, would be extremely valuable. However, current vector systems usually employ strong heterologous promoters to drive gene expression. Bacterial artificial chromosomes (BACs) have proved to be excellent cloning systems for manipulating large genomic DNA inserts, but the use of BAC vectors in gene expression studies is hampered by the difficulty of transferring intact sequences of genomic DNA >100 kb into cells. Infectious vectors are an efficient means of delivering genes to cells, but the size of most genomic loci generally precludes their use in the context of viral constructs. One of us (RWM) has recently pioneered an efficient expression system for genomic loci >100 kb, termed the infectious BAC, or iBAC. The iBAC is based on the herpes simplex virus type 1 (HSV-1) amplicon vector. HSV-1 amplicons are excellent tools for infectious genomic DNA locus delivery because (i) HSV-1 has a high vector capacity of ~165 kb; (ii) high-titre amplicon stocks can be produced free from viral gene contamination by a helper virus-free packaging system; and (iii) the resulting virion particles have a broad cell tropism across a wide range of species. RWM's laboratory has now expressed six genomic loci from iBAC vectors. We propose here a collaborative project to develop a new genomic DNA expression library based on iBAC technology for screening the whole genome for genes based on functional assays.
The iBAC library vector is based on pBACe3.6, and includes: (i) the HSV-1 oriS and pac sequences to enable vector packaging; (ii) the EGFP reporter gene to track vector delivery; and (iii) the EBNA-1/oriP episomal retention system from Epstein-Barr virus and hygromycin resistance for long-term vector retention in selected clones. The iBAC library has now been constructed from mouse C3H DNA and contains 184,320 clones. Analysis of end-sequence data of 1,030 sampled clones showed the average insert size to be ~140 kb, optimising use of the high vector capacity, and providing 8.5-fold genome coverage. The first stage of iBAC library validation will be to screen the end-sequence library data in silico to obtain clones covering specific loci, and test them for function in established cellular assays. This ability to screen the iBAC library end-sequence database will immediately make the iBAC library a valuable resource to the research community. A minimum tiling path of iBACs will be defined from end-sequence data so as to cover the entire genome in the smallest number of overlapping clones. The path will be selected to minimise the number of clones it contains while also maximising the number of whole genes included within at least one clone. We will then focus on the application of established vector technology to packaging the iBAC library. We will produce the minimum tiling-path iBAC library in two formats for screening: a pooled library and an arrayed library. The pooled library can be used to infect cells and select for a desired phenotype, for example, by fluorescence-activated cell sorting, or cell survival. Arrayed libraries are more time-consuming to screen, but provide a much more powerful platform owing to the diversity of assays possible in a multi-well dish. Finally, we will undertake two genetic screens with the iBAC library.
First, we will use a pooled library approach to identify novel Fanconi's anaemia genes; and second, we will screen a portion of the arrayed library for genes involved in the maintenance of stem cell pluripotency. The screens will take place in ES cells and lymphoblastoid cell lines, respectively, both cell types for which the conditions for HSV-1 amplicon vector delivery have been optimised in preliminary work.
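
The minimum tiling path described above can be illustrated with a short sketch. It assumes that each clone's end sequences have already been mapped to a reference genome, so that every clone has a start and end coordinate on one chromosome. The Clone type, the clone names and the coordinates are hypothetical and are not taken from the library itself. The sketch uses the standard greedy interval-cover approach, which yields a cover with the fewest clones; it does not implement the second criterion of maximising whole genes per clone, and it is not necessarily the method used in the project.

```python
from typing import List, NamedTuple, Optional


class Clone(NamedTuple):
    """A mapped clone (hypothetical record; coordinates are half-open, in bp)."""
    name: str
    start: int
    end: int


def minimum_tiling_path(
    clones: List[Clone], region_start: int, region_end: int
) -> Optional[List[Clone]]:
    """Greedy interval cover of [region_start, region_end).

    Returns the smallest list of overlapping clones that covers the
    region, or None if the clones leave a gap.
    """
    ordered = sorted(clones, key=lambda c: c.start)
    path: List[Clone] = []
    covered_to = region_start
    i = 0
    while covered_to < region_end:
        best: Optional[Clone] = None
        # Among clones that start at or before the covered position,
        # keep the one that extends furthest to the right.
        while i < len(ordered) and ordered[i].start <= covered_to:
            if best is None or ordered[i].end > best.end:
                best = ordered[i]
            i += 1
        if best is None or best.end <= covered_to:
            return None  # no clone extends coverage: gap in the map
        path.append(best)
        covered_to = best.end
    return path


# Hypothetical example: five overlapping clones on a 360 kb region.
example_clones = [
    Clone("A", 0, 140_000),
    Clone("B", 60_000, 200_000),
    Clone("C", 100_000, 240_000),
    Clone("D", 220_000, 360_000),
    Clone("E", 230_000, 300_000),
]
example_path = minimum_tiling_path(example_clones, 0, 360_000)
print([c.name for c in example_path])
```

In this toy example, clones B and E are redundant and are left out of the path. A selection that also favours clones containing whole genes would need gene annotations as an additional input.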

Publications

Evers B (2010) A high-throughput pharmaceutical screen identifies compounds with specific toxicity against BRCA2-deficient tumors. in Clinical cancer research : an official journal of the American Association for Cancer Research

Lufino MM (2008) Advances in high-capacity extrachromosomal vector technology: episomal maintenance, vector delivery, and transgene expression. in Molecular therapy : the journal of the American Society of Gene Therapy

Lufino MM (2011) Episomal transgene expression in pluripotent stem cells. in Methods in molecular biology (Clifton, N.J.)

Markou A (2022) Molecular mechanisms governing aquaporin relocalisation. in Biochimica et biophysica acta. Biomembranes

Salman MM (2022) Recent breakthroughs and future directions in drugging aquaporins. in Trends in pharmacological sciences

Salman MM (2021) Advances in Applying Computer-Aided Drug Design for Neurodegenerative Diseases. in International journal of molecular sciences

 
Description We have generated, characterised and validated the first genomic DNA expression library suitable for genome-wide functional screens. Developing this resource has required bringing together two existing technologies: the high-capacity bacterial artificial chromosome (BAC) cloning system and the high-capacity herpes simplex virus type 1 (HSV-1) amplicon vector. The new technology, termed the infectious BAC, or iBAC, allows BAC clones to be delivered to cells for functional screens. Previously, iBAC methodology required the conversion of individual BAC library clones into HSV-1 amplicons, whereas now a full library of 184,000 clones will be available. The work has now been published and the project completed.
Exploitation Route The iBAC genomic DNA library will be suitable for use in functional genomics projects in an industry or biotech setting. The iBAC genomic DNA library will be made widely available to the scientific community.
Sectors Pharmaceuticals and Medical Biotechnology

 
Description The genomic DNA expression tool we developed as part of the BBSRC grant continues to be used by my laboratory. The original vector system on which it is based also has broad applicability. The library has been made available at the Wellcome Trust Sanger Institute.
First Year Of Impact 2015
 
Title The iBAC genomic DNA expression library 
Description The iBAC genomic DNA expression library has been constructed and will be made fully available to the scientific community. 
Type Of Material Model of mechanisms or symptoms - in vitro 
Year Produced 2009 
Provided To Others? Yes  
Impact The iBAC library as described is in use in my group and by our collaborators at the Wellcome Trust Sanger Institute. 
 
Description Functional genomics 
Organisation The Wellcome Trust Sanger Institute
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution We have worked together on the characterisation of BAC vector expression in functional studies.
Collaborator Contribution Providing expertise in functional genomics and bioinformatics
Impact The manuscript describing the work is in preparation
Start Year 2006