The iBAC genomic DNA expression library

Lead Research Organisation: The Wellcome Trust Sanger Institute
Department Name: Research Directorate


The recent completion of projects to obtain the complete DNA sequence of all human genes, known as the human genome, has opened up a new era of biology. We are now able to propose a whole series of experiments that would not previously have been possible. The DNA sequence of our genes confirmed that the vast majority of our DNA sequence does not provide information to make proteins, but contains regulatory information to turn genes on and off, or to turn protein production up and down, in the right cells at the right time in development. This DNA sequence, called non-coding DNA, has previously been difficult to define and hard to study. For many years it has been possible to take a gene sequence out of a chromosome and use it to make, or 'express', a protein in cultured cells grown in the laboratory. However, it has not previously been easy to take a gene from a chromosome together with all of its regulatory elements. Inside cells the control of gene expression is complex, and if genes are used without their natural regulatory elements it is difficult to control when they make protein and how much they make. This project will make a collection of all genes, including for the first time their regulatory sequences, in a form in which they can be delivered to cells to make proteins under correct regulation. This has not been possible before because the genetic sequence of the regulatory regions is much larger than the gene itself, sometimes ten times as large, and working with large pieces of DNA for gene expression is difficult. One way scientists deliver genes to cells is using viruses, which have evolved as powerful gene-delivery machines. However, most viruses are too small to carry the large pieces of DNA required when the regulatory elements are included. New advances in viral delivery methods using herpes simplex virus, a large DNA virus, now allow us to deliver genes with all their regulatory elements.
The virus has been engineered into a gene delivery vector, which means all the disease-causing viral DNA has been removed, leaving only the bare minimum required for gene delivery. Now, the viral vector can carry human genes with all their large regulatory elements and deliver them efficiently to cultured cells in the laboratory. This breakthrough was originally made by Dr Wade-Martins in 2001 and the system is being continuously improved in his laboratory at Oxford University. In this project Dr Wade-Martins is teaming up with scientists from the Sanger Institute, at Hinxton just outside Cambridge, one of the largest centres for genetic studies in the world. Together they will be making a complete collection of all mouse genes, including their regulatory elements, in a format which can be used to look for new genes based on their function. These tests can take place in either mouse or human cells because the DNA sequences of humans and mice are extremely similar. The collection of genes is called a library, and will contain all the approximately 30,000 genes present in a mouse. The collection will be made available to the scientific community and will be an extremely valuable new resource. We will test the library by looking for genes involved in a disease called Fanconi's anaemia, a severe disease which causes cancer and bone-marrow failure. Previous attempts to find some of the genes for this disease have failed, and we believe our new approach has a high chance of success. We will also use the library to look for genes involved in the control of stem cells. Stem cells are cells which can become any cell type, and understanding how they are regulated by genes to become, for example, a brain cell, muscle cell or liver cell is one of the key questions in biology at the moment. Overall, this is an exciting new project bringing together two laboratories with similar interests in an area of research in which the UK is a world leader.

Technical Summary

The sequences of the human and mouse genomes offer an opportunity to understand and exploit genomic DNA. In particular, an efficient gene delivery system using genomic DNA loci with expression driven by the native promoter, flanked by regulatory regions and including introns, would be extremely valuable. However, current vector systems usually employ strong heterologous promoters to drive gene expression. Bacterial artificial chromosomes (BACs) have proved to be excellent cloning systems for manipulating large genomic DNA inserts, but the use of BAC vectors in gene expression studies is hampered by the difficulty of transferring intact sequences of genomic DNA >100 kb into cells. Infectious vectors are an efficient means of delivering genes to cells, but the size of most genomic loci generally precludes their use in the context of viral constructs. One of us (RWM) has recently pioneered an efficient expression system for genomic loci >100 kb, termed the infectious BAC, or iBAC. The iBAC is based on the Herpes simplex virus type 1 (HSV-1) amplicon vector. HSV-1 amplicons are excellent tools for infectious genomic DNA locus delivery because (i) HSV-1 has a high vector capacity of ~165 kb; (ii) high-titre amplicon stocks can be produced free from viral gene contamination by a helper virus-free packaging system; and (iii) the resulting virion particles have a broad cell tropism across a wide range of species. RWM's laboratory has now expressed six genomic loci from iBAC vectors. We propose here a collaborative project to develop a new genomic DNA expression library based on iBAC technology for screening the whole genome for genes based on functional assays.
The iBAC library vector is based on pBACe3.6, and includes: (i) the HSV-1 oriS and pac sequences to enable vector packaging; (ii) the EGFP reporter gene to track vector delivery; and, (iii) the EBNA-1/oriP episomal retention system from Epstein-Barr virus and hygromycin resistance for long term vector retention in selected clones. The iBAC library has now been constructed from mouse C3H DNA and contains 184,320 clones. Analysis of end-sequence data of 1030 sampled clones showed the average insert size to be ~140 kb, optimising use of the high vector capacity, and providing 8.5-fold genome coverage. The first stage of iBAC library validation will be to screen the end-sequence library data in silico to obtain clones covering specific loci, and test them for function in established cellular assays. This ability to screen the iBAC library end-sequence database will immediately make the iBAC library a valuable resource to the research community. A minimum tiling-path of iBACs will be defined from end-sequence data so as to cover the entire genome in the smallest number of overlapping clones. A tiling path will be selected to minimise the number of clones in the tiling path, but also to maximise the number of whole genes included within at least one clone. We will then focus on the application of established vector technology to packaging the iBAC library. We will produce the minimum tiling-path iBAC library in two formats for screening: a pooled library and an arrayed library. The pooled library can be used to infect cells and select for a desired phenotype, for example, by fluorescence activated cell sorting, or cell survival. Arrayed libraries are more time consuming to screen, but provide a much more powerful platform owing to the diversity of assays possible in a multi-well dish. Finally we will undertake two genetic screens with the iBAC library. 
First, we will use a pooled library approach to identify novel Fanconi's anaemia genes; and second, we will screen a portion of the arrayed library for genes involved in the maintenance of stem cell pluripotency. The screens will take place in ES cells and lymphoblastoid cell lines, respectively, both cell types for which the conditions for HSV-1 amplicon vector delivery have been optimised in preliminary work.
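
The coverage figure and the minimum tiling-path idea described above can be illustrated with a short sketch. This is not the project's actual pipeline. It assumes clones have already been placed on a single chromosome as half-open (start, end) intervals from their end sequences. The clone names, coordinates and the 3.0e9 bp genome size are hypothetical placeholders, not values from the text. The sketch uses the classic greedy interval-cover rule, which minimises the clone count but ignores the second stated objective of keeping whole genes inside at least one clone.

```python
from typing import List, Tuple

# (hypothetical clone name, start bp, end bp) on one chromosome, half-open
Clone = Tuple[str, int, int]


def fold_coverage(n_clones: int, mean_insert_bp: float, genome_bp: float) -> float:
    """Expected fold coverage: total cloned bases divided by genome size."""
    return n_clones * mean_insert_bp / genome_bp


def greedy_tiling_path(clones: List[Clone], region_start: int,
                       region_end: int) -> List[Clone]:
    """Cover [region_start, region_end) with as few clones as possible.

    At each step, among the clones that start at or before the position
    covered so far, take the one that extends furthest to the right.
    Raises ValueError if the clones leave a gap in the region.
    """
    ordered = sorted(clones, key=lambda c: c[1])
    path: List[Clone] = []
    pos = region_start
    i = 0
    while pos < region_end:
        best = None
        # Scan every not-yet-seen clone that starts inside the covered part.
        while i < len(ordered) and ordered[i][1] <= pos:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= pos:
            raise ValueError(f"no clone covers position {pos}")
        path.append(best)
        pos = best[2]
    return path


if __name__ == "__main__":
    # Clone count and mean insert size are from the text; the genome size
    # of 3.0e9 bp is an assumption made only for this illustration.
    print(round(fold_coverage(184_320, 140_000, 3.0e9), 2))

    # Toy coordinates, purely illustrative.
    toy = [("a", 0, 140), ("b", 50, 180), ("c", 100, 260),
           ("d", 250, 400), ("e", 130, 270)]
    print([name for name, _, _ in greedy_tiling_path(toy, 0, 400)])
```

With the assumed 3.0e9 bp genome the coverage comes out near the 8.5-fold reported above; a different genome-size estimate would shift it accordingly. A real tiling path would also need to handle multiple chromosomes, uncertain clone placement and the gene-containment criterion, for example by weighting clones that fully contain annotated genes.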


Description We have a library of iBACs. We are still working on using these for genetic screens.
Exploitation Route We are writing a paper. The library we made is available and others are using it.
Sectors Healthcare

Description We will soon publish our paper, leading to future outputs.
First Year Of Impact 2014
Sector Pharmaceuticals and Medical Biotechnology
Impact Types Policy & public services