Grid compatible data management for Directed Evolution Experiments

Lead Research Organisation: University of Southampton
Department Name: Sch of Chemistry

Abstract

Directed evolution is a technique in which scientists use the power of natural evolution to change the behaviour and activity of a protein in the laboratory. These experiments generate very large amounts of data, and in many cases a large proportion of this data is discarded or not made available to other scientists. This makes it very difficult to compare different experiments and to make them more efficient. We aim to develop a computer-based system that can store and integrate large amounts of data from these experiments and make it available to other scientists over a high-speed computer network called the GRID. This network will make it possible for scientists to share and examine data stored in many different places in a simple way. Our system will automatically take data from experiments and process it to make it available on the GRID. We will use this system in two experiments. In the first experiment, we are making a large library of mutant HIV enzymes. These will be tested for resistance to new drugs being developed in the chemistry department. By doing this, we will be able to test drugs for potential resistance problems before they start to be used. In the second experiment, we will take an enzyme that cleaves a specific type of polysaccharide and select variants with changed selectivity. The data from these experiments will be automatically processed by the system into a format that can be made available via the GRID and that allows other scientists to process the data in different ways. When we publish the results of these experiments, there will be links from the published paper all the way to the raw data.

Technical Summary

Directed evolution is a powerful experimental technique for modifying the activity of proteins. These experiments generate large quantities of data that are generally not made available and in many cases are discarded. Lack of access to these data contributes to the lack of a well-developed theoretical underpinning in this field. GRID-compatible data management systems provide a means of storing the raw data from experimental studies and making them available. We will develop such a system for directed evolution experiments, with the explicit aim of making it simple and useable while providing sufficient functionality to make it attractive as a tool for researchers in the area. The system will integrate and store raw data, information on library generation, and validation data, as well as links to other structural and functional data. It will then process these data and make the complete set of raw and processed data available via the GRID. We will use this system in two directed evolution experiments. In the first, a very large library of constructed mutants of HIV protease will be tested for activity in the presence of a range of inhibitors. These assays will be performed on partially purified protein in a microplate fluorescence format. In the second, we will screen neutral libraries of E. coli beta-glucuronidase, which we have previously generated, for activity against a range of other glycosides in a colorimetric colony plate assay. Thus we will apply the system in experiments that span the most widely used formats for directed evolution.
