Bioinformatics tools for plant genetic resources

Lead Research Organisation: University of Dundee
Department Name: College of Life Sciences

Abstract

Modern agriculture needs crop varieties with improved performance for the consumer (e.g. flavour, shape, texture etc) and the producer (e.g. high yield, resistance to pests), and reduced environmental impact (e.g. lower fertiliser or pesticide input). These developments are all possible using conventional breeding backed by modern biotechnology, without the need for genetically modified (GM) plants. These improved properties are found in 'genebanks', which are collections of thousands of plant samples taken from the wild or old crop varieties, together with the many cultivars resulting from decades of selective breeding around the World. The problem with harnessing this potentially useful biodiversity in future breeding programmes is working out which samples to use. The solution is to 'genetically fingerprint' every sample and take accurate measurements of all the useful properties mentioned above. These experiments can tell us in principle which plants are likely to carry potentially useful genes. However, this huge quantity of potentially useful information remains difficult to use (hundreds of measurements in thousands of samples means millions of data points), because improvements in computer databases to store, analyse and display the results have lagged behind our ability to do the lab experiments. This project proposes to bridge that gap by developing a powerful, versatile and accessible computer database and associated computational tools, which can be applied to data collected from crop plant genebanks, to identify promising plant samples for further experimental analysis. All of these computational resources will be freely available to the World's genetic resources community.

Technical Summary

Efficient utilisation of plant genetic resources (gene banks) requires versatile, powerful databases for storing, accessing and combining the wide variety of data that are becoming available in rapidly increasing amounts. We have developed a functioning database, GERMINATE, which can accommodate a wide variety of data types, from descriptive (morphology, geography) to molecular (DNA sequence, marker scores, map position etc.). We now seek funds to complete its development into an integrated data and analytical resource for the World's plant genetic resource community. Currently, the GERMINATE database stores passport and multi-crop descriptor data for every popular molecular marker type except SNP. We propose to extend this capability to include SNP data, in a format that is acceptable to the World's plant genetic resources and genomics communities. We will also deploy an ontology module, which will provide standard nomenclatures for phenotypes, developmental stages and mutant or disease ontologies for crop plants, allowing rational searching for these previously inaccessible characters. Additionally, we will increase the functionality of GERMINATE by greatly expanding the number of linked, web-accessible bioinformatic tools, including the existing suites STRUCTURE (for deducing and visualising the population structure of germplasm), TASSEL (for tree drawing and linkage disequilibrium estimation) and DIVA-GIS (for visualising geographical data associated with accessions). Also, a new set of tools will be designed and developed, including GERMANE (managing workflows of multiple, chained analytical routines), CORE (for management of genetic resources, including identification of core collections in response to user requirements), and NETWORK (analysing non-treelike evolution via introgression, using a marker model-based approach). Lastly, the GERMINATE web interfaces will be improved to allow easier and more powerful uploading, retrieving and analysis of the data.

Publications

Description The main output of our project is the Germinate 2 databases that constitute the Germinate project (http://bioinf.scri.ac.uk/germinate). The PostgreSQL structure of the first Germinate database has been replaced by MySQL, allowing open source development by a larger user community. The database schema and visual interface scripts are fully publicly available (bioinf.scri.ac.uk/germinate/). Databases have been developed for pea, barley, potato and wheat to date (follow 'Projects' link at the above
Exploitation Route Databases for genetic resources (the genomics of biodiversity) are highly useful to any agency or company with an interest in the genetic basis of biodiversity. Our database structure has been adopted by other plant genetic resources institutes worldwide and by crop breeding companies.
Sectors Agriculture, Food and Drink; Digital/Communication/Information Technologies (including Software); Education; Healthcare

URL http://bioinf.hutton.ac.uk/public/?page_id=159
 
Description Germinate databases are in use at many sites across the world, and the database model has proved very successful.
First Year Of Impact 2009
Sector Agriculture, Food and Drink; Digital/Communication/Information Technologies (including Software); Education; Environment
Impact Types Economic